Nandaka / PixivUtil2

Download images from Pixiv and more!
http://nandaka.devnull.zone/
BSD 2-Clause "Simplified" License

Fanbox now uses Cloudflare #1342

Open biggestsonicfan opened 3 months ago

biggestsonicfan commented 3 months ago

Description

All Fanbox functions now fail with AttributeError: 'str' object has no attribute 'decode' because of the new Cloudflare protection.

Steps to Reproduce

  1. Use any Fanbox functions

Expected behavior: Fanbox downloads

Actual behavior: Fanbox blocked

Log file: pixivutil.log

Versions

Latest git as of d0bfaaef0a992a4e8c5ee83a06fdf285bd272808.

shinji257 commented 3 months ago

I was just about to report this. I first noticed it on my phone, I think late last night: there was a captcha, but I wasn't sure why it was there. Today I discovered the downloader couldn't work because of it.

I enabled HTTP debugging, and this might provide more insight. The request shows that a Cloudflare challenge was issued and needs to be handled.

https://gist.github.com/shinji257/f0d3a4defa75037d1c61dd534a628750

(Reposted with new gist -- removed fanbox session id)

biggestsonicfan commented 3 months ago

Theoretically, this will probably just be temporary and resolve itself on its own, going away after "whatever" it is that needed Cloudflare captchas is over.

shinji257 commented 3 months ago

Luckily this only impacts Fanbox. Downloads from Pixiv still work fine.

Phenrei commented 3 months ago

If this doesn't get removed, I have a local version now that is able to load the cf_clearance and __cf_bm cookies copied from a browser that has bypassed the check, which works as long as you mirror the user-agent of that browser in config.ini. I doubt those cookies would stay valid between download sessions for very long, but I was at least able to update rips of all my paid accounts through pixivutil once I copied them over.

emerladCoder commented 3 months ago

Thanks for this work around. I was also able to download my supported fanbox users by adding those two cookies to my config and then using them in the fanboxLoginUsingCookie function in my local copy.

Ouachitajess commented 3 months ago

If you have a VPN you can set it for a Japanese IP and it will bypass Cloudflare captchas. I was able to download with no issues once I changed locations.

calboi91 commented 3 months ago

How do you add the other cookies to the config? Right now there is only the cookieFanbox value that is the FANBOXSESSID.

Phenrei commented 3 months ago

They're not currently supported by the main branch. I had to add them myself and also edit the cookie loading function so it can add cookie values beyond the session id, because the current method resets the cookie jar on use and only sets that one cookie. I knew about these cookies from my work with the gallery-dl code: as long as you match them and the user-agent, you can often bypass those captcha checks, at least in the short term.
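
As a rough illustration of the idea (not the actual PixivUtil2 code), adding several cookies to one jar without clearing it first could look like the sketch below. The helper names `make_cookie` and `add_cookies` and all the values are made up for this example; only `http.cookiejar` itself is standard library.

```python
import http.cookiejar


def make_cookie(name, value, domain="fanbox.cc"):
    """Build a session cookie for the given domain (hypothetical helper)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={"HttpOnly": None}, rfc2109=False,
    )


def add_cookies(jar, values):
    """Add every non-empty cookie to the jar without clearing existing ones."""
    for name, value in values.items():
        if value:
            jar.set_cookie(make_cookie(name, value))
    return jar


# Placeholder values; real ones would come from the browser.
jar = http.cookiejar.CookieJar()
add_cookies(jar, {"FANBOXSESSID": "session-id", "cf_clearance": "abc", "__cf_bm": ""})
print(sorted(cookie.name for cookie in jar))
```

The empty `__cf_bm` entry is skipped, so an unset value never overwrites or pollutes the jar.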

usernameisalreadytakendammit commented 3 months ago

how and where did you edit the cookie loading function?

emerladCoder commented 3 months ago

Is this still needed? I wasn't getting any Cloudflare when accessing from my location anymore.

For reference, the way I did it for my local copy is noted below. Pretty hacky IMO, but it was working when I was still getting Cloudflare for Fanbox. This only adds the cookies for Fanbox, not Pixiv.

  1. Add the two entries below under the [Authentication] header in config.ini:
    [Authentication]
    cf_clearance = <cf_clearance COOKIE FROM BROWSER>
    cf_bm = <__cf_bm COOKIE FROM BROWSER>
  2. Add the two lines below to the __items array in class PixivConfig() in PixivConfig.py, so the values are loaded into _config for later use:
    ConfigItem("Authentication", "cf_clearance", ""),
    ConfigItem("Authentication", "cf_bm", ""),
  3. Add the code below to the fanboxLoginUsingCookie function in PixivBrowserFactory.py to use these cookies:
    # Add the Cloudflare clearance cookie if one is configured.
    if self._config.cf_clearance != "":
        ck1 = http.cookiejar.Cookie(version=0, name='cf_clearance', value=self._config.cf_clearance, port=None,
                                    port_specified=False, domain='fanbox.cc', domain_specified=False,
                                    domain_initial_dot=False, path='/', path_specified=True,
                                    secure=False, expires=None, discard=True, comment=None,
                                    comment_url=None, rest={'HttpOnly': None}, rfc2109=False)
        self.addCookie(ck1)
    # Add the Cloudflare __cf_bm cookie if one is configured.
    if self._config.cf_bm != "":
        ck2 = http.cookiejar.Cookie(version=0, name='__cf_bm', value=self._config.cf_bm, port=None,
                                    port_specified=False, domain='fanbox.cc', domain_specified=False,
                                    domain_initial_dot=False, path='/', path_specified=True,
                                    secure=False, expires=None, discard=True, comment=None,
                                    comment_url=None, rest={'HttpOnly': None}, rfc2109=False)
        self.addCookie(ck2)

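
To check that the two new config entries are picked up before touching the downloader itself, a standalone sketch like the one below may help. It uses the standard `configparser` module rather than PixivUtil2's own `PixivConfig` class, and the function name `read_cloudflare_cookies` and the sample values are assumptions for illustration only.

```python
import configparser

# Sample text standing in for config.ini; the values are placeholders.
SAMPLE_CONFIG = """
[Authentication]
cookieFanbox = session-id-placeholder
cf_clearance = clearance-placeholder
cf_bm =
"""


def read_cloudflare_cookies(config_text):
    """Return the non-empty Cloudflare cookie values from [Authentication]."""
    # Disable interpolation so a '%' inside a cookie value is kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    auth = parser["Authentication"]
    # Map the config keys from the steps above to the real cookie names.
    key_to_cookie = {"cf_clearance": "cf_clearance", "cf_bm": "__cf_bm"}
    cookies = {}
    for key, cookie_name in key_to_cookie.items():
        value = auth.get(key, "").strip()
        if value:
            cookies[cookie_name] = value
    return cookies


print(read_cloudflare_cookies(SAMPLE_CONFIG))
```

An entry left empty, like `cf_bm` in the sample, is simply omitted from the result, which mirrors the `!= ""` checks in step 3.
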
shinji257 commented 3 months ago

Yes. I'm still getting the cloudflare challenge here when using the default build. If I VPN to Japan the challenge is subverted.

calboi91 commented 3 months ago

Thanks for this. I added this code to my local copy and it works now.

usernameisalreadytakendammit commented 3 months ago

Sorry, I'm new to this. At what line do you have to add the code from step 3? When I try it, I still get:

    in fanboxLoginUsingCookie
        if '"user":{"isLoggedIn":true' in str(parsed.decode('utf-8')):
                                              ^^^^^^^^^^^^^
    AttributeError: 'str' object has no attribute 'decode'. Did you mean: 'encode'?
    Unknown Error, please check the log file: (<class 'AttributeError'>, AttributeError("'str' object has no attribute 'decode'"), <traceback object at 0x0000017D42965BC0>)

calboi91 commented 3 months ago

I added it here:

https://github.com/Nandaka/PixivUtil2/blob/d0bfaaef0a992a4e8c5ee83a06fdf285bd272808/PixivBrowserFactory.py#L384

usernameisalreadytakendammit commented 3 months ago

guess im doing something wrong cuz its not working for me :c

biggestsonicfan commented 3 months ago

The above instructions worked for me. Let's try not to flood the issue with overly large quotes.

Someone could make a temporary fork with the code, unless @Nandaka decides to integrate this upstream somehow.

Nandaka commented 3 months ago

@biggestsonicfan sure, just send me a pull request 😄

FriedGenera commented 3 months ago

@biggestsonicfan I'm getting the same error as @usernameisalreadytakendammit after pulling the latest commit. I checked that the cookies were correct, and my user agent is the same as my browser's.

HetareKing commented 3 months ago

@FriedGenera @usernameisalreadytakendammit I had the same problem until I made my user agent more complete. So, for example, not just Mozilla/5.0, but Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0

Edit: Also, make sure it's the Fanbox cookies, not the Pixiv cookies.

Zombie10 commented 3 months ago

Mozilla/5.0, but Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0

@HetareKing I did what you said, and it's still not working.

[Screenshots attached: image; WhatsApp Image 2024-06-30 at 06 09 52; Screenshot 2024-06-30 at 6 03 58 AM]

usernameisalreadytakendammit commented 3 months ago

Because it's the last day of the month and time was running out, and I still couldn't get that script to work, as a last-ditch effort I used Urban VPN to get a Japanese IP for free. That was the only way I could get it to work :c

biggestsonicfan commented 3 months ago

I'm thinking the check shouldn't be a string compare anymore but maybe a json decode and checks for keys. Lemme see what I can do. It won't fix the issue, but it won't crash the program anymore and we can properly handle the error.

EDIT: So the error appears to be happening here: https://github.com/Nandaka/PixivUtil2/blob/f09af10ced47dc7fff99c7ecc8058825395e5a78/PixivBrowserFactory.py#L204-L215

self.open ends up raising a urllib.error.HTTPError; however, since res is None, there's no error to print.

EDIT2: Ah, of course, we're getting an HTTP Error 403: Forbidden, as expected. I think we can handle that.

EDIT3: Cloudflare CAPTCHA challenge failures should be handled gracefully in #1345.
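
For illustration only (this is not the code from #1345), a key-based check plus 403 handling might be sketched as follows. It assumes the login state is available as JSON text with a top-level "user" object containing "isLoggedIn", which is inferred from the string check in the traceback; the function names and the example URL are hypothetical.

```python
import json
import urllib.error


def is_logged_in(payload):
    """Return True if the JSON payload reports a logged-in user."""
    # Accept bytes or str: only bytes has .decode(), which is what the
    # AttributeError in this thread tripped over.
    text = payload.decode("utf-8") if isinstance(payload, bytes) else payload
    try:
        data = json.loads(text)
    except (TypeError, ValueError):
        # Not JSON at all, e.g. an HTML challenge page.
        return False
    user = data.get("user") if isinstance(data, dict) else None
    return isinstance(user, dict) and user.get("isLoggedIn") is True


def login_failure_message(error):
    """Turn a failed request into a readable message instead of crashing."""
    if isinstance(error, urllib.error.HTTPError) and error.code == 403:
        return ("Failed FANBOX Cloudflare CAPTCHA challenge, "
                "please check your cookie and user-agent settings.")
    return f"Unknown FANBOX login error: {error}"


print(is_logged_in('{"user": {"isLoggedIn": true}}'))
print(is_logged_in(b"<html>challenge page</html>"))
forbidden = urllib.error.HTTPError("https://example.invalid/", 403, "Forbidden", None, None)
print(login_failure_message(forbidden))
```

Parsing the payload and checking keys avoids depending on exact whitespace in the response, and a non-JSON body is reported as "not logged in" rather than raising.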

fireattack commented 3 months ago

I had the same problem until I made my user agent more complete. So, for example, not just Mozilla/5.0, but Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0

People use useragent = Mozilla/5.0 to work around another block Pixiv has on login. Indeed, if I change it to my full UA, it fails very early at doLogin() with "Cannot Login!".

Now, keeping it as-is and adding the cf_clearance and cf_bm cookies to the config, I still can't download from Fanbox, just like @Zombie10.

localappdata commented 3 months ago

Using @biggestsonicfan's PR: Failed FANBOX Cloudflare CAPTCHA challenge, please check your cookie and user-agent settings.

cf_clearance and cf_bm are set to the cf_clearance and __cf_bm cookie values, and I checked PHPSESSID. Should I change the UA as well? It's still Mozilla/5.0 here.

Edit: Changed the UA to my browser's full UA, Fanbox now works again. Thanks!

(Might want to update the version shown by the way, it's still using 20230105.)

Edit2: Beam resets every hour which triggers the captcha again and both cookie values have to be set again manually or the user will run into the above error.

MarqFJA87 commented 3 months ago

I cannot find the cf_clearance cookie at all. Where is it supposed to be found?

EDIT: Found it using the developer tools. Doesn't work, though. And changing my UA results in an immediate failure to login at all.

gnarf1975 commented 3 months ago

Can you please write occasionally in a way that non-experts can understand? Or how about a proper update after all this time? Because not everyone knows about this stuff.

Thanks

MarqFJA87 commented 3 months ago

@localappdata

Edit: Changed the UA to my browser's full UA, Fanbox now works again. Thanks!

How did you avoid having the Pixiv login failure that would result from using the full UA?

biggestsonicfan commented 3 months ago

Can you please write occasionally in a way that non-experts can understand?

Fanbox now uses a protection service against scraping. In order to use PixivUtil2 again with Fanbox, you must add new cookies.

Or how about a proper update after all this time? Because not everyone knows about this stuff.

I am not a maintainer. I submit code updates, but I cannot produce a "release" executable, if that is what you mean by a "proper update".

MarqFJA87 commented 3 months ago

@biggestsonicfan Do you have any alternatives to using the full UA for those who need to set it to "Mozilla/5.0" to get around Pixiv blocking login attempts, as fireattack had mentioned?

nomakewan commented 3 months ago

I can confirm that with the new code, I am able to log into fanbox with the full UA intact. I previously had switched to the "Mozilla/5.0" UA to get around the login error. So it looks like that bypass is no longer necessary if you're using the proper cloudflare credentials along with your session ID.

MarqFJA87 commented 3 months ago

@nomakewan Then it would be helpful if you could list all the steps that you took to accomplish this in plain English, because I've done my best to figure it out amid the mess of technical jargon and sometimes ambiguous phrasing, and it's not working for me.

aksskl commented 3 months ago

@MarqFJA87 I had the same issue. I could not login because of the useragent. I had been using chrome to login and get useragent and cookies while also using the "Mozilla/5.0" useragent. I downloaded the latest version of firefox and got useragent and cookies using that browser. Both pixiv and fanbox logins are working for me now.

for what it's worth, this is firefox on windows and the useragent is "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:127.0) Gecko/20100101 Firefox/127.0"

MarqFJA87 commented 3 months ago

@aksskl I tried that just now. Still the same error.

nomakewan commented 3 months ago

@nomakewan Then it would be helpful if you could list all the steps that you took to accomplish this in plain English, because I've done my best to figure it out amid the mess of technical jargon and sometimes ambiguous phrasing, and it's not working for me.

You are wading into a technical discussion on github regarding python source code, FYI. But sure, why not.

Your issue is probably because you have no idea what your user agent is. You should type "what is my user agent" into google, which will then tell you the user agent for the browser you are currently using. "Mozilla/5.0" and whatever some other rando on github is using will not help you, because chances are their UA (this is short for user agent) will not match yours.

Literally all I did was copy the cookie information like normal into config.ini and change my UA to the full correct one of the browser I was currently logged into Fanbox with and it worked first try. I didn't do anything fancy whatsoever.

shinji257 commented 3 months ago

I'm going to add that because of the way Cloudflare pairs its cookies with your UA, you will need to get the UA from the same browser that you are getting the various Cloudflare-related cookies from.
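
To make the pairing concrete, here is a small sketch of sending the cookies together with the UA they were issued for. The helper names are made up, the cookie values are placeholders, and the "bare UA" check is only a heuristic for illustration, not how Cloudflare or PixivUtil2 decides anything.

```python
def build_headers(user_agent, cookies):
    """Build request headers that send the cookies with one User-Agent."""
    # Skip empty values so an unset config entry is not sent as "name=".
    cookie_header = "; ".join(
        f"{name}={value}" for name, value in cookies.items() if value
    )
    return {"User-Agent": user_agent, "Cookie": cookie_header}


def is_bare_user_agent(user_agent):
    """Heuristic: a UA without a "(platform)" part is probably too short."""
    return "(" not in user_agent


# Placeholder values; copy the real ones from a single browser session.
ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:127.0) Gecko/20100101 Firefox/127.0"
headers = build_headers(
    ua, {"FANBOXSESSID": "session", "cf_clearance": "clear", "__cf_bm": ""}
)
print(headers["Cookie"])
print(is_bare_user_agent("Mozilla/5.0"))
```

The point is that the UA string and the Cloudflare cookies travel as one set: if either comes from a different browser, the pairing no longer matches.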

MarqFJA87 commented 3 months ago

@nomakewan @shinji257 Oh, I did google "what is my user agent". I immediately ran into the Pixiv login failure over and over again, no matter what I did, until I followed aksskl's suggestion to fetch it (and the Cloudflare-related cookies) through Firefox... and all that did was put me back at square one, as I keep getting the error that is mentioned in the OP.

And yes, I did grab the Pixiv cookie from Firefox while I was at it, just to be sure.

biggestsonicfan commented 3 months ago

For what it's worth (not much apparently), I am able to download both Pixiv and Fanbox images with my full user agent.

As a last-ditch effort, you can try logging out of both Pixiv and Fanbox, clearing the cookies/data (of just those sites), logging into them again, and then copying the new cookies back into config.ini.

I keep getting the error that is mentioned in the OP

If you are getting the error mentioned in the OP, you need to download the latest version of PixivUtil2 from the repo. That message should no longer occur.

MarqFJA87 commented 3 months ago

If you are getting the error mentioned in the OP, you need to download the latest version of PixivUtil2 from the repo. That message should no longer occur.

Are you talking about the one in releases (which I already got long ago)? Or are you asking me to download the latest master (which I also already did in an attempt to get the changes for fixing this issue)?

biggestsonicfan commented 3 months ago

Latest master. You cannot get AttributeError: 'str' object has no attribute 'decode' in latest master.

MarqFJA87 commented 3 months ago

Well, I just redownloaded and tried. No dice.

Unless I'm doing it wrong by clicking on "Code" then downloading and unpacking the zip.

biggestsonicfan commented 3 months ago

You are getting AttributeError: 'str' object has no attribute 'decode'? Can I see a screenshot? You can also copy and paste the traceback as well, that would be sufficient for me to help debug.

MarqFJA87 commented 3 months ago

Sure. Fair warning, though, if you want to test it yourself with that particular Fanbox post: it's not outright porn, but I wouldn't want to open it in public either.

Screenshot 2024-07-02 012416

biggestsonicfan commented 3 months ago

Okay, I'm betting now that you're getting a different type of cloudflare error, which boggles my mind, but let me create a fork that will dump whatever page you are getting so we can see what's happening.

MarqFJA87 commented 3 months ago

Oh come on. And I had just been enjoying the sudden vanishing of a wholly different mysterious problem that impeded my ability to download from Fantia, Skeb and other sites - among other kinds of access-related issues - without using a VPN.

I hope that we can get to the bottom of this soon.

biggestsonicfan commented 3 months ago

Go ahead and try my fork here and see if it spits out a file named "An unknown FANBOX error occured - Cannot login.html" in your PixivUtil2 directory. If it does, maybe that can help us figure out what's going on.

MarqFJA87 commented 3 months ago

It did not spit out any such file, unfortunately.

biggestsonicfan commented 3 months ago

Can you move your logs to a different folder (or clear them if you're okay with that) and send me a fresh log of just trying to download one fanbox item? You can remove your id or anything else in the log you don't want others to see but I'm curious if there's anything else in there that will help me figure this out. I cannot reproduce this.

MarqFJA87 commented 3 months ago

Here you go.

pixivutil.log

biggestsonicfan commented 3 months ago

2024-07-02 02:34:35,640 - PixivUtil20230105 - INFO - Starting with argument: [D:\Programs\Bulk downloaders\PixivUtil2\PixivUtil2.exe].

I think I found the problem. Lol. We are all using the Python interpreter to run PixivUtil2.py. I had no idea that running PixivUtil2.exe would try to execute scripts in the same directory.

Executing PixivUtil2.exe in the same directory as the up-to-date master python scripts does indeed throw the same error you are getting.
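
For anyone checking their own log, the telltale is the "Starting with argument" line quoted above. A minimal sketch for classifying it follows; the function name `launch_kind` is made up, and it assumes the first item inside the brackets is the program that was started.

```python
import re

ARGUMENT_PATTERN = re.compile(r"Starting with argument: \[(.*)\]")


def launch_kind(log_line):
    """Classify a startup log line as 'exe', 'py', or 'unknown'."""
    match = ARGUMENT_PATTERN.search(log_line)
    if match is None:
        return "unknown"
    # Assume the first listed argument is the program that was started.
    program = match.group(1).split(",")[0].strip().strip("'\"").lower()
    if program.endswith(".exe"):
        return "exe"
    if program.endswith(".py"):
        return "py"
    return "unknown"


line = (r"2024-07-02 02:34:35,640 - PixivUtil20230105 - INFO - "
        r"Starting with argument: [D:\Programs\Bulk downloaders\PixivUtil2\PixivUtil2.exe].")
print(launch_kind(line))
```

A result of 'exe' means the run went through PixivUtil2.exe rather than through the Python interpreter running PixivUtil2.py.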

MarqFJA87 commented 3 months ago

Wait, what??? Then what am I supposed to execute, then?!