JimmXinu / FanFicFare

FanFicFare is a tool for making eBooks from stories on fanfiction and other web sites.

Cloudflare and FanFiction.net doing its thing again #703

Closed chocolatechipcats closed 2 years ago

chocolatechipcats commented 3 years ago

FanFicFare 4.3.0, Calibre running from source.

Attempting to update from FFNet gives out a Cloudflare 2 error. Several other users have reported this on MobileRead. Cache works as expected. Checked Twitter and there's no indication that FictionPress is working on an update, but I will keep monitoring.

Not much to be done about it (there don't seem to be any recent updates to the scraper), but at least here's a report to monitor the issue.

Twilight666 commented 3 years ago

I don't know if it is the same thing but I get the following error when using CLI:

cloudscraper.exceptions.CloudflareCaptchaProvider: Cloudflare Captcha detected, unfortunately you haven't loaded an anti Captcha provider correctly via the 'captcha' parameter.

During handling of the above exception, another exception occurred:

fanficfare.exceptions.FailedToDownload: cloudscraper reports: "Cloudflare Captcha detected, unfortunately you haven't loaded an anti Captcha provider correctly via the 'captcha' parameter."

It looks like there is a change in cloudscraper's use and it needs a new option to be set. Or maybe it's an old option and now it raises an error... I don't know

JimmXinu commented 3 years ago

It looks like ffnet has increased their Cloudflare blocking level again.

As far as I know, there isn't anything more we can do in FFF than there was the last time this happened.

I suggest reviewing these resources:

FAQ: https://github.com/JimmXinu/FanFicFare/wiki/FAQs#why-am-i-having-errors-downloading-from-fanfictionnet--why-am-i-getting-cloudflare-errors-downloading-from-fanfictionnet

Browser Cache Feature: https://github.com/JimmXinu/FanFicFare/wiki/BrowserCacheFeature

Browser Proxy (third party): https://github.com/nsapa/fanfictionnet_ff_proxy

NightMachinery commented 3 years ago

@JimmXinu We can use puppeteer though? I just tested my https://github.com/NightMachinary/.shells/blob/master/scripts/javascript/curlfull.js , and it worked fine. Using the browser cache is not a very good solution (I use fanficfare on a server).

NightMachinery commented 3 years ago

@JimmXinu A general workaround is to let the user supply a downloader, like youtube-dl's --external-downloader, if you don't like bundling puppeteer.

JimmXinu commented 3 years ago

First off, I'm not doing anything about this in a hurry. More than once, we've seen ffnet's CF level go up for a while and then go back down after a while--think days, not hours.

Second, Cloudflare is a service specifically for detecting and blocking automated access. I assume (and past evidence agrees) that they can already block well known automated browser tools. If I recall correctly, earlier in the year, we experimented with headless proxy browsers and all of them were blocked after a few requests.

That's why our current generation of workarounds is based on using a live browser.

Twilight666 commented 3 years ago

I was already using the cache... but after rereading the wiki I tried downloading the story with WebToEpub, and it downloaded the whole story. But when I used FFF immediately afterwards, it didn't work.

chocolatechipcats commented 3 years ago

I was already using the cache... but after rereading the wiki I tried downloading the story with WebToEpub, and it downloaded the whole story. But when I used FFF immediately afterwards, it didn't work.

Try use_browser_cache_only:true to prevent it falling back to the site. I'm assuming it's some user error on my end, but for whatever reason I have trouble getting it to use the cache if I only set use_browser_cache:true.

timgblack commented 3 years ago

Is it worth - if ffn keep this up - investigating how their app bypasses these checks?

mcepl commented 3 years ago

Try use_browser_cache_only:true to prevent it falling back to the site. I'm assuming it's some user error on my end, but for whatever reason I have trouble getting it to use the cache if I only set use_browser_cache:true.

This is what I've tried (using FanFicFare 4.3.0 and Firefox 88):

  1. Download https://www.fanfiction.net/s/13747780/1/ with WebToEpub
  2. Add browser_cache_path:/home/matej/.cache/mozilla/firefox/uxi2c7cz.default/cache2 to [defaults] in personal.ini
  3. Run:
    
    tmp@stitny$ fanficfare -o use_browser_cache_only=true https://www.fanfiction.net/s/13747780/1/
    cloudscraper.exceptions.CloudflareChallengeError: Detected a Cloudflare version 2 Captcha challenge, This feature is not available in the opensource (free) version.

    During handling of the above exception, another exception occurred:

    fanficfare.exceptions.FailedToDownload: cloudscraper reports: "Detected a Cloudflare version 2 Captcha challenge, This feature is not available...."

Something’s wrong. Why does it still try the Internet connection even with `use_browser_cache_only=true`?

JimmXinu commented 3 years ago

Do you also have use_browser_cache:true under [www.fanfiction.net]? use_browser_cache_only:true doesn't do anything without use_browser_cache:true
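
The dependency described here (use_browser_cache_only has no effect unless use_browser_cache is also true) can be checked mechanically. The following is only a sketch, not FanFicFare's own code: the helper names `effective` and `cache_only_is_inert` are hypothetical, and the fallback from the site section to [defaults] is a simplifying assumption about how settings are resolved. A real personal.ini may also contain lines that Python's configparser rejects.

```python
import configparser


def effective(cfg, site, key):
    """Return True if `key` is set to true for `site`.

    Assumption: the site section wins, then [defaults] is consulted.
    """
    for section in (site, "defaults"):
        if cfg.has_section(section) and cfg.has_option(section, key):
            return cfg.get(section, key).strip().lower() == "true"
    return False


def cache_only_is_inert(ini_text, site="www.fanfiction.net"):
    """True when use_browser_cache_only is on but use_browser_cache is not."""
    # interpolation=None keeps '%' in values literal; strict=False tolerates
    # duplicated sections or keys instead of raising.
    cfg = configparser.ConfigParser(interpolation=None, strict=False)
    cfg.read_string(ini_text)
    only = effective(cfg, site, "use_browser_cache_only")
    cache = effective(cfg, site, "use_browser_cache")
    return only and not cache
```

If `cache_only_is_inert` returns True for the text of a personal.ini, adding use_browser_cache:true under [www.fanfiction.net] is the first thing to try.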

mcepl commented 3 years ago

Do you also have use_browser_cache:true under [www.fanfiction.net]? use_browser_cache_only:true doesn't do anything without use_browser_cache:true

Which is a bit weird, isn't it? Anyway, this works.

fanficfare -o use_browser_cache=true -o use_browser_cache_only=true -o use_cloudscraper=false \
    https://www.fanfiction.net/s/13747780/1/

chocolatechipcats commented 3 years ago

I'm assuming it's some user error on my end, but for whatever reason I have trouble getting it to use the cache if I only set use_browser_cache:true.

Just for posterity, check_next_chapter:true seemed to be the cause of this.

eltonfreak commented 3 years ago

I have also received the following errors in the last 24 hours.

| Status | Title | Author | Comment | URL |
| --- | --- | --- | --- | --- |
| Error | Unknown | Unknown | cloudscraper reports: "Detected a Cloudflare version 2 Captcha challenge, This feature is not available...." | https://www.fanfiction.net/s/5598642/1/ |
| Error | Unknown | Unknown | cloudscraper reports: "Detected a Cloudflare version 2 Captcha challenge, This feature is not available...." | https://www.fanfiction.net/s/5244813/1/ |
| Error | Unknown | Unknown | cloudscraper reports: "Detected a Cloudflare version 2 Captcha challenge, This feature is not available...." | https://www.fanfiction.net/s/5012016/1/Partners |

mcepl commented 3 years ago

I have also received the following errors in the last 24 hours.

Yes, Cloudflare is acting up again, and there is nothing we can do about it. The WebToEpub workaround is the only thing I know that works.

erd00073 commented 3 years ago

Is it worth - if ffn keep this up - investigating how their app bypasses these checks?

The author of at least one other 3rd party reader app I know of has already looked at it and states that the code at issue is a totally obfuscated mess that he has no intention of trying to dig into. If you want to try, feel free. But, keep in mind, even if you manage to figure something out, as soon as more than a few people try to use it they'll just update the app and invalidate everything you did.

The owner of FFN is focused on getting his ad revenue, and going to war with a company like Cloudflare is a zero sum fight you aren't going to win in the end. The only thing you can try to do is work around them like Jimm is with the FFF browser cache feature. I just hope FFN doesn't try to implement some form of cache busting technique.

Edocsil commented 3 years ago

Is it worth - if ffn keep this up - investigating how their app bypasses these checks?

The author of at least one other 3rd party reader app I know of has already looked at it and states that the code at issue is a totally obfuscated mess that he has no intention of trying to dig into. If you want to try, feel free. But, keep in mind, even if you manage to figure something out, as soon as more than a few people try to use it they'll just update the app and invalidate everything you did.

The owner of FFN is focused on getting his ad revenue, and going to war with a company like Cloudflare is a zero sum fight you aren't going to win in the end. The only thing you can try to do is work around them like Jimm is with the FFF browser cache feature. I just hope FFN doesn't try to implement some form of cache busting technique.

The official FFN app doesn't have ads, so whatever reason they have for this can't be ad revenue.

JimmXinu commented 3 years ago

Rather than believing that ffnet has deliberately set out to ruin our collective day, I suspect that they are using Cloudflare for protection from aggressive, automated systems attacking their site.

Unfortunately, individual users with automated download software (i.e., us) are extremely similar in appearance from their POV.

I'll add what I said over on MR recently:

I have lost basically all interest in fighting with ffnet about this. Browser cache works well enough for me for the few active authors I still read on ffnet.

Unless someone comes up with a radical new idea, I'm not thinking about it anymore right now.

Many thanks to the members here who are willing and able to help each other.

mcepl commented 3 years ago

I have lost basically all interest in fighting with ffnet about this. Browser cache works well enough for me for the few active authors I still read on ffnet.

I basically agree with what you said, with one exception (and no, I have no idea how to do it): at least checking whether a work has a new update. I still have 776 FFN stories (yes, I'm trying to move to AO3 as much as possible, but this is the best I have managed so far) and I have no clue how to check that I am not missing out on new chapters. The update itself is done easily with the browser cache.

chocolatechipcats commented 3 years ago

If you have an ffnet account, you can go to Account Page > Alerts > Story Alerts and compare the dates of followed stories to your local copies. Setting up Calibre with a last-updated column will help too.

quihi commented 3 years ago

I have a discord bot which has been able to access FFN most of the time recently. Is fanficfare using the latest version of cloudscraper?

On Jun 14, 2021, at 5:14 PM, chocolatechipcats wrote: If you have an ffnet account, you can go to Account Page > Alerts > Story Alerts and compare the dates of followed stories to your local copies. Setting up Calibre with a last-updated column will help too.


chocolatechipcats commented 3 years ago

Cloudscraper's last update was on April 9, and I believe it's already been incorporated into FFF. Also, the error specifically mentions the version 2 CAPTCHA, which the Cloudscraper developer has behind a paywall (this was discussed in the last thread).

MrTyton commented 3 years ago

Has anyone even gotten the paywall-d version to work? At this point I'd be willing to shell out a few bucks, but not if it doesn't work with Fanficfare.

mcepl commented 3 years ago

Has anyone even gotten the paywall-d version to work? At this point I'd be willing to shell out a few bucks, but not if it doesn't work with Fanficfare.

I haven’t found a paid version. Is there one?

chocolatechipcats commented 3 years ago

It was discussed in the last thread: https://github.com/JimmXinu/FanFicFare/issues/622#issuecomment-756479078

kov9413tam commented 3 years ago

The Google Chrome extension: WebToEpub still can download from FF.net.

https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode

mcepl commented 3 years ago

The Google Chrome extension: WebToEpub still can download from FF.net.

Works in Firefox as well. But the problem is that some more sophisticated things FFF can do, this extension cannot (namely, --update-epub).

arisboch commented 3 years ago

@mcepl Better than nothing, especially if one can't use that Windows program to download stuff from FFN, but has a Firefox, Chrome or Chrome-based browser.

timgblack commented 3 years ago

I run FanFicFare on a headless Raspberry Pi and submit URLs from my phone, so using any cache isn't viable unfortunately. For now, I've just stopped it checking for updates and I'm seriously considering whether it's worth decompiling the app and seeing if the same method it uses is viable (or if using the app cache is viable)...

CodeyAce95 commented 3 years ago

Does anyone know if there is a fanfiction.net app that lets us read like a book instead of scrolling? If so, please let me know so I can read my books until this problem is solved, because the scrolling on fanfiction.net makes me sick.

chocolatechipcats commented 3 years ago

FanFiction.net has an official app. From what I understand all other ones are blocked.

CodeyAce95 commented 3 years ago

Is there a way around it?

Edocsil commented 3 years ago

Does anyone know if there is a fanfiction.net app that lets us read like a book instead of scrolling? If so, please let me know so I can read my books until this problem is solved, because the scrolling on fanfiction.net makes me sick.

In the official app you can "turn page" with the volume keys, it's an option in the settings.

Not the best place to ask this though..

CodeyAce95 commented 3 years ago

Where would I ask? I am new here.

JimmXinu commented 3 years ago

Google for fanfiction.net app. This is off-topic for both this project and this issue.

Xyverz commented 3 years ago

It looks like ffnet has increased their Cloudflare blocking level again.

As far as I know, there isn't anything more we can do in FFF than there was the last time this happened.

I suggest reviewing these resources:

FAQ: https://github.com/JimmXinu/FanFicFare/wiki/FAQs#why-am-i-having-errors-downloading-from-fanfictionnet--why-am-i-getting-cloudflare-errors-downloading-from-fanfictionnet

Browser Cache Feature: https://github.com/JimmXinu/FanFicFare/wiki/BrowserCacheFeature

Browser Proxy (third party): https://github.com/nsapa/fanfictionnet_ff_proxy

Hey @JimmXinu, I've read your BrowserCacheFeature page several times and nowhere in that does it say where this personal.ini file I'm supposed to edit is located. I'm getting more and more frustrated the more I try to implement this solution. Can you please update the wiki page or at least inform me where this file is located?

Thanks in advance!

JimmXinu commented 3 years ago

There's a separate wiki page about INI and the first thing in it is INI File Location.

But I will add a link from the BrowserCacheFeature page.

Xyverz commented 3 years ago

There's a separate wiki page about INI and the first thing in it is INI File Location.

But I will add a link from the BrowserCacheFeature page.

Thank you sir! Much appreciated!

erd00073 commented 3 years ago

There's a separate wiki page about INI and the first thing in it is INI File Location. But I will add a link from the BrowserCacheFeature page.

Thank you sir! Much appreciated!

Click the down arrow by the FFF button in Calibre, and select the option to configure FFF. On the configuration page that pops up, click the Personal.ini tab. There is an "Edit Personal.ini" button there, and once it is open for editing there is a save button at the bottom of the edit window.

One piece of advice I learned the hard way - be very careful when using copy/paste into Personal.ini. Somehow (as near as I can tell, anyway) I managed to paste some type of invisible embedded character in there. It caused me all sorts of random misery with FFF for weeks that I just couldn't fix before ownedbycats on the MobileRead site pointed out to me how to reset Personal.ini to defaults.
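
One way to look for the kind of invisible pasted character described above, short of resetting Personal.ini, is to scan a copy of its text for control, format, and non-standard space characters. This is a minimal sketch; `find_invisible` is a hypothetical helper, and which characters actually upset FFF is not established in this thread.

```python
import unicodedata


def find_invisible(text):
    """Return (line, column, codepoint) for characters that render as nothing.

    Flags Unicode control (Cc) and format (Cf) characters, plus any space
    separator (Zs) other than the ordinary space. Tabs and carriage returns
    are ignored because they are normal in text files.
    """
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in "\t\r":
                continue
            cat = unicodedata.category(ch)
            if cat in ("Cc", "Cf") or (cat == "Zs" and ch != " "):
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits
```

Pasting the contents of the Personal.ini edit window into a text file and passing its contents to `find_invisible` gives the line and column of each suspect character, such as a zero-width space (U+200B) or a non-breaking space (U+00A0).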

CodeyAce95 commented 3 years ago

Has anyone figured out how to get around this? I do not care about downloading, but I want to update my stories. Any help would be welcome. Thank you and God bless.

JimmXinu commented 3 years ago

The only new information is that the Browser Cache work around can also use the cache created by FanFictionDownloader.

LoisGNS commented 3 years ago

That's terrific news, since it should allow more batching - I think FFD will download whatever I feed it if I give it a list in a file, but can just delete those and re-feed the list into FFF. Which version of FFF do I need to enable the FFD cache? I'm currently using 4.3.0.

So what I would do is use FFF to create the list (read it from a web page or drag/drop the desired items from the page onto its download list), save the resulting list to a .txt file, which I then feed to FFD; let it do its thing to populate the cache, then run the FFF download-from-url action, and finally delete the redundant FFD files. Make sense?
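
The "save the resulting list to a .txt file" step could be sketched as below. This is an illustration only: `story_url_list` is a hypothetical helper, the one-URL-per-line format is an assumption about what FFD accepts, and the normalisation to chapter 1 simply mirrors the URL shape used elsewhere in this thread.

```python
import re

# Matches ffnet story links and captures the numeric story id.
STORY_RE = re.compile(r"https?://(?:www\.|m\.)?fanfiction\.net/s/(\d+)")


def story_url_list(text):
    """Extract ffnet story URLs from arbitrary text, de-duplicated, one per line."""
    seen = []
    for story_id in STORY_RE.findall(text):
        url = f"https://www.fanfiction.net/s/{story_id}/1/"
        if url not in seen:
            seen.append(url)
    # End with a newline only when there is at least one URL.
    return "\n".join(seen) + ("\n" if seen else "")
```

The returned string can be written to a file with `open("stories.txt", "w").write(...)` and that file given to both tools in turn.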

CodeyAce95 commented 3 years ago

Would anyone have direct instructions for a total noob, with pictures if possible? I am autistic.

LoisGNS commented 3 years ago

@CodeyAce95, are you looking for instructions regarding using FFD with FFF, along the lines I described above, just specific parts of the process, or something else? For example, do you need help with finding and setting up FFD (looks like you already have and use FFF), or just the bits I described that suggest what might work by using them together? There is a download link to FFD in JimmXinu's post above if you don't already have it.

I haven't yet tried what I described, but once I do (and confirm it works), I could probably put together a Word document with screen-grabs, if that would be helpful - and if this site will let me attach the doc to a message here. It seems like it allows attachments, I just don't know what restrictions, if any, there might be on the types or sizes of the attachments. It may take me a few days to get to it, since I'm behind on some stuff at the moment :)

Edocsil commented 3 years ago

Alternatively, you can download ffnet fics using a different tool, like ff2ebook or fichub. Personally, I'm currently using FicHub-cli and it works fine.

These tools get around Cloudflare using FlareSolverr, but I don't know if this would work with FFF. In the past ff2ebook just used cookies and it worked too, so that's something that someone could try.

CodeyAce95 commented 3 years ago

@CodeyAce95, are you looking for instructions regarding using FFD with FFF, along the lines I described above, just specific parts of the process, or something else? For example, do you need help with finding and setting up FFD (looks like you already have and use FFF), or just the bits I described that suggest what might work by using them together? There is a download link to FFD in JimmXinu's post above if you don't already have it.

I haven't yet tried what I described, but once I do (and confirm it works), I could probably put together a Word document with screen-grabs, if that would be helpful - and if this site will let me attach the doc to a message here. It seems like it allows attachments, I just don't know what restrictions, if any, there might be on the types or sizes of the attachments. It may take me a few days to get to it, since I'm behind on some stuff at the moment :)

Yes, I think that might be just what I need. Thank you and God bless.

JimmXinu commented 3 years ago

We've discussed FlareSolverr before. It starts a proxy process that starts a puppeteer headless browser process. nsapa's Browser Proxy is effectively similar and already working.

However, since FlareSolverr offers binary downloads (for win and linux), I may attempt using it. But I would point out the following from FlareSolverr's home page about captcha solvers:

⚠️ At this time none of the captcha solvers work. You can check the status in the open issues. Any help is welcome.


The fichub projects listed are frontends to an AJAX API server. I don't see a public repository for the actual download code. By running all downloads through their own server, all issues with running headless browsers are at least concentrated in one place.

FFF used to have a web version which operated under somewhat similar basic architecture, but it's been retired:

I continued to support the web service in recent years as a legacy for the users who can't run the CLI or Calibre versions. But I'm not interested in spending my money on it, or dealing with the accounting and possible tax implications of collecting donations to run it.


@LoisGNS - If you are willing to put together words and pictures for a user guide on using FFDL and FFF together, I would be willing to convert it to markdown and add it to the project as a wiki page--properly credited, of course.

LoisGNS commented 3 years ago

I started such a document, doing screen-grabs as I went. But it didn't work. I am using ver 4.5.0 of FFF and 0.9.2 of FFD. Do I need a different version of one or both?

JimmXinu commented 3 years ago

You probably need the latest FanFictionDownloader, v0.9.4.

chocolatechipcats commented 3 years ago

I can confirm it works with 0.9.4.

Since I made the same mistake, check that you're using the right cache folder. There's \cache\ but you actually want \cache\QtWebEngine\Default\Cache\. Mine has several data_# files and an index.
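
A quick way to tell whether a folder looks like the cache described here is to check for the files mentioned in this thread: data_# files plus an index for the FFD (QtWebEngine) cache, or an "entries" subfolder plus an index for a Firefox cache2 folder. The sketch below only inspects names; `guess_cache_layout` is a hypothetical helper and does not confirm that FFF will accept the folder.

```python
import re
from pathlib import Path


def guess_cache_layout(path):
    """Classify a folder by the file names reported in this thread."""
    p = Path(path)
    if not p.is_dir():
        return "missing"
    # Compare names case-insensitively, since Windows paths are.
    names = {child.name.lower(): child for child in p.iterdir()}
    has_index = "index" in names
    if has_index and any(re.fullmatch(r"data_\d+", n) for n in names):
        return "chromium-style"  # e.g. ...\QtWebEngine\Default\Cache
    if has_index and "entries" in names and names["entries"].is_dir():
        return "firefox-cache2"
    return "unknown"
```

A result of "unknown" for the folder set in browser_cache_path suggests the path points one level too high or too low, which is the mistake described above.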

LoisGNS commented 3 years ago

I just tried after updating FFD to 0.9.4, and it still isn't working for me.

@chocolatechipcats, are you saying I need to change the browser_cache_path in personal.ini from the one I use when just clicking through in Firefox? That one is C:\Users\[myusername]\AppData\Local\Mozilla\Firefox\Profiles\[myfolder].default\cache2, which has two subfolders, "doomed" and "entries", and a file called "Index" with no extension. There's lots of stuff under Entries. This works with the browser-cache method, and is the one that was described in the instructions for it.

I don't find any folder like the one you describe. I looked in c:\users (etc.), as well as the FFD and Calibre folders. I also asked Windows to search for it, and it didn't find it either.

There is also a Chrome browser cache shown as an option in the instructions, browser_cache_path:C:\Users\YourUser\AppData\Local\Google\Chrome\User Data\Default\Cache or trying another profile name in place of "default". I don't have a Chrome profile name other than "default," so tried that, which didn't work either.

I'm stumped now.