kemayo / leech

Turn a story on certain websites into an ebook for convenient reading
MIT License

Timeout on fanfiction.net #64

Closed. TheMetalCenter closed this issue 3 years ago

TheMetalCenter commented 3 years ago

Lately I've been getting this timeout error on fanfiction.net and fictionpress:

>py -3.7 leech.py https://www.fanfiction.net/s/11090259/1/r-Animorphs-The-Reckoning
[sites] Handler: <class 'sites.fanfictionnet.FanFictionNet'> (https://www.fanfiction.net/s/11090259/)
[__main__] Unable to locate leech.json. Continuing assuming it does not exist.
[sites] Load failed: waiting 10 to retry (403: https://www.fanfiction.net/s/11090259/)
[sites] Load failed: waiting 10 to retry (403: https://www.fanfiction.net/s/11090259/)
[sites] Load failed: waiting 10 to retry (403: https://www.fanfiction.net/s/11090259/)
Traceback (most recent call last):
  File "leech.py", line 159, in <module>
    cli()
  File "C:\Users\X\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\X\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 697, in main
    rv = self.invoke(ctx)
  File "C:\Users\X\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\X\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\X\AppData\Local\Programs\Python\Python37\lib\site-packages\click\core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "leech.py", line 152, in download
    story = open_story(site, url, session, login, options)
  File "leech.py", line 103, in open_story
    story = handler.extract(url)
  File "C:\Users\X\Documents\My Programs and Apps\Web Serials\leech-master\sites\fanfictionnet.py", line 22, in extract
    soup = self._soup(url)
  File "C:\Users\X\Documents\My Programs and Apps\Web Serials\leech-master\sites\__init__.py", line 146, in _soup
    return self._soup(url, method=method, retry=retry - 1, retry_delay=retry_delay, **kw)
  File "C:\Users\X\Documents\My Programs and Apps\Web Serials\leech-master\sites\__init__.py", line 146, in _soup
    return self._soup(url, method=method, retry=retry - 1, retry_delay=retry_delay, **kw)
  File "C:\Users\X\Documents\My Programs and Apps\Web Serials\leech-master\sites\__init__.py", line 146, in _soup
    return self._soup(url, method=method, retry=retry - 1, retry_delay=retry_delay, **kw)
  File "C:\Users\X\Documents\My Programs and Apps\Web Serials\leech-master\sites\__init__.py", line 147, in _soup
    raise SiteException("Couldn't fetch", url)
sites.SiteException: ("Couldn't fetch", 'https://www.fanfiction.net/s/11090259/')

TheMetalCenter commented 3 years ago

I downloaded a more recent copy, and the error now mentions Cloudflare:

[sites] Handler: <class 'sites.fanfictionnet.FanFictionNet'> (https://www.fanfiction.net/s/9794740/)
[__main__] Unable to locate leech.json. Continuing assuming it does not exist.
[__main__] ("Couldn't fetch, probably because of Cloudflare protection", 'https://www.fanfiction.net/s/9794740/')
[__main__] No ebook created

TheMetalCenter commented 3 years ago

And closing because I saw this was already a closed issue.

kemayo commented 3 years ago

This has reminded me to go look and see if FFN has added the API they promised for this last year...

kemayo commented 3 years ago

There's still nothing about the API. It's a bit delayed from their "late January" claims, by now. 😅

That said, f1bd28e9428691f7a957d5567bc5de3c74245ea9 gives us an experimental hack for getting some FFN content. It's piggybacking off of archive.org, so it's going to be inconsistent.
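For anyone curious what "piggybacking off of archive.org" can look like in general, here is a minimal sketch. It is not the code from that commit: it assumes the public Wayback Machine availability endpoint, and the function names (`availability_url`, `closest_snapshot`, `find_archived_copy`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wayback Machine "availability" endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(story_url):
    """Build the lookup URL asking for the closest archived snapshot."""
    return WAYBACK_API + "?" + urllib.parse.urlencode({"url": story_url})


def closest_snapshot(payload):
    """Return the snapshot URL from a decoded response, or None if absent."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(story_url, timeout=10):
    """Query archive.org; returns None when the page was never archived."""
    with urllib.request.urlopen(availability_url(story_url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `find_archived_copy("https://www.fanfiction.net/s/11090259/")` would return either a snapshot URL to scrape instead of the live page, or `None`. Whatever snapshot comes back may be well behind the live story, which is where the inconsistency comes from.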

TheMetalCenter commented 3 years ago

That's a cool workaround! But yeah, unfortunately it will only work for stories that have been archived, so probably not for a weekly update. Thankfully all the stories I'm interested in on the site are available on other sites as well.