mcepl closed this issue 3 years ago
Download finished with:
===================
!!!! 3 chapters errored downloading https://www.fanfiction.net/s/13626086 !!!!
===================
Hi,
Yes, I have an issue with data being truncated between the proxy and fanficfare.
Please retry with the current commit after applying fffa.patch. It helped a lot on my computer.
Best regards, Nicolas SAPA
Unfortunately, it didn't help at all. Instead of the example.com special page, Chromium (chromium-91.0.4472.101-1.1.x86_64 from the openSUSE packages) shows this:
The Chromium Authors
Copyright 2021 The Chromium Authors. All rights reserved.
Chromium 91.0.4472.101 (openSUSE Build) (64-bit)
Revision af52a90bf87030dd1523486a1cd3ae25c5d76c9b-refs/branch-heads/4472@{#1462}
OS Linux
JavaScript V8 9.1.269.36
User Agent Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36
Command Line /usr/lib64/chromium/chromium --password-store=detect --enable-threaded-compositing --ui-disable-partial-swap --allow-pre-commit-input --disable-background-networking --disable-client-side-phishing-detection --disable-default-apps --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --enable-automation --enable-blink-features=ShadowDOMV0 --enable-logging --log-level=0 --no-first-run --no-service-autorun --password-store=basic --remote-debugging-port=0 --test-type=webdriver --use-mock-keychain --user-data-dir=/tmp/.com.google.Chrome.FrMuyy --flag-switches-begin --flag-switches-end data:,
Executable Path /usr/lib64/chromium/chromium
Profile Path /tmp/.com.google.Chrome.FrMuyy/Default
Is it some kind of about: page?
The proxy output:
fanfictionnet_ff_proxy@stitny (master)$ python3 chrome_content.py
2021-06-22 07:13:03.299 CEST INFO root fanfictionnet_ff_proxy version 0.4 by Nicolas SAPA <nico@byme.at>
2021-06-22 07:13:03.299 CEST INFO root This Alpha software is licensed under CECILL-2.1
2021-06-22 07:13:03.303 CEST INFO root Running on Linux-5.12.10-1-default-x86_64-with-glibc2.2.5
2021-06-22 07:13:03.456 CEST INFO undetected_chromedriver Selenium patched. Safe to import Chrome / ChromeOptions
2021-06-22 07:13:03.738 CEST INFO undetected_chromedriver Selenium patched. Safe to import Chrome / ChromeOptions
2021-06-22 07:13:05.866 CEST INFO undetected_chromedriver starting undetected_chromedriver.Chrome((), {'service_log_path': '/dev/null', 'chrome_options': <selenium.webdriver.chrome.options.Options object at 0x7fcb189b3f10>, 'executable_path': './chromedriver', 'options': <selenium.webdriver.chrome.options.Options object at 0x7fcb189b3f40>})
2021-06-22 07:13:05.867 CEST INFO prepare_Chrome Chrome 91.0.4472.101 on linux started
2021-06-22 07:13:06.144 CEST INFO prepare_Chrome chromedriver version 91.0.4472.101 running as pid 22389, Chrome running as pid 22395
2021-06-22 07:13:06.611 CEST INFO prepare_Chrome Trying to load existing cookie...
2021-06-22 07:13:06.611 CEST INFO root Chrome is initialized & ready to works
2021-06-22 07:13:06.611 CEST INFO root Listening on 127.0.0.1:8888
2021-06-22 07:13:50.173 CEST INFO mainloop Current URL = https://www.fanfiction.net/s/4357909/1/A-Kiss-Can-Save-The-World, page title = A Kiss Can Save The World Chapter 1, a harry potter fanfic | FanFiction, mimetype = text/html
2021-06-22 07:13:57.426 CEST INFO mainloop Current URL = chrome://version/, page title = O verzi aplikace, mimetype = text/html
2021-06-22 07:14:05.428 CEST INFO mainloop Current URL = https://www.fanfiction.net/s/4357909/1/A-Kiss-Can-Save-The-World, page title = A Kiss Can Save The World Chapter 1, a harry potter fanfic | FanFiction, mimetype = text/html
2021-06-22 07:14:12.666 CEST INFO mainloop Current URL = chrome://version/, page title = O verzi aplikace, mimetype = text/html
2021-06-22 07:14:20.744 CEST INFO mainloop Current URL = https://www.fanfiction.net/s/4357909/1/A-Kiss-Can-Save-The-World, page title = A Kiss Can Save The World Chapter 1, a harry potter fanfic | FanFiction, mimetype = text/html
2021-06-22 07:14:32.984 CEST INFO mainloop Current URL = chrome://version/, page title = O verzi aplikace, mimetype = text/html
2021-06-22 07:14:41.014 CEST INFO mainloop Current URL = https://www.fanfiction.net/s/4357909/1/A-Kiss-Can-Save-The-World, page title = A Kiss Can Save The World Chapter 1, a harry potter fanfic | FanFiction, mimetype = text/html
2021-06-22 07:14:58.276 CEST INFO mainloop Current URL = chrome://version/, page title = O verzi aplikace, mimetype = text/html
2021-06-22 07:15:06.308 CEST INFO mainloop Current URL = https://www.fanfiction.net/s/4357909/1/A-Kiss-Can-Save-The-World, page title = A Kiss Can Save The World Chapter 1, a harry potter fanfic | FanFiction, mimetype = text/html
2021-06-22 07:15:28.684 CEST INFO mainloop Current URL = chrome://version/, page title = O verzi aplikace, mimetype = text/html
Using this nsapa_proxy.py
Hi,
Yes, we are now using an internal Chrome page for resetting the proxy. Could you try this branch of fanficfare? https://github.com/nsapa/FanFicFare/tree/fix_truncated I believe I have found where the code was losing data.
Best regards, NS
Works much better, thank you.
Only one RFE: handling the proxy is quite a bother, so would it be possible to make fanficfare just stop trying when the proxy is not present, instead of exploding with a backtrace? Or possibly (if you want to keep the blow-up as the default) add an additional value, use_nsapa_proxy:required, with which FFF would just skip over FFnet targets when the proxy is not present? I have a couple of hundred books which I check with the -u parameter, some of them from sites less networkingly disgusting than FFnet, so I would like to check those more often and have the FFnet ones skipped silently (or with just a warning) until next time.
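Until something like this exists in FanFicFare itself, the same effect can be approximated from outside. The following is only a minimal sketch, not part of FanFicFare or the proxy: it assumes the proxy listens on 127.0.0.1:8888 (as in the proxy log above) and that FFnet books can be recognised by a `-ffnet_` marker in the file name (as in `Birds become Dragons-ffnet_12531290.epub`). The function names `proxy_is_up` and `select_updatable` are hypothetical.

```python
import socket


def proxy_is_up(host="127.0.0.1", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    This only checks that the port is open; it does not verify that the
    listener really is the fanfictionnet_ff_proxy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (errno 111) as well as timeouts.
        return False


def select_updatable(epub_names, proxy_up, marker="-ffnet_"):
    """Split EPUB file names into (to_update, skipped).

    Files whose name contains `marker` are assumed to come from FFnet and
    are skipped whenever the proxy is not reachable.
    """
    to_update, skipped = [], []
    for name in epub_names:
        if marker in name and not proxy_up:
            skipped.append(name)
        else:
            to_update.append(name)
    return to_update, skipped
```

A wrapper script could call `select_updatable(names, proxy_is_up())`, print a warning for every skipped file, and run `fanficfare -u` only on the remaining ones.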
Hi,
Great news!
FanFicFare should already do what you want if your personal.ini is correct:
$ python fanficfare/cli.py -c personnal.ini 'https://archiveofourown.org/works/17242469'
$ python fanficfare/cli.py -c personnal.ini 'https://www.fanfiction.net/s/13118852/1/Echoes'
FFF: ERROR: 2021-06-23 23:40:39,897: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
FFF: ERROR: 2021-06-23 23:40:39,898: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
FFF: ERROR: 2021-06-23 23:40:39,898: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
FFF: ERROR: 2021-06-23 23:40:39,899: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
FFF: ERROR: 2021-06-23 23:40:39,900: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
Traceback (most recent call last):
Your personnal.ini should look like this:
$ grep -v -e ^$ personnal.ini |grep -v ^#
[defaults]
[epub]
[www.twilighted.net]
[ficwad.com]
[www.adastrafanfic.com]
[www.tthfanfic.org]
[www.fanfiction.net]
use_nsapa_proxy:true
use_cloudscraper:false
[overrides]
skip_author_cover:false
include_images:true
I am sending a PR for the fixed code. It will take some time after its merge to be in a release.
Best regards, NS
I don’t think it works (with the latest FFF https://github.com/JimmXinu/FanFicFare/commit/da800759ca6ec23decc77342babb127524f17644):
tmp@kusansky$ cat ~/.fanficfare/personal.ini
[defaults]
is_adult:true
progressbar:true
browser_cache_path:/home/matej/.cache/mozilla/firefox/uxi2c7cz.default/cache2
[epub]
include_images:true
[www.phoenixsong.net]
username:mcepl
password:XXXXXX
[archiveofourown.org]
exclude_metadata_pre:
freeformtags,genre==Other Additional Tags to Be Added
warnings==Creator Chose Not To Use Archive Warnings
warnings==No Archive Warnings Apply
username:mcepl
password:XXXXXX
[fanfiction.net]
never_make_cover: true
user_agent:Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0
use_nsapa_proxy:true
use_cloudscraper:false
[ficwad.com]
username:mcepl
password:XXXXXX
tmp@kusansky$ fanficfare -u Birds\ become\ Dragons-ffnet_12531290.epub
Updating Birds become Dragons-ffnet_12531290.epub, URL: https://www.fanfiction.net/s/12531290/1/Birds-become-Dragons
FFF: ERROR: 2021-06-24 09:52:40,803: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
FFF: ERROR: 2021-06-24 09:52:40,804: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
FFF: ERROR: 2021-06-24 09:52:40,804: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
FFF: ERROR: 2021-06-24 09:52:40,805: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
FFF: ERROR: 2021-06-24 09:52:40,806: nsapa_proxy.py(44): proxy unavailable, socket error: [Errno 111] Connection refused
Traceback (most recent call last):
File "/home/matej/.local/bin/fanficfare", line 11, in <module>
load_entry_point('FanFicFare', 'console_scripts', 'fanficfare')()
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/cli.py", line 310, in main
do_download(url,
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/cli.py", line 410, in do_download
adapter.getStoryMetadataOnly()
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/adapters/base_adapter.py", line 308, in getStoryMetadataOnly
self.doExtractChapterUrlsAndMetadata(get_cover=get_cover)
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/adapters/adapter_fanfictionnet.py", line 114, in doExtractChapterUrlsAndMetadata
data = self.get_request(url)
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/requestable.py", line 114, in get_request
return self.get_request_redirected(url,
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/requestable.py", line 106, in get_request_redirected
(data,rurl) = self.configuration.get_fetcher().get_request_redirected(
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/fetcher.py", line 390, in get_request_redirected
fetchresp = self.do_request('GET',
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/fetcher.py", line 102, in fetcher_do_request
fetchresp = chainfn(
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/fetcher.py", line 240, in fetcher_do_request
fetchresp = chainfn(
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/fetcher.py", line 136, in fetcher_do_request
fetchresp = chainfn(
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/fetcher.py", line 363, in do_request
fetchresp = self.request(method,url,
File "/home/matej/archiv/knihovna/repos/tmp/FanFicFare/fanficfare/nsapa_proxy.py", line 174, in request
raise exceptions.FailedToDownload(
fanficfare.exceptions.FailedToDownload: nsapa_proxy: truncated reply from proxy after 5 retry
tmp@kusansky$
Oh, I didn't understand what you meant. I raise fanficfare.exceptions.FailedToDownload to tell the rest of FanFicFare that the download failed, and that seems to be the only way to do it.
I think you should ask @JimmXinu about disabling this traceback.
Anyway, the truncated reply issue is fixed.
Best regards, NS
So is the request that FFF not show an error and stack trace when a fatal error (not connecting to the proxy when configured) happens? Other than the existing continue_on_chapter_error feature, I'm not inclined to do so.
I would, however, suggest that @nsapa change the final reported error. After 5 tries that all end in Connection refused, reporting the final error as truncated reply from proxy seems off.
I am going to get rid of the retry loop and add some local exceptions (TruncatedReply, UnknowType, ProxyUnreacheable, ...). Then I will raise FailedToDownload("nsapa_proxy: %s" % str(e)).
I will probably send a PR for that this weekend.
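The plan described in this comment could look roughly like the sketch below. It is an illustration only, not the code from the eventual PR: `FailedToDownload` is a local stand-in for `fanficfare.exceptions.FailedToDownload`, the common base class `NsapaProxyError` and the helpers `fetch_via_proxy` and `request` are hypothetical, and only the three exception names and the final message format are taken from the comment.

```python
class FailedToDownload(Exception):
    """Stand-in for fanficfare.exceptions.FailedToDownload (assumption)."""


class NsapaProxyError(Exception):
    """Hypothetical common base for errors local to the proxy fetcher."""


class TruncatedReply(NsapaProxyError):
    """The proxy sent fewer bytes than it announced."""


class UnknowType(NsapaProxyError):
    """The proxy announced a content type the fetcher cannot handle."""


class ProxyUnreacheable(NsapaProxyError):
    """No connection to the proxy could be established."""


def fetch_via_proxy(connect):
    """Call connect() once and map low-level errors to local exceptions.

    `connect` is a hypothetical callable standing in for the real socket
    exchange with the proxy.
    """
    try:
        return connect()
    except ConnectionRefusedError as e:
        raise ProxyUnreacheable("proxy unavailable: %s" % e) from e


def request(connect):
    """Report the actual failure cause to the rest of the program."""
    try:
        return fetch_via_proxy(connect)
    except NsapaProxyError as e:
        raise FailedToDownload("nsapa_proxy: %s" % str(e)) from e
```

With this shape, a refused connection is reported as a proxy that cannot be reached rather than as a truncated reply, which addresses the mismatch pointed out above.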
> So is the request that FFF not show an error and stack trace when a fatal error (not connecting to the proxy when configured) happens? Other than the existing continue_on_chapter_error feature, I'm not inclined to do so.
I was thinking more of a value required for use_nsapa_proxy, which would mean that as soon as FFF finds out there is no proxy running, it would just bail out of the whole download. Of course, in the normal case, the error should still be reported.
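To make the proposal concrete, here is a small sketch of how such a three-valued setting might be read and interpreted. It is purely hypothetical: FanFicFare does not accept `required` for `use_nsapa_proxy` as far as this thread shows, the functions `proxy_mode` and `on_proxy_missing` are invented for illustration, and the standard-library `configparser` is used here only because it accepts the `key:value` syntax of personal.ini.

```python
import configparser

VALID_MODES = ("true", "false", "required")


def proxy_mode(ini_text, section="www.fanfiction.net"):
    """Read use_nsapa_proxy from ini_text as 'true', 'false' or 'required'."""
    # interpolation=None so that '%' characters in other values are harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    value = parser.get(section, "use_nsapa_proxy", fallback="false")
    value = value.strip().lower()
    if value not in VALID_MODES:
        raise ValueError("unexpected use_nsapa_proxy value: %r" % value)
    return value


def on_proxy_missing(mode):
    """Decide what to do when the proxy cannot be reached.

    'required' -> skip the story with a warning (the behaviour proposed here)
    'true'     -> raise, as happens today
    'false'    -> the proxy is not used, so there is nothing to do
    """
    if mode == "required":
        return "skip"
    if mode == "true":
        return "raise"
    return "ignore"
```

For a `[www.fanfiction.net]` section containing `use_nsapa_proxy:required`, `on_proxy_missing(proxy_mode(text))` yields `"skip"`, while the current `true` setting keeps the existing behaviour.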
Output of fanficfare:
Output of the proxy:
The script continues, so hopefully it eventually manages to download the whole story, but some pages seem to require a large number of tries (twenty or more).
Using: Python 3.8.10, fanfictionnet_ff_proxy commit 81532e0, fanficfare 4.3.5, Linux/openSUSE/Tumbleweed