DobyTang / LazyLibrarian

This project isn't finished yet. Goal is to create a SickBeard, CouchPotato, Headphones-like application for ebooks. Headphones is used as a base, so there are still a lot of references to it.

Libgen.io test error #1642

Closed retrosyn closed 5 years ago

retrosyn commented 5 years ago

To help with identifying and fixing issues, please include as much information as possible, including:

LazyLibrarian version number - a4fd6e6241cb9cb8427ffd732fa53a6abd13a68b

Operating system used (windows, mac, linux, NAS type) - linux

Interface in use (default, bookstrap) - bookstrap

Which api (Goodreads, GoogleBooks, both) - Goodreads

Source of your LazyLibrarian installation (git, zip file, 3rd party package) - 3rd party (installed on seedbox by one click)

Relevant debug log with api keys and any passwords redacted

These are the errors:

2018-10-28 08:41:29 DEBUG TESTPROVIDER directparser.py GEN 115 Error fetching page data from libgen.io: Exception ConnectionError: ('Connection aborted.', BadStatusLine("''",))
2018-10-28 08:41:29 DEBUG TESTPROVIDER directparser.py GEN 114 http://libgen.io/search.php?column=def&res=100&req=Agatha%2BChristie&phrase=0&open=0&view=simple
2018-10-28 08:41:29 DEBUG TESTPROVIDER providers.py test_provider 67 Testing provider GEN

The torrent provider test passed.

debug.zip

philborman commented 5 years ago

Odd error message. I did a quick Google search and it seems some providers return that error if they detect you are scraping their site. I'm not sure how they detect it, but it's not happening here; it works fine for me. We aren't really supposed to scrape the site; it should be accessed by a browser so they get advertising revenue. Are you maybe behind a VPN they don't like, or have you hit the site too many times? It could be an access limit, though most providers return error 111 for limit exceeded. It might be worth putting https://libgen.io instead of the default, as https might bypass the scraper detection.

retrosyn commented 5 years ago

No, I am not using a VPN but I am using a seedbox on which lazylibrarian is installed if that matters? I can access the website fine from my browser. If I put https://libgen.io I am getting this error:

2018-10-28 13:50:50 DEBUG TESTPROVIDER directparser.py GEN 115 Error fetching page data from libgen.io: Exception SSLError: HTTPSConnectionPool(host='libgen.io', port=443): Max retries exceeded with url: /search.php?column=def&res=100&req=Agatha%2BChristie&phrase=0&open=0&view=simple (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",),))

philborman commented 5 years ago

OK, your seedbox doesn't have working SSL libraries available (the certificate check fails), so that option's not available to try. It certainly looks like a "scraper detection" issue: the URL works from a browser, so the site is realising you are not a browser and stopping access. It's working for me in lazylibrarian, though.

Is it possible you have hit an access limit, too many calls or too many downloads? Your browser sends a different ID (user agent header) from lazylibrarian, so it might not be blocked in the same way. Maybe try a mirror like gen.lib.rus.ec, or try 93.174.95.27.
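
For anyone wanting to check this outside lazylibrarian, here is a rough sketch of the two ideas above: send a browser-like user agent header, and fall back to a mirror if the first address fails. It uses only the Python 3 standard library and is not lazylibrarian's own fetch code. The `BROWSER_UA` string and the `fetch` / `first_working_mirror` helpers are made up for illustration; the hosts and search path are the ones from this thread.

```python
import http.client
import urllib.request

# Hypothetical browser-like user agent string, for illustration only.
BROWSER_UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/70.0 Safari/537.36")
# Hosts mentioned in this thread, tried in order.
MIRRORS = ["http://libgen.io", "http://gen.lib.rus.ec", "http://93.174.95.27"]
# Search path taken from the debug log above.
SEARCH_PATH = ("/search.php?column=def&res=100&req=Agatha%2BChristie"
               "&phrase=0&open=0&view=simple")


def fetch(url, timeout=30):
    """Fetch a URL with a browser-like User-Agent header; return the body as bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def first_working_mirror(mirrors, path, fetcher=fetch):
    """Return (mirror, body) for the first mirror that answers, else (None, None)."""
    for mirror in mirrors:
        try:
            return mirror, fetcher(mirror + path)
        except (OSError, http.client.HTTPException) as exc:
            # OSError covers URLError, connection resets and SSL errors;
            # HTTPException covers BadStatusLine, which is not an OSError.
            print("%s failed: %r" % (mirror, exc))
    return None, None
```

Calling `first_working_mirror(MIRRORS, SEARCH_PATH)` would hit the real sites, so the `fetcher` argument is there to let you swap in a stub when testing.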

retrosyn commented 5 years ago

gen.lib.rus.ec is working. It passed the test.

philborman commented 5 years ago

Good news. I will close this for now. I wonder if we can get an error code we can trap to give a better message.
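
One possible shape for that, as a sketch only: map the two failures seen in this thread (the BadStatusLine on http and the certificate failure on https) to friendlier hints. `explain_fetch_error` is a hypothetical helper, not something that exists in lazylibrarian, and the hint wording is just a suggestion. It checks the exception text as well as the type, on the assumption that the fetch library may wrap the real cause inside its own ConnectionError or SSLError, as the log lines above suggest.

```python
import http.client
import ssl


def explain_fetch_error(exc):
    """Turn a low-level fetch exception into a hint for the user (hypothetical helper)."""
    # Wrapping libraries often bury the real cause inside another exception,
    # so inspect the text representation as well as the exception type.
    text = "%s: %r" % (type(exc).__name__, exc)
    if isinstance(exc, ssl.SSLError) or "certificate verify failed" in text:
        return ("SSL handshake failed, often because the site certificate "
                "cannot be verified on this machine; try the http:// address "
                "or update the CA certificates")
    if isinstance(exc, http.client.BadStatusLine) or "BadStatusLine" in text:
        return ("Server closed the connection without a valid reply; the site "
                "may be blocking non-browser clients, try a mirror")
    return "Unrecognised error: " + text
```

The caller would log the returned hint alongside the raw exception, so the original detail is still there for debugging.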