mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0
11.16k stars 910 forks

Failure on exhentai links #37

Closed Bfgeshka closed 6 years ago

Bfgeshka commented 7 years ago

Happens now if credentials are provided. Verbose output.

Version is master head.

mikf commented 7 years ago

I can't reproduce this issue on my end, so there is probably something wrong with your login session. A few things that you could try to pinpoint this problem:

It would be good to know if any of these 3 methods work and if this only happens for this particular gallery or for all of them.

edit: gallery-dl is probably getting the sadpanda.jpg, which is why the charset detection is being used.
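As a rough way to check the sadpanda.jpg theory, one could test whether a response body is an image rather than an HTML page. The sketch below is a hypothetical helper, not part of gallery-dl; it only compares the first bytes of a body against well-known image file signatures.

```python
# Hypothetical helper (not gallery-dl code): guess whether a response body
# is an image, such as sadpanda.jpg, instead of an HTML gallery page.

# Well-known file signatures ("magic numbers") for common image formats.
IMAGE_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image(body: bytes):
    """Return the image type if body starts with a known signature, else None."""
    for signature, kind in IMAGE_SIGNATURES.items():
        if body.startswith(signature):
            return kind
    return None
```

If `sniff_image()` returns "jpeg" for a page that should be HTML, the site most likely answered with an image instead of the gallery.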

Bfgeshka commented 7 years ago
  1. Deleting cache file didn't work.

  2. Providing only ipb_member_id and ipb_pass_hash results in an anonymous download. Additionally providing a username (no password) gives the same error as before.

  3. Anonymous downloads via g.e- do work, but it is not a desired result.

Account itself is valid and working btw.

mikf commented 7 years ago

The fact that anonymous downloads work at least means that the extractor itself is working and the issue is most likely tied to your account or your internet connection.

You also didn't provide your cookies in a way that gallery-dl can access them, as it doesn't fall back to anonymous mode if these two values are present. Make sure that this part in your config file looks something like this:

{
  "extractor": {
    "exhentai": {
      "cookies": {
        "ipb_member_id": "<your id>",
        "ipb_pass_hash": "<your password hash>"
      }
    }
  }
}
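To check that the two values really sit inside a "cookies" object, a small script along these lines could help. It is only a sketch: the function names are made up for illustration, and it inspects the nesting shown above rather than reproducing how gallery-dl itself loads its configuration.

```python
import json


def exhentai_cookies(config: dict) -> dict:
    """Return the dict at extractor -> exhentai -> cookies, or {} if absent."""
    section = config.get("extractor", {}).get("exhentai", {})
    cookies = section.get("cookies", {})
    return cookies if isinstance(cookies, dict) else {}


def missing_cookies(config: dict,
                    required=("ipb_member_id", "ipb_pass_hash")) -> list:
    """List the required cookie names that are missing or empty."""
    cookies = exhentai_cookies(config)
    return [name for name in required if not cookies.get(name)]


# Layout from the example above: values wrapped in "cookies".
good = json.loads("""
{
  "extractor": {
    "exhentai": {
      "cookies": {
        "ipb_member_id": "<your id>",
        "ipb_pass_hash": "<your password hash>"
      }
    }
  }
}
""")

# Values placed directly under "exhentai", without the "cookies" wrapper.
bad = {"extractor": {"exhentai": {"ipb_member_id": "<your id>",
                                  "ipb_pass_hash": "<your password hash>"}}}

print(missing_cookies(good))  # []
print(missing_cookies(bad))   # ['ipb_member_id', 'ipb_pass_hash']
```

To test a real file, load it with `json.load()` and pass the result to `missing_cookies()`; an empty list means both values are where this layout expects them.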

Another thing that you could try out would be to run the extractor test for exhentai: navigate to the test directory, run $ python3 test_extractors.py exhentai and post the output here.

I'm really not sure why it doesn't work for you, so all of these methods are just there to narrow down the problem and find the actual "culprit". Maybe you are using a VPN or proxy that is causing issues, or your account is too young ... but you said that it is working in your browser, so this couldn't be it either ...

Bfgeshka commented 7 years ago

My bad, I didn't wrap fields into "cookies": {}, I'll check it out in a couple of hours.

I do indeed use a VPN, but again, it works in the browser in the same session.

Bfgeshka commented 7 years ago

Well, using the cookies method did not change anything; the result is the same as with the username/password pair.

The tests failed as well. Here is the stderr output

mikf commented 7 years ago

The tests actually use their own account, which has been specifically registered for testing purposes, but the login step fails even for that. I did notice that custom HTTP headers only got set after the extractor attempts to log into e-hentai and fixed it, but that doesn't really matter if even manually setting cookies doesn't help (this completely skips the login-step).

Given that using exhentai works in your browser but not in gallery-dl and you are using the same account for both, my last guess would be that your browser and gallery-dl are somehow using different network configurations. One way that this can happen would be if HTTP_PROXY or HTTPS_PROXY are environment variables in your shell session that are set to valid http proxy servers. (These values get picked up and used by Python Requests and therefore gallery-dl as well)
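A quick way to see whether such variables are set is to print them from Python itself. This is a minimal sketch: `proxy_environment` is a name made up here, and checking the lowercase spellings as well is an assumption that they may also be honored on some systems. `urllib.request.getproxies()` is the standard-library call that reports which proxies Python detects.

```python
import os
import urllib.request

# Variable names from the comment above; the lowercase variants are an
# assumption added for completeness.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY")


def proxy_environment(environ=os.environ) -> dict:
    """Return the proxy-related variables that are set and non-empty."""
    found = {}
    for name in PROXY_VARS:
        for variant in (name, name.lower()):
            value = environ.get(variant)
            if value:
                found[variant] = value
    return found


if __name__ == "__main__":
    print("proxy variables in environment:", proxy_environment())
    print("proxies detected by urllib:", urllib.request.getproxies())
```

Run it in the same shell session that is used to start gallery-dl; if both outputs are empty, the environment is not injecting a proxy.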

Bfgeshka commented 7 years ago

This is really odd.

I do use a VPN connection system-wide and I do not have any additional proxy variables in my environment. Could it be because of a different user-agent?

edit: no, it is not the user-agent. It is a problem with the connection: after I switched from a Japanese server to a Korean one, it works like a charm.

But why does this happen in the first place? In the browser, access to the site itself and to any gallery on it is fine regardless of the current tunnel.

mikf commented 7 years ago

So you've found the cause of the problem, nice. And it isn't too surprising that exhentai is blocking access from Japanese IPs, given the content it is hosting.

This is most likely not why you can access this site in your browser and not with gallery-dl, but maybe your browser is ignoring your VPN settings (not really possible, right?) or it is using some extra proxy or another route in addition to your VPN tunnel, so that your IP does not originate from Japan? Or maybe the small difference in HTTP headers matters. If you want to play around with that, you can edit the headers that get sent to match the ones of your browser and see if that does anything: exhentai.py lines 52-56

Bfgeshka commented 7 years ago

So, it turns out that it is not about an HTTP header; it is a cookie-related problem. After I copied all the cookies from my browser session (there are not 2, there are 6 of them), it started working from Japan too.
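Copying every cookie by hand is error-prone, so a browser's Cookie request header could be converted into the config layout discussed earlier with a few lines of Python. This is only a sketch: `cookie_header_to_config` is a hypothetical helper, the values are placeholders, and "igneous" is the cookie named later in this thread.

```python
import json


def cookie_header_to_config(header: str) -> dict:
    """Turn 'name=value; name2=value2' into the nested cookies config."""
    # Accept a full header line copied from the browser's developer tools.
    if header.lower().startswith("cookie:"):
        header = header[len("cookie:"):]

    cookies = {}
    for pair in header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:  # skip empty or malformed fragments
            cookies[name] = value

    return {"extractor": {"exhentai": {"cookies": cookies}}}


# Placeholder values; paste the real header from the browser session here.
header = ("Cookie: ipb_member_id=<your id>; "
          "ipb_pass_hash=<your password hash>; igneous=<value>")

print(json.dumps(cookie_header_to_config(header), indent=2))
```

The printed JSON can then be merged into an existing config file by hand.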

mikf commented 6 years ago

I've done a couple of tests with some public proxy servers from the US and Japan using my personal and my unittest accounts:

I've also reworked the login-procedure to work pretty much exactly like it would when using a browser, which has the advantage of producing all the necessary cookies, but it might not work when using a Japanese VPN. Please test this by, again, deleting your cache file and removing all cookie settings from your config (or by using -o cookies= -o cache.file= as cmdline options).

If logging in works, you should be able to access exhentai. If it doesn't, I'm going to revert this last commit and you would have to set the "igneous" cookie value (and only this one) in your config.

Bfgeshka commented 6 years ago

I'm afraid that even with this commit I still get a sadpanda. The cache was purged and the cookies section removed from the config; I used only the user/pass pair.

mikf commented 6 years ago

Then I really don't know what else I could do to make this work for you. You are going to have to supply your cookies via your config-file, I'm afraid.

If login with username/password works - I'm going to assume this has always worked, right? - then it should be ok to not set "ipb_member_id" and "ipb_pass_hash" since these will be taken from the cached login session.