Nandaka / PixivUtil2

Download images from Pixiv and more!
http://nandaka.devnull.zone/
BSD 2-Clause "Simplified" License

[pixivFANBOX] API changes now prevent multi-page downloads (invalid URL concatenation) #494

Closed AgentThirteen closed 5 years ago

AgentThirteen commented 5 years ago

Prerequisites

Description

API changes to pixivFANBOX now seem to prevent multi-page download. PixivUtil2 attempts to perform a clearly incorrect concatenation after page 1.

The issue might be in PixivBrowserFactory.py, line 587 (else branch in fanboxGetPostsFromArtist):

    if next_url is None or next_url == "":
        url = "https://www.pixiv.net/ajax/fanbox/creator?userId={0}".format(artist_id)
    else:
        url = "https://www.pixiv.net" + next_url

Dump

    Processing 3318706, page 2
    Getting posts from https://www.pixiv.nethttps://fanbox.pixiv.net/api/post.listCreator?userId=3318706&maxPublishedDatetime=2018-12-23%2022%3A15%3A41&maxId=235672&limit=10
    1 2 3 4 1 2 3 4
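The doubled host in the dump comes from prefixing an already absolute next_url with https://www.pixiv.net. One way to tolerate both relative and absolute values is urljoin from the Python 3 standard library, as in the minimal sketch below. The helper name build_posts_url and the BASE_URL constant are hypothetical and are not taken from PixivUtil2; only the two URL shapes come from this report.

```python
from urllib.parse import urljoin

# Assumption: the host that relative next_url values should be resolved against.
BASE_URL = "https://www.pixiv.net"


def build_posts_url(artist_id, next_url=None):
    """Return the URL for the next page of posts (hypothetical helper)."""
    if not next_url:
        # First page: no cursor yet, so use the creator endpoint from the report.
        return "{0}/ajax/fanbox/creator?userId={1}".format(BASE_URL, artist_id)
    # urljoin keeps an absolute next_url unchanged and resolves a relative
    # one against BASE_URL, so the host is never prepended twice.
    return urljoin(BASE_URL, next_url)
```

With this, a relative path such as /ajax/fanbox/creator?... is resolved against the base, while a full https://fanbox.pixiv.net/... URL is returned as-is.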

Steps to Reproduce

  1. Run option f1 or f2
  2. Set Max Page to 2 or higher
  3. As soon as page 2 is reached, the script tries to process a URL it cannot reach and keeps retrying the connection until the config.ini parameters tell it to give up or the user force-aborts the script

Expected behavior: Get content from all posts

Actual behavior: Get content from page 1 then get stuck in a loop trying to connect to an invalid URL

Versions

v20190526 and any older version that supports pixivFANBOX multi-page download using this method

You can get this information from executing PixivUtil2.py --help.

Nandaka commented 5 years ago

It looks like the next URL given by the server is wrong. [image]

Trying to open https://fanbox.pixiv.net/api/post.listCreator?userId=3318706&maxPublishedDatetime=2018-12-23%2022%3A15%3A41&maxId=235672&limit=10 returns a general error. [image]

Nandaka commented 5 years ago

try https://github.com/Nandaka/PixivUtil2/releases/tag/v20190528

AgentThirteen commented 5 years ago

Thanks for the quick reply and fix Nandaka. The replace workaround apparently did the trick.

This no longer seems to perform a sanity check on Max Page being the last page though so beware! :)

Edit: Actually, it appears to loop on page 2 now. Oops.

Nandaka commented 5 years ago

Looks like it returns the same next URL, so is this a bug on the server side?

Can you open the fanbox page and see whether it loads the next page? I don't have a fanbox account to check.
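Whatever the cause turns out to be, a paging loop can protect itself against a next URL that repeats. The Python 3 sketch below stops as soon as a URL has already been visited or a page limit is reached. The names iter_pages and fetch_page are hypothetical; fetch_page stands in for whatever function performs the request and returns a (posts, next_url) pair, and it is not PixivUtil2's actual API.

```python
def iter_pages(fetch_page, first_url, max_page=0):
    """Yield (page_number, posts), stopping on a repeated next URL.

    fetch_page is a hypothetical callable: url -> (posts, next_url).
    max_page == 0 means "no limit".
    """
    seen = set()
    url = first_url
    page = 0
    while url and url not in seen:
        if max_page and page >= max_page:
            break
        seen.add(url)
        page += 1
        posts, url = fetch_page(url)
        yield page, posts


# Usage with a fake fetcher whose second page points back at itself.
fake_site = {
    "page-1": (["post A"], "page-2"),
    "page-2": (["post B"], "page-2"),  # same next URL forever
}
pages = list(iter_pages(fake_site.__getitem__, "page-1"))
```

Here the loop ends after the second page instead of requesting it again.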

ee092884 commented 5 years ago

bandicam 2019-05-29 00-36-03-309.zip

I ran into a problem too, downloading the same author. It did not behave like this before; now it goes into an infinite loop... At 1:35 in the video, the page count has already passed 100, and it still has not ended.

ee092884 commented 5 years ago

I tested other authors and had the same problem. Some download only 3 images and then enter the infinite loop...

AgentThirteen commented 5 years ago

@Nandaka Sure, checking right now. I should have linked to a public account in the dump to make things easier for you.

Unfortunately, it doesn't look like it's a server-side issue. Hopefully this network trace will help you. Here are the XML HTTP requests for uid 14694404 (5 pages, 54 posts). It loads fine in Firefox.

    GET XHR https://www.pixiv.net/ajax/fanbox/creator?userId=14694404 [HTTP/2.0 200 OK 562ms]
    GET XHR https://fanbox.pixiv.net/api/post.listCreator?userId=14694404&maxPublishedDatetime=2019-03-13%2003%3A13%3A17&maxId=315820&limit=10 [HTTP/2.0 200 OK 297ms]
    GET XHR https://fanbox.pixiv.net/api/post.listCreator?userId=14694404&maxPublishedDatetime=2019-02-05%2004%3A33%3A21&maxId=279033&limit=10 [HTTP/2.0 200 OK 297ms]
    GET XHR https://fanbox.pixiv.net/api/post.listCreator?userId=14694404&maxPublishedDatetime=2018-11-13%2016%3A01%3A23&maxId=200278&limit=10 [HTTP/2.0 200 OK 297ms]
    GET XHR https://fanbox.pixiv.net/api/post.listCreator?userId=14694404&maxPublishedDatetime=2018-09-13%2003%3A36%3A09&maxId=150238&limit=10 [HTTP/2.0 200 OK 297ms]
    GET XHR https://fanbox.pixiv.net/api/post.listCreator?userId=14694404&maxPublishedDatetime=2018-05-16%2007%3A57%3A07&maxId=46379&limit=10 [HTTP/2.0 200 OK 298ms]

Maybe something's off with the maxPublishedDatetime and maxId parameters when attempting to replace https://fanbox.pixiv.net/api/post.listCreator with /ajax/fanbox/creator (to form a proper URL with https://www.pixiv.net if I got that right) and the script keeps using the same ones instead of fetching the next ones...?
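To check whether the paging parameters really advance between requests, the cursor can be read out of each URL and compared. The Python 3 sketch below does this with the standard library. The parameter names maxPublishedDatetime and maxId come from the trace above, while the helper names paging_cursor and is_stuck are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def paging_cursor(url):
    """Return (maxPublishedDatetime, maxId) from a URL, or None for missing ones."""
    query = parse_qs(urlsplit(url).query)
    published = query.get("maxPublishedDatetime", [None])[0]
    max_id = query.get("maxId", [None])[0]
    return published, max_id


def is_stuck(current_url, next_url):
    """True when the next URL carries the same cursor as the current one."""
    return paging_cursor(current_url) == paging_cursor(next_url)
```

Applied to two consecutive requests from the trace, is_stuck returns False because maxId changes from 315820 to 279033; it returns True when the same URL is handed back again.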

ee092884 commented 5 years ago

bandicam 2019-05-29 01-23-19-546.zip

*The video is a comparison between 20190526 and 20190528.

Going back to version 20190526, I get 1234 1234, and after that the program shuts down by itself. Is this a problem with fanbox, or is it a problem on my side??

Nandaka commented 5 years ago

"if I got that right) and the script keeps using the same ones instead of fetching the next ones...?"

No, the nextUrl node in the server's response from https://www.pixiv.net/ajax/fanbox/creator gives the same URL, and https://fanbox.pixiv.net/api/post.listCreator always gives an error in my browser.

AgentThirteen commented 5 years ago

Oh right. I'm trying to figure out what exactly changed since the authentication process doesn't seem to be different (if it were the script wouldn't get anything at all).

It would be very helpful if someone could test it on a public fanbox page with more than 20 posts to try and get it working after page 2.

Nandaka commented 5 years ago

Try https://github.com/Nandaka/PixivUtil2/releases/tag/v20190529

Nandaka commented 5 years ago

Looks like the request header needs to be customized. It should be working in that version; at least, I tested it with my account on a public post.
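The thread does not say which headers were changed, so the following Python 3 sketch only illustrates the general idea of attaching custom headers to a request with the standard library. The Accept, Origin and Referer values are assumptions chosen for illustration, and build_fanbox_request is a hypothetical helper, not the code in v20190529.

```python
from urllib.request import Request


def build_fanbox_request(url, artist_id):
    """Build (but do not send) a request with custom headers.

    All header values below are assumptions for illustration only.
    """
    headers = {
        "Accept": "application/json, text/plain, */*",
        "Origin": "https://www.pixiv.net",
        "Referer": "https://www.pixiv.net/fanbox/creator/{0}".format(artist_id),
    }
    return Request(url, headers=headers)


req = build_fanbox_request(
    "https://fanbox.pixiv.net/api/post.listCreator?userId=14694404&limit=10",
    14694404,
)
```

The request object can then be passed to urllib.request.urlopen or to an opener that carries the session cookies.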

AgentThirteen commented 5 years ago

That was it. I have tried both f1 and f2. Everything is now working as intended again.

Thanks a bunch for your hard work Nandaka!