upbit / pixivpy

Pixiv API for Python
https://pypi.org/project/PixivPy3/#files
The Unlicense

"'NoneType' object is not iterable" for illust_ranking(**next_qs, req_auth=True) #278

Closed tudubucket closed 1 year ago

tudubucket commented 1 year ago

Hi, I'm currently using this package to automatically download images from pixiv. Below is part of my code:

downloaded = 0
logger.info("Starting ...")
stop_check = True
for entry in entries:
    next_qs = {"mode": f"{str(entry).replace('weekly_r18', 'week_r18').replace('daily_r18', 'day_r18')}"}
    i = 0
    while next_qs:
        try:
            if i > 3: break
            i = i + 1
            try:
                json_result = api.illust_ranking(**next_qs, req_auth=True)
            except Exception as e:
                logger.error(f"An error occurred while loading result for {entry}: {str(e)}")
                traceback.print_exc()
                time.sleep(5)
                refresh(PIXIV_REFRESH_TOKEN)
                continue
            for illust in json_result.illusts:
                if 'manga' in str(illust.tags): continue
                if illust.type == 'illust' and illust.page_count <= 15:
                    if str(illust.page_count) == '1':
                        try:
                            status = None
                            if 'r18' in entry:   status = api.download(path=f"art/r18/{str(entry).replace('_r18', '')}/", url=illust.meta_single_page.original_image_url, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}.jpg'))
                            if entry == 'day':   status = api.download(path=f"art/regular/daily/", url=illust.meta_single_page.original_image_url, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}.jpg'))
                            if entry == 'week':  status = api.download(path=f"art/regular/weekly/", url=illust.meta_single_page.original_image_url, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}.jpg'))
                            if entry == 'month': status = api.download(path=f"art/regular/monthly/", url=illust.meta_single_page.original_image_url, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}.jpg'))
                            # api.download(path="art/author_avatar", url=illust.user.profile_image_urls.medium, name=f'{illust.user.id}.jpg')
                            if status == True: 
                                logger.info(f"Successful downloaded image: {entry} {illust.id}")
                                downloaded += 1
                        except Exception as e:
                            traceback.print_exc()
                            logger.error(f"An error occurred in {entry} ({str(entry).replace('weekly_r18', 'week_r18').replace('daily_r18', 'day_r18')}, 1 page): " + str(e))
                            continue
                    else:
                        try:
                            download_path = ''
                            if 'r18' in entry:   
                                if illust.page_count > 3: continue
                                download_path = f"art/r18/{str(entry).replace('_r18', '')}/"
                            if entry == 'day':   download_path = "art/regular/daily/"
                            if entry == 'week':  download_path = "art/regular/weekly/"
                            if entry == 'month': download_path = "art/regular/monthly/"
                            counter = 1
                            for image in illust.meta_pages:
                                status = None
                                status = api.download(path=download_path, url=image.image_urls.original, name=sanitize_file_name(f'{illust.id}---{illust.title}---{illust.user.id}---{illust.user.account}---{counter}.jpg'))
                                if status == True: 
                                    logger.info(f"Successful downloaded image: {entry} {illust.id} | Page {counter}")
                                    downloaded += 1
                                counter += 1
                        except Exception as e:
                            logger.error(f"An error occurred in {entry} ({str(entry).replace('weekly_r18', 'week_r18').replace('daily_r18', 'day_r18')}, {illust.page_count} pages): " + str(e))
                            traceback.print_exc()
                            continue
            if json_result.next_url is not None:
                next_qs = api.parse_qs(json_result.next_url)
            else: break
        except Exception as e:
            logger.error(f"An error occurred while loading page for {entry}: {str(e)}")
            traceback.print_exc()
            time.sleep(5)
            refresh(PIXIV_REFRESH_TOKEN)
            continue
logger.info(f'Task completed download {downloaded} illustration.')

But there is a problem: sometimes illust_ranking(**next_qs, req_auth=True) returns None, or something else that my code cannot work with:

image

I am wondering: have I hit a rate limit from pixiv, or have I done something wrong?

Xdynix commented 1 year ago

It's hard to say what the exact reason is without seeing the whole response content. You can try adding more logging statements, especially printing out json_result. I don't see any obvious problem in your code; I would fetch the illustrations the same way.
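A minimal sketch of that kind of logging, assuming the value returned by illust_ranking() behaves like a dict and that a failed call simply lacks the "illusts" entry (which would explain the "'NoneType' object is not iterable" message). The helper name illusts_or_raise is hypothetical, not part of pixivpy:

```python
import logging

logger = logging.getLogger(__name__)


def illusts_or_raise(json_result):
    """Return the list of illusts, or log the raw response and raise.

    Assumption: json_result is dict-like (or None) and a failed request
    has no "illusts" entry.
    """
    illusts = json_result.get("illusts") if json_result else None
    if illusts is None:
        # Log the whole payload so that an "error" entry, if any, is visible.
        logger.error("Unexpected ranking response: %r", json_result)
        raise ValueError("ranking response has no 'illusts' field")
    return illusts
```

In the loop above this would replace the direct access, for example `for illust in illusts_or_raise(json_result):`, so the log shows the full response instead of only the TypeError.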

tudubucket commented 1 year ago

Thanks for your really fast response! I searched this repo for the same error but found nothing. Your response suggests that the code should work as expected, which is good news: I was worried about a rate limit, and having no way around that would be sad.

I will look into my code and gather more information when it returns an error.

Actually, my code runs at a specific time every day, usually 10 PM UTC. When I modified the code to run right after my server starts, it worked again. If you can, please keep this issue open for 1 - 2 days; I will get more information if possible. If there is a problem in my code, I will post it here with a solution and close this issue.

Anyway, I did all of this on a virtual private server. Will this affect any part of the code?

Xdynix commented 1 year ago

Using a VPS should make no difference, unless the IP of the data center is blocked by Pixiv. But in that case you would always get an error response, and that doesn't seem to be what is happening.

tudubucket commented 1 year ago

I'm back after 1 day of looking for the error. This is what it is telling me:

image

I have a token refresh function that is called at the start of the main function and in the exception handlers:

def refresh(refresh_token):
    response = requests.post(
        AUTH_TOKEN_URL,
        data={
            "client_id": CLIENT_ID,
            "client_secret": CLIENT_SECRET,
            "grant_type": "refresh_token",
            "include_policy": "true",
            "refresh_token": refresh_token,
        },
        headers={"User-Agent": USER_AGENT},
    )
    data = response.json()
    try:
        # a = data.get("expires_in", 0)
        logger.info(f'Successfully renew pixiv token with {data.get("expires_in", 0)} seconds remaining.')
    except KeyError:
        logger.warning(f'Unable to renew pixiv token...')

Any idea about this?

Xdynix commented 1 year ago

In your refresh() I don't see you passing the refreshed access token back to the API instance, so it is probably still using the expired one. You can call api.auth() (without any argument) to trigger the refresh, instead of writing it yourself.

TL;DR: Replace refresh(PIXIV_REFRESH_TOKEN) with api.auth().


Personally, I would record the expiration time of the current access token after each authentication (api.auth()). Then, before each request is sent, I would check whether the access token is close to expiration (e.g., within 2 minutes) and, if so, refresh it first. This way I don't need to wait until I encounter an error to refresh the access token.
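A rough sketch of that idea, not code from pixivpy. The class TokenRefresher and its parameters are hypothetical; the token lifetime has to be supplied by the caller (for example from the expires_in value of the token response), and the 120-second margin mirrors the "within 2 minutes" suggestion:

```python
import time


class TokenRefresher:
    """Re-authenticate shortly before the access token is due to expire."""

    def __init__(self, auth_fn, lifetime_seconds, margin_seconds=120,
                 clock=time.monotonic):
        self._auth_fn = auth_fn            # e.g. api.auth (assumed usage)
        self._lifetime = lifetime_seconds  # assumed token lifetime in seconds
        self._margin = margin_seconds
        self._clock = clock
        self._expires_at = None            # unknown until the first auth

    def ensure_fresh(self):
        """Call auth_fn if no token was obtained yet or expiry is near.

        Returns True when a re-authentication happened, False otherwise.
        """
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at - self._margin:
            self._auth_fn()
            self._expires_at = self._clock() + self._lifetime
            return True
        return False
```

Usage would look like `refresher = TokenRefresher(api.auth, lifetime_seconds=3600)` followed by `refresher.ensure_fresh()` before each `api.illust_ranking(...)` call. The 3600 here is only a placeholder; use the lifetime your own token response reports.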

tudubucket commented 1 year ago

Thanks for your response! I will list everything I've changed in my code, to make sure that I didn't do anything wrong:

I will try this out. If everything works fine, I will close this issue myself. Otherwise, I will call you again :D

katresars commented 1 year ago

proxies?

upbit commented 1 year ago

> I'm back after 1 day of looking for error, this is what it is telling me:
>
> ![image](https://user-images.githubusercontent.com/106295287/263436737-5e190541-2b69-4cf8-9649-ee2889289be2.png)

From the returned JSON, it seems that the access token expired while the script was running. Generating a new bearer token with auth(refresh_token=LAST_TOKEN) should solve the issue.
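One way to apply this at the call site is to re-authenticate and retry once when the response reports an error. This is only a sketch: call_with_reauth is a hypothetical helper, and it assumes that a failed request comes back as a dict-like object with an "error" entry rather than raising an exception.

```python
def call_with_reauth(request_fn, reauth_fn, max_retries=1):
    """Call request_fn(); on an "error" response, re-authenticate and retry.

    Assumption: a failed request returns a dict-like object that contains
    a truthy "error" entry. Any other result is returned unchanged.
    """
    result = request_fn()
    for _ in range(max_retries):
        if not (result and result.get("error")):
            break  # success, or a shape this helper does not handle
        reauth_fn()            # e.g. api.auth (assumed usage)
        result = request_fn()  # retry with the new access token
    return result
```

For the ranking loop this could be used as `json_result = call_with_reauth(lambda: api.illust_ranking(**next_qs), api.auth)`.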

tudubucket commented 1 year ago

Thanks for your response, but how do i get the last token?

upbit commented 1 year ago

See https://github.com/upbit/pixivpy/issues/158#issuecomment-778919084 to get your refresh_token, then save it to a file or a constant, as demo.py does.

refresh_token can be used for a long period of time, and it rarely needs to be updated.

Xdynix commented 1 year ago

> Thanks for your response, but how do i get the last token?

You can also use last_refresh_token = api.refresh_token. But calling api.auth() will automatically do the same thing (use the previous refresh token to get a new access token).

tudubucket commented 1 year ago

Kinda confused here, but ima try out api.auth() first xD

tudubucket commented 1 year ago

It seems that calling api.auth() in every download instance solved this problem. But I will keep watching it for about 24 more hours to make sure it renews the token automatically. If the problem no longer occurs after that time, I will close this issue.

Below is a download instance that automatically renews the token using the fix above (api.auth()), with no more errors: image

katresars commented 1 year ago

Setting proxies to bypass Cloudflare may solve the auth problem.

tudubucket commented 1 year ago

Calling api.auth() in every download instance solved this problem; tested over 2 days.