upbit / pixivpy

Pixiv API for Python
https://pypi.org/project/PixivPy3/#files
The Unlicense
1.8k stars 147 forks

pixiv_auth.py Is Gone / How to Get Refresh Token #344

Open SakiSakiSakiSakiSaki opened 8 months ago

SakiSakiSakiSakiSaki commented 8 months ago

@upbit I can't find pixiv_auth.py in your repo anywhere.

How are we supposed to get new refresh tokens?

Xdynix commented 8 months ago

Because of #158, password login no longer exists. Please use api.auth(refresh_token=REFRESH_TOKEN) instead.

To get a refresh_token, see @ZipFile's Pixiv OAuth Flow or OAuth with Selenium/ChromeDriver.

Ref: top of README.md.

The script you mentioned was likely shared by others in the original thread.
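For reference, a minimal sketch of the refresh-token login described above. It relies only on the api.auth(refresh_token=...) call quoted in this thread and on the access_token / refresh_token attributes that pixivpy3 sets on the client after a successful login. The `login` helper and the `PIXIV_REFRESH_TOKEN` environment variable are hypothetical names used for illustration.

```python
def login(api, refresh_token):
    """Log in with a saved refresh token; return (access_token, refresh_token)."""
    # auth() raises PixivError if the token is rejected.
    api.auth(refresh_token=refresh_token)
    return api.access_token, api.refresh_token


# Real usage (needs network access and `pip install pixivpy3`):
#   import os
#   from pixivpy3 import AppPixivAPI
#   access, refresh = login(AppPixivAPI(), os.environ["PIXIV_REFRESH_TOKEN"])
```

Keeping the token in an environment variable rather than in the script is only a suggestion; any storage that keeps it out of version control would do.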

SakiSakiSakiSakiSaki commented 8 months ago

> The script you mentioned was likely shared by others in the original thread.

I just realized that pixiv_auth.py is a third-party script from members of the community. It keeps returning the same refresh_token. Is that normal?

I keep trying print(api.auth(refresh_token=NEW_REFRESH_TOKEN)) but I get the following error:

Exception has occurred: PixivError
[ERROR] auth() failed! check refresh_token.

How else am I supposed to get a new refresh token?

Xdynix commented 8 months ago

I have no experience with the script, so I cannot comment on it.

But for me, api.auth(refresh_token=refresh_token) keeps returning the same refresh token.

SakiSakiSakiSakiSaki commented 8 months ago

> I have no experience with the script, so I cannot comment on it.
>
> But for me, api.auth(refresh_token=refresh_token) keeps returning the same refresh token.

No worries. Outside of that particular script, how are you retrieving new refresh_tokens?

Xdynix commented 8 months ago

I ... have never retrieved a refresh token... The token I'm using is a very old one, saved before #158 happened. 😛

I would probably go with the Selenium method if I had to retrieve one nowadays, or use an app like Requestly to intercept the Pixiv client's traffic.

SakiSakiSakiSakiSaki commented 8 months ago

> I ... have never retrieved a refresh token... The token I'm using is a very old one, saved before #158 happened. 😛
>
> I would probably go with the Selenium method if I had to retrieve one nowadays, or use an app like Requestly to intercept the Pixiv client's traffic.

Everything keeps returning the same refresh_token for me, I'm not sure what to do at this point.

upbit commented 8 months ago

> I ... have never retrieved a refresh token... The token I'm using is a very old one, saved before #158 happened. 😛
>
> I would probably go with the Selenium method if I had to retrieve one nowadays, or use an app like Requestly to intercept the Pixiv client's traffic.

Indeed, the refresh_token typically has a very long expiration (maybe several years?). Once you have obtained it, it generally doesn’t require any further attention. Just call api.auth(refresh_token=refresh_token).
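Since the token is long-lived, one way to follow this advice is to save it once and reload it on every run. The sketch below assumes a small JSON file as storage; the file layout and the helper names `load_refresh_token` and `save_refresh_token` are hypothetical and not part of pixivpy3.

```python
import json
from pathlib import Path


def load_refresh_token(path):
    """Return the saved refresh token, or None if nothing has been saved yet."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get("refresh_token")


def save_refresh_token(path, refresh_token):
    """Write the token to disk; return True only if the stored value changed."""
    if load_refresh_token(path) == refresh_token:
        return False  # Same token as before: nothing to update
    Path(path).write_text(json.dumps({"refresh_token": refresh_token}), encoding="utf-8")
    return True


# Possible usage with an already constructed pixivpy3 client named `api`:
#   token = load_refresh_token("pixiv_token.json")
#   api.auth(refresh_token=token)
#   save_refresh_token("pixiv_token.json", api.refresh_token)
```

With this layout, getting the same refresh_token back after auth() simply means the file is left untouched.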

> @SakiSakiSakiSakiSaki Do you know how long an appropriate sleep time is? 400 seconds doesn't seem enough.

I don’t have much experience with rate limit handling (it seems there’s a cap on the number of API calls within a certain period). You might want to open a new issue to see if anyone has experience with this.
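If the cap really is on calls per time window, one common client-side approach is to pace requests instead of sleeping only after an error. The sketch below enforces a minimum gap between calls. The thread does not state Pixiv's actual limit, so the interval is an assumption to tune, and the `Throttle` class is a hypothetical helper, not part of pixivpy3.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls (client-side pacing)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # Time of the previous call; None before the first one

    def wait(self):
        """Sleep just long enough to keep calls `min_interval` seconds apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Possible usage (the 2-second interval is a guess, not a documented limit):
#   throttle = Throttle(2.0)
#   for user_id in user_ids:
#       throttle.wait()
#       detail = api.user_detail(user_id)
```

The clock and sleep functions are injectable only so the pacing logic can be tested without real delays.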

xiyihan0 commented 7 months ago

> I keep trying print(api.auth(refresh_token=NEW_REFRESH_TOKEN)) but I get the following error:
>
> Exception has occurred: PixivError
> [ERROR] auth() failed! check refresh_token.

Are you using the newest version of PixivPy? I ran into a similar problem and had to call the external Python script (pixiv_auth.py) to refresh my ACCESS_TOKEN with my REFRESH_TOKEN. After updating PixivPy to the newest version, the api.auth method works fine in my program. I hope this helps.

xiyihan0 commented 7 months ago

> @SakiSakiSakiSakiSaki Do you know how long an appropriate sleep time is? 400 seconds doesn't seem enough.
>
> I don’t have much experience with rate limit handling (it seems there’s a cap on the number of API calls within a certain period). You might want to open a new issue to see if anyone has experience with this.

I have scraped data on millions of novels and users with this API, with 60+ accounts scraping in parallel. If you want to solve this problem, you could define a decorator like this:

import time
from functools import wraps

from pixivpy3 import PixivError  # Assumes pixivpy3 is installed


def retry_on_error(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        while True:
            try:
                result = func(*args, **kwargs)
            except PixivError as e:  # Error inside the API client, e.g. a failed network request
                print('PixivError detected: %s' % e)
                time.sleep(30)
                continue
            if 'error' not in result:
                return result
            print('Error detected: %s' % result['error'])
            if result['error']['user_message'] in ['Your access is currently restricted.', 'Page not found', 'Artist has made their work private.']:  # The target user cannot be accessed
                return result
            elif result['error']['message'] in ['Rate Limit', 'Internal Server Error']:  # Rate limited, or the service is briefly unavailable
                print('Rate limit or server error, waiting 8s before retrying')
                time.sleep(8)
            elif result['error']['message'] == 'Error occurred at the OAuth process. Please check your Access Token to fix this. Error Message: invalid_grant':  # Login credentials expired; refresh them with api.auth()
                args[0].auth()
                continue
            else:
                raise PixivError('Error detected: %s' % result['error'])  # Any other error that cannot be handled here
    return wrapper

and apply this to the original api:

    # User detail
    @retry_on_error
    def user_detail(
        self,
        user_id: int | str,
        filter: _FILTER = "for_ios",
        req_auth: bool = True,
    ) -> ParsedJson:
        url = "%s/v1/user/detail" % self.hosts
        params = {
            "user_id": user_id,
            "filter": filter,
        }
        r = self.no_auth_requests_call("GET", url, params=params, req_auth=req_auth)
        return self.parse_result(r)
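The snippet above edits the library source directly. As an alternative sketch, the same decorator could be attached from user code by building a subclass whose chosen methods are wrapped, which keeps the installed package untouched. `with_decorated_methods` is a hypothetical helper written for this illustration. Because the wrapped function is stored on the class, args[0] inside the decorator is still the API instance, which retry_on_error needs for args[0].auth().

```python
def with_decorated_methods(cls, method_names, decorator):
    """Return a subclass of `cls` whose named methods are wrapped by `decorator`."""
    overrides = {name: decorator(getattr(cls, name)) for name in method_names}
    return type("Decorated" + cls.__name__, (cls,), overrides)


# Possible usage (assumes pixivpy3 is installed and retry_on_error is defined as above):
#   from pixivpy3 import AppPixivAPI
#   RetryingAPI = with_decorated_methods(AppPixivAPI, ["user_detail"], retry_on_error)
#   api = RetryingAPI()
#   api.auth(refresh_token=REFRESH_TOKEN)
```

Methods that are not listed keep their original behavior.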

SakiSakiSakiSakiSaki commented 7 months ago

> I have scraped data on millions of novels and users with this API, with 60+ accounts scraping in parallel. If you want to solve this problem, you could define a decorator like this:

I have actually been using a retry decorator this entire time:

import time
from functools import wraps
from typing import Container, Optional, Type


def retry(num: int, retryable: Optional[Container[Type[Exception]]] = None):
    """Make a function retryable.

    :param num: Maximum number of executions.
    :param retryable: Optional collection of retryable exception classes.
    """
    def decorator(func):
        @wraps(func)
        def decorated_func(*args, **kwargs):
            error_count = 0
            backoff = 1  # Seconds to wait before the next attempt
            while True:
                try:
                    return func(*args, **kwargs)
                except Exception as ex:
                    # Re-raise immediately if this exception type is not retryable.
                    if retryable is not None and ex.__class__ not in retryable:
                        raise
                    error_count += 1
                    if error_count >= num:
                        raise
                    time.sleep(backoff)
                    backoff *= 2  # Exponential backoff

        return decorated_func

    return decorator

@retry(10)
def user_detail_with_retry(api, user_id):
    return api.user_detail(user_id)


# Excerpt from a retry loop; api, user_id, config, auto_refresh_token
# and get_new_tokens() are defined elsewhere.
while True:
    user_detail = user_detail_with_retry(api, user_id)
    if "error" not in user_detail:
        break
    if "OAuth" in user_detail["error"]["message"]:
        auto_refresh_token = get_new_tokens(api, auto_refresh_token, config)
    if "Rate Limit" in user_detail["error"]["message"]:
        auto_refresh_token = get_new_tokens(api, auto_refresh_token, config)
        time.sleep(200)
    elif any(message in user_detail["error"]["user_message"] for message in ("Your access is currently restricted.", "The creator has limited who can view this content")):
        break

How does your decorator differ from mine?
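One difference that can be read off the two snippets: retry() reacts only to raised exceptions, while failures such as "Rate Limit" come back as an ordinary return value containing an "error" key, which retry_on_error inspects inside the decorator. A sketch that combines both ideas (a bounded number of attempts with exponential backoff, triggered by the returned error) might look like this. `retry_on_result`, `is_rate_limited`, and the default delays are hypothetical and chosen only for illustration.

```python
import time
from functools import wraps


def retry_on_result(should_retry, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Retry while `should_retry(result)` is true, doubling the delay each time."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_attempts + 1):
                result = func(*args, **kwargs)
                if not should_retry(result) or attempt == max_attempts:
                    return result  # Success, or out of attempts: hand back the last result
                sleep(delay)
                delay = min(delay * 2, max_delay)
        return wrapper
    return decorator


def is_rate_limited(result):
    """True for a response shaped like {'error': {'message': 'Rate Limit', ...}}."""
    return "error" in result and result["error"].get("message") == "Rate Limit"


# Possible usage with an authenticated pixivpy3 client named `api`:
#   @retry_on_result(is_rate_limited, max_attempts=6, base_delay=8.0)
#   def fetch_user(api, user_id):
#       return api.user_detail(user_id)
```

Unlike a `while True` loop, this gives up after `max_attempts` and returns the last error response so the caller can decide what to do next.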

> with 60+ accounts scraping in parallel.

Do you mean rotating accounts within the same script, or running multiple scripts asynchronously?
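To make the first option concrete, here is a sketch of rotating accounts inside one script by cycling through already authenticated clients. The thread does not say which approach was actually used, so this is only an illustration; `rotate_calls` and its parameters are hypothetical names.

```python
from itertools import cycle


def rotate_calls(clients, user_ids, fetch):
    """Call fetch(client, user_id) for each id, cycling through clients in order."""
    if not clients:
        raise ValueError("at least one client is required")
    pool = cycle(clients)
    return [fetch(next(pool), user_id) for user_id in user_ids]


# Possible usage, where each entry in `apis` is a logged-in pixivpy3 client:
#   results = rotate_calls(apis, user_ids, lambda api, uid: api.user_detail(uid))
```

Round-robin spreads consecutive requests across accounts; running separate scripts or threads per account would be the other option raised in the question.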