Ristellise closed this 5 years ago
You are right. There are some limits on both PAPI and AAPI. This is probably because both APIs are meant for mobile apps, and the web version doesn't use them.
I decided to run an experiment... It seems I can scrape Pixiv directly (via their website) and get past the normal public limit of 20K illusts.
@Ristellise do you still need this API to authenticate? I'm curious about your implementation, as I'm facing the same problem, though in my case I need to log in. In fact, I just learned Python a few days ago in order to use this API.
EDIT: As of 1/11/2019, it doesn't work anymore. You can still hack your way through using a Python-driven web browser, though.
You still need to authenticate/sign in to unlock all the images (that is, a regular user searching on Pixiv without signing in will not be able to see some content).
But at this point you don't need PixivPy to actually search for all the images.
Below is the Python script for authenticating as a signed-in user, stripped down to its bare essentials.
It does not support a reCAPTCHA response, so if you're forced through a reCAPTCHA... sorry.
import requests
from bs4 import BeautifulSoup  # the "html5lib" parser also needs the html5lib package


class loginManager:
    def __init__(self, **kwargs):
        self.username = kwargs.get("username")
        self.password = kwargs.get("passw")
        self.logintoken = None
        self.session = None

    def doLogin(self):
        loginsession = requests.Session()
        # Fetch the login page; post_key is read from the first <input> on it.
        login = loginsession.get("https://accounts.pixiv.net/login")
        loginhtml = BeautifulSoup(login.text, "html5lib")
        data = {'pixiv_id': self.username, 'password': self.password, 'captcha': '', 'g_recaptcha_response': '',
                'return_to': 'https://www.pixiv.net', 'lang': 'en', 'post_key': loginhtml.input['value'],
                'source': "accounts", 'ref': ''}
        url = "https://accounts.pixiv.net/api/login?lang=en"
        response = loginsession.post(url, data=data)
        print(response.text)  # debug output of the raw login response
        respj = response.json()
        # Only keep the session if the login response reports success.
        if not respj['error']:
            if respj['body'] == {'success': {'return_to': 'https://www.pixiv.net'}}:
                self.logintoken = response.cookies.get('PHPSESSID')
                self.session = loginsession

    def getSession(self):
        # Log in lazily on first use.
        if self.session is None:
            self.doLogin()
        return self.session

    def getLogin(self):
        # Log in lazily on first use.
        if self.logintoken is None:
            self.doLogin()
        return self.logintoken
Thanks
The latest Pixiv website update breaks the above code; it now requires a reCAPTCHA for everyone.
I wondered why my collections were only reaching about 20K... so I decided to experiment a bit.
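Since the scripted login is blocked, one possible workaround is to sign in with a normal browser and reuse that session's `PHPSESSID` cookie (the same cookie the script above stores as the login token). Below is a minimal sketch using only the standard library. The helper name `build_authenticated_request` and the `User-Agent` value are made up for illustration, and whether Pixiv accepts a copied cookie is an assumption, not something confirmed in this thread.

```python
import urllib.request


def build_authenticated_request(url: str, phpsessid: str) -> urllib.request.Request:
    """Build a request carrying a PHPSESSID copied from a logged-in browser.

    Hypothetical helper: the site may reject or expire the cookie at any time.
    """
    headers = {
        "Cookie": f"PHPSESSID={phpsessid}",
        # Assumed browser-like User-Agent; not taken from the thread.
        "User-Agent": "Mozilla/5.0",
    }
    return urllib.request.Request(url, headers=headers)


# Usage (performs a real network call, so it is left commented out):
# req = build_authenticated_request("https://www.pixiv.net", "your-cookie-value")
# with urllib.request.urlopen(req) as resp:
#     html = resp.read().decode("utf-8", errors="replace")
```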
Public API Search limit is ~20K
App API Search limit: Offset 5000
In any case, how does the Pixiv website work, since it can clearly get past the limits imposed by the API?
To test APPAPI Limits:
Where
APPAPI = AppPixivAPI
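For anyone who wants to reproduce the offset limit, here is a rough sketch of how it could be probed. It is not the script originally used here. `find_offset_limit` is a hypothetical helper that takes any callable returning one page of results for a given offset, so it can be tried without touching the network. The commented usage assumes pixivpy's `search_illust(word, offset=...)` returns a dict-like result with an `illusts` key; treat that as an assumption too.

```python
from typing import Callable, Optional, Sequence


def find_offset_limit(
    fetch_page: Callable[[int], Sequence],
    step: int = 30,
    max_offset: int = 10_000,
) -> Optional[int]:
    """Return the first probed offset at which fetch_page yields nothing.

    Returns None if every probed offset up to max_offset still has results.
    """
    offset = 0
    while offset <= max_offset:
        if not fetch_page(offset):
            return offset
        offset += step
    return None


# Possible usage with pixivpy (needs an authenticated client and network access):
# from pixivpy3 import AppPixivAPI
# APPAPI = AppPixivAPI()
# def fetch_page(offset):
#     result = APPAPI.search_illust("some tag", offset=offset)
#     return result.get("illusts") or []
# print(find_offset_limit(fetch_page))
```

The step only controls how coarsely offsets are probed; a smaller step pins the limit down more precisely at the cost of more requests.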
Public API:
Result is:
srch.pagination.pages X per_page != total
EDIT: This might be because non-premium accounts can only access 1000 pages, since the error message from PAPI is:
'1000ページまでしか取得できません。' ("Only up to 1000 pages can be retrieved.")
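To make the mismatch concrete, here is a small sketch of the arithmetic under the 1000-page cap. The function name `reachable_results` and the numbers in the usage note are illustrative assumptions, not values measured in this thread.

```python
def reachable_results(total: int, per_page: int, max_pages: int = 1000) -> int:
    """Number of results that can be fetched when only max_pages pages are served."""
    pages_needed = -(-total // per_page)  # ceiling division
    pages_served = min(pages_needed, max_pages)
    return min(total, pages_served * per_page)
```

For example, with a hypothetical total of 50,000 matches and 20 results per page, only 1000 × 20 = 20,000 results would be reachable, which is one way pages × per_page could end up smaller than total.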