Closed Cyux07 closed 8 years ago
Hmm, I can't reproduce this error. What were you doing when you ran into it?
I used Chrome's F12 console to find this URL. Then I packed the headers ('Host', 'Referer', 'Origin', 'User-Agent') and the data ('pixiv_id', 'password', 'post_key', 'source'), created a requests.session(), and posted it.
For that matter, I can't even open the URL myself by clicking it. Isn't that weird?
So you're just asking a general question, there is no problem with this project, right?
This code works on my computer, hope this helps.
import requests

s = requests.Session()
data = {'pixiv_id': 'xx',
        'password': 'xx',
        'captcha': '',
        'g_recaptcha_response': '',
        'post_key': 'xx',
        'source': 'pc'}
s.post('https://accounts.pixiv.net/api/login?lang=zh', data=data)
r = s.get('https://www.pixiv.net')
print(r.text)
Eh, yes, you can visit pixiv whether you are logged in or not, just with a lot of restrictions if you have not logged in successfully.
This statement
s.post('https://accounts.pixiv.net/api/login?lang=zh', data=data)
returns a 400 response.
It should return 200; I've tested it on my computer.
Maybe your post_key is incorrect? You need to extract it from the login page's HTML source code.
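The extraction step described above can be sketched as a small helper. This is only an illustration: the function name `extract_post_key` and the sample HTML are made up, and the pattern assumes the hidden field is written as `name="post_key" value="..."` with an alphanumeric value.

```python
import re

# Assumption: the login form contains a hidden field written like
#   <input type="hidden" name="post_key" value="abc123">
# with the attributes in this order and an alphanumeric value.
POST_KEY_RE = re.compile(r'name="post_key"\s+value="(\w+)"')


def extract_post_key(html):
    """Return the post_key found in the login page HTML, or None if absent."""
    match = POST_KEY_RE.search(html)
    return match.group(1) if match else None


# Made-up sample markup, not copied from the real login page.
sample = '<form><input type="hidden" name="post_key" value="abc123"></form>'
print(extract_post_key(sample))
print(extract_post_key('<form></form>'))
```

Returning None instead of raising lets the caller decide what to do when the page layout changes and the field is no longer found.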
self.login_header = {
    'Host': 'accounts.pixiv.net',
    'Origin': 'https://accounts.pixiv.net',
    'Referer': 'https://accounts.pixiv.net/login?lang=zh&source=pc&view_type=page&ref=wwwtop_accounts_index',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36',
    'X-Requested-With': 'XMLHttpRequest',
    'Upgrade-Insecure-Requests': '1',
}
s = requests.session()
login_data = {'pixiv_id': pixivId,
              'password': password,
              'captcha': '',
              'g_recaptcha_response': '',
              'post_key': postKey,
              'source': source}
r = s.post(self.login_url, data=login_data, headers=self.login_header)
print(r)
The console shows <Response [400]>
Your code also returns <Response [200]>
If the requests session is already logged in and you post the login information again, you'll get a 400. I guess this is the cause of the problem.
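If a repeated login post is the cause, one way to rule it out is to make sure the form is posted at most once per session. The sketch below is an assumption-laden illustration: `LoginGuard`, `FakeSession`, and `FakeResponse` are hypothetical names, the fake session only imitates the "200 first, 400 afterwards" behaviour reported above, and treating a 200 status as a successful login is itself an assumption.

```python
class FakeResponse:
    """Minimal stand-in for a response object with a status code."""

    def __init__(self, status_code):
        self.status_code = status_code


class FakeSession:
    """Stand-in for requests.Session: 200 on the first post, 400 on repeats."""

    def __init__(self):
        self.posts = 0

    def post(self, url, data=None):
        self.posts += 1
        return FakeResponse(200 if self.posts == 1 else 400)


class LoginGuard:
    """Wrap a session-like object so the login form is posted at most once."""

    def __init__(self, session, login_url):
        self.session = session
        self.login_url = login_url
        self.logged_in = False

    def login(self, data):
        if self.logged_in:
            return None  # already logged in: skip the duplicate post
        response = self.session.post(self.login_url, data=data)
        # Assumption: a 200 status means the login succeeded.
        self.logged_in = response.status_code == 200
        return response


guard = LoginGuard(FakeSession(), 'https://accounts.pixiv.net/api/login?lang=zh')
first = guard.login({'pixiv_id': 'xx', 'password': 'xx'})
second = guard.login({'pixiv_id': 'xx', 'password': 'xx'})
print(first.status_code, second)
```

With a real `requests.Session` in place of `FakeSession`, the same wrapper would keep a second call to `login` from reaching the server.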
But... how? The session is a new instance every time I restart the program.
I can't understand what you're saying; you can write in Chinese. Both pieces of code work properly, so I don't know how you are getting the 400 error.
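OK... What I mean is that every run of this program creates a new session instance, so it can't already be logged in. Besides, I have never logged in successfully before. Have you tried checking some specific element of the page to determine whether you are in a "logged-in" state? (For example, on www.pixiv.net/search.php?word=overwatch the text at the top (the meta description) differs between the logged-in and logged-out states, and when logged out you can't see bookmark counts and only get 10 pages.)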
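The check suggested above (comparing the meta description of a page in the two states) could be sketched with the standard library's HTML parser. The helper name `meta_description` and both sample pages are hypothetical; the actual text pixiv serves in each state is not given in this thread, so the placeholders below only stand in for it.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collect the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != 'meta' or self.description is not None:
            return
        attr_map = dict(attrs)
        if (attr_map.get('name') or '').lower() == 'description':
            self.description = attr_map.get('content')


def meta_description(html):
    """Return the page's meta description, or None if there is none."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description


# Placeholder pages: the real descriptions are unknown here.
logged_out_page = '<html><head><meta name="description" content="PLACEHOLDER logged-out text"></head></html>'
logged_in_page = '<html><head><meta name="description" content="PLACEHOLDER logged-in text"></head></html>'
print(meta_description(logged_out_page) != meta_description(logged_in_page))
```

In practice one would record the description seen when logged out and treat any different value as a hint, not proof, of a logged-in session.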
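Have you run the example code I gave? Its output is the HTML source of the Pixiv home page in the logged-in state.
s.post returns 200 the first time it runs and 400 if you post again. That is the only cause of a 400 I can think of...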
I tried it, and it shows the sign-up page (the same as visiting the main site when not logged in). Could I see your complete code?
import re
import requests
s = requests.Session()
r = s.get('https://accounts.pixiv.net/login?lang=zh&source=pc&view_type=page&ref=wwwtop_accounts_index')
post_key = re.search(r'name="post_key" value="(\w+)"', r.text).group(1)
data = {'pixiv_id': 'xxx',
'password': 'xxx',
'captcha': '',
'g_recaptcha_response': '',
'post_key': post_key,
'source': 'pc'}
s.post('https://accounts.pixiv.net/api/login?lang=zh', data=data)
r = s.get('https://www.pixiv.net')
print(r.text)
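What exactly is post_key for? I had simply stored a fixed one, the same way I did with the cookies. Thank you very much for the idea of fetching the key. Then I tried removing the headers and cookies, and it worked. (!!!) Why does it work without them??? Don't most sites rely on headers to block crawlers?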
Maybe your post_key is incorrect? You need to extract it from the login page's HTML source code.
post_key is used to prevent CSRF.
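To illustrate why a hard-coded post_key can fail while a freshly extracted one works, here is a toy in-memory model of a per-session CSRF token. It is not Pixiv's implementation, which this thread does not describe; `ToyLoginServer` and its methods are hypothetical, and the assumption is simply that the server issues one key per session and rejects any other value.

```python
import secrets


class ToyLoginServer:
    """Toy model: one post_key is issued per session and checked on login."""

    def __init__(self):
        self._keys = {}  # session id -> post_key issued to that session

    def get_login_page(self, session_id):
        """Issue a fresh post_key, as if embedding it in the login form."""
        key = secrets.token_hex(16)
        self._keys[session_id] = key
        return key

    def post_login(self, session_id, post_key):
        """Return 200 if the key matches the one issued to this session, else 400."""
        expected = self._keys.get(session_id)
        if expected is None or not secrets.compare_digest(expected, post_key):
            return 400
        return 200


server = ToyLoginServer()
fresh_key = server.get_login_page('session-a')
print(server.post_login('session-a', fresh_key))      # 200
print(server.post_login('session-a', 'old-key-123'))  # 400
```

Under this model, a key copied from an earlier browser visit belongs to a different session, so posting it with a new session is rejected, which would be consistent with the fix found above.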
Take a look at this when you have time: https://github.com/FredWe/How-To-Ask-Questions-The-Smart-Way/blob/master/README-zh_CN.md
Lesson learned! It seems I still need to study how cookies work. May good luck always be with you : )
Now clicking 'login' requests the URL 'https://accounts.pixiv.net/api/login?lang=zh', and a 400 error occurs. Could you release a new version or give me some guidance, please? Thanks!