dataabc / weiboSpider

Sina Weibo crawler: scrape Sina Weibo data with Python
8.15k stars 1.95k forks

Original image URLs cannot be accessed (403 Forbidden) #503

Closed jerrylaikr closed 1 year ago

jerrylaikr commented 1 year ago

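The original image URLs returned by the crawler cannot be accessed directly; the server responds with 403 Forbidden. Entering an image URL in the browser's address bar gives this result: [image]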

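The problem appeared roughly within the last month; before that, the images displayed normally. Previously, loading many scraped images at once would trigger access restrictions, but with a small number of images there was basically no problem. Entering an image URL in the browser's address bar also used to work. During secondary development I found that access was restricted unless the referer was disabled. Now neither disabling the referer nor entering the URL directly in the address bar works.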

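I suspect this is related to the referer policy. Take https://wx3.sinaimg.cn/mw690/00729TsFly1h9r9dzz99dj30s10yd411.jpg as an example: when it is opened from the address bar, the request headers contain no referer and the result is a 403 error. When the image is loaded through the Weibo web client, the request headers include referer: https://weibo.com/ and the image loads normally.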
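To make the comparison above reproducible, here is a minimal sketch using only the Python standard library. It is not weiboSpider code: the helper names `build_image_request` and `fetch_status` and the User-Agent string are assumptions made for illustration. Whether the image server still accepts a request carrying this referer is exactly the open question in this issue, so treat it as a diagnostic, not a fix.

```python
import urllib.request
from urllib.error import HTTPError

# Referer value observed in the Weibo web client's request headers (from the report above).
WEIBO_REFERER = "https://weibo.com/"


def build_image_request(url, referer=WEIBO_REFERER):
    """Build a GET request for an image URL, optionally with a Referer header.

    Hypothetical helper for illustration; pass referer=None to mimic
    typing the URL into the browser's address bar.
    """
    headers = {"User-Agent": "Mozilla/5.0"}  # assumed generic browser-like value
    if referer is not None:
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)


def fetch_status(url, referer=WEIBO_REFERER, timeout=10):
    """Return the HTTP status code the server gives for the image URL."""
    request = build_image_request(url, referer)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still what we want to compare.
        return err.code
```

Calling `fetch_status(url, referer=None)` and then `fetch_status(url)` on the example URL lets the two cases be compared directly: the first sends no Referer header, the second sends `Referer: https://weibo.com/`.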

dataabc commented 1 year ago

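Thanks for the feedback. This may well be caused by a change on Weibo's side. There is no solution yet; I will take another look later.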

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 1 year ago

Closing as stale, please reopen if you'd like to work on this further.