-
Hello!
I hope you're doing well. I have a feature request that I believe would enhance the usability of your project: supporting proxy settings as a dictionary. Currently, when processing a proxy, …
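For illustration, here is a minimal sketch of what accepting both forms could look like. The helper name `normalize_proxies` and the allowed keys `http`/`https` are assumptions made for this example (a requests-style mapping), not part of the project's existing API.

```python
from typing import Mapping, Optional, Union

# Hypothetical type alias: a single proxy URL, a scheme -> URL mapping, or nothing.
ProxySetting = Union[str, Mapping[str, str], None]

# Assumed set of accepted keys (requests-style); adjust to what the handler supports.
ALLOWED_SCHEMES = {"http", "https"}


def normalize_proxies(proxy: ProxySetting) -> Optional[dict]:
    """Return a scheme -> URL dict from either a string or a dict proxy setting."""
    if proxy is None:
        return None
    if isinstance(proxy, str):
        # A single URL is applied to both schemes.
        return {"http": proxy, "https": proxy}
    if isinstance(proxy, Mapping):
        unknown = set(proxy) - ALLOWED_SCHEMES
        if unknown:
            raise ValueError(f"unsupported proxy keys: {sorted(unknown)}")
        return dict(proxy)
    raise TypeError(f"proxy must be str or mapping, got {type(proxy).__name__}")
```

With a helper like this, a string such as `"http://127.0.0.1:8080"` and a dictionary such as `{"https": "http://127.0.0.1:8080"}` would both end up in the same shape before being passed on.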
-
The user-agent is the default "Scrapy/{Scrapy version} (+https://scrapy.org)", which makes it very easy to detect that it's a crawler.
The upstream [curl_cffi](https://github.com/yifeikong/curl_cffi) sets the appr…
-
Hi,
I apologize, I know this is probably the wrong place to post this, but I have the following issue: After implementing this in my spider, my average number of crawled pages per minute is now about…
-
Hello, could you please let us know whether you plan to add support for newer browser versions (Chrome 120+) in your future updates?
-
Let's take this website as an example:
afdb.org/en/
Standard requests get a 200 status, but file downloads get a 403 because the media pipeline is not using the impersonate handler.
-
### Current Behavior
When I use DO instances with my project, I get a 407 error.
I do not get the same error when using IPRoyal proxies with scrapoxy.
I use `curl_cffi` to emulate a browser's T…
-
**Describe the bug**
Hello,
I have set up a starlette webserver that accepts a request and takes information from the headers to make a curl request. I started with the original curl-impersonate bu…
-
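**Bug description**
I took the source code of the scrapy_fingerprint package, made some modifications, and added it to my Scrapy project. Its core is built on AsyncSession. When I run it with concurrency enabled, memory usage keeps growing: in my test, with a concurrency of 10, memory surged to about 1 GB after 1 minute of running, which causes the container to restart repeatedly. Could you check whether there is a solution?
**Full download handler implementation**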
```py
import …
-
```
File "/home/neiellcare/.cache/pypoetry/virtualenvs/rgdl-spiders--SESLHN9-py3.11/lib/python3.11/site-packages/scrapy_impersonate/handler.py", line 44, in _download_request
response = await se…
-
(Placeholder issue for msc thesis)
**Problem statement**
(Musicians/artists/creators) receive low compensation for publishing their content.
* [Typical cut of revenue:](https://www.theguardian…
Tim-W updated
3 years ago