-
Installed using conda on a Windows 11 machine. Working through the tutorial, and
response.css("title")
gives this error:
File "src\\lxml\\parser.pxi", line 1806, in lxml.etree.HTMLParser.__i…
-
I'm setting the headers the following way:
```python
headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'cache-control': 'no-cache',
    ...
}…
-
C:\Users\33721\Desktop\weibo-search-master>scrapy crawl search -s JOBDIR=crawls/search
2024-05-18 12:51:09 [scrapy.core.scraper] ERROR: Spider error processing (referer: https://s.weibo.com/weibo?…
-
I just created an example spider.
Chromium works well, but with the setup below it raises `NS_ERROR_PROXY_CONNECTION_REFUSED` from `playwright._impl._errors.Error: Page.goto: NS_ERROR_PROXY_CONNECTI…
-
### Description
On runs with the default value of the `DOWNLOAD_DELAY` setting (0), the request sending rate is limited only by CPU capabilities until the number of sent requests reaches the value of `CONCURRENT_REQUEST…
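
For context, `DOWNLOAD_DELAY` is normally set in a project's `settings.py`. The fragment below is a minimal sketch of a non-zero delay, not something taken from the report: the values are illustrative assumptions, and the comment about the default only restates what the description above says.

```python
# settings.py (illustrative values, assumed for this sketch)

# The report above describes the default as 0, meaning no wait between
# consecutive requests to the same site. A positive value adds a pause.
DOWNLOAD_DELAY = 0.5  # seconds

# When enabled, Scrapy waits a random interval between 0.5x and 1.5x
# DOWNLOAD_DELAY instead of a fixed one.
RANDOMIZE_DOWNLOAD_DELAY = True
```

With a delay like this, the sending rate is bounded by the pause between requests rather than by how fast the CPU can schedule them.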
-
## Cookie
### Persisting cookies
- [scrapy-cookies](https://github.com/grammy-jiang/scrapy-cookies)
>A middleware of cookies persistence for Scrapy https://scrapy-cookies.readthedocs.io
-
Without a proxy, the cookie is applied correctly. But when I use a proxy (brightdata), the cookie is not applied. Did I miss anything?
```
class ScrapyTest(scrapy.Spider):
    name = 'scrapy test'
…
-
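## Understanding Scrapy
Scrapy is an application framework for crawling websites and extracting structured data, which can be used for a wide range of useful applications such as data mining, information processing, or historical archival.
#### Creating a project
Run `scrapy startproject tutorial` in cmd to create a new project.
This creates a tutorial directory:
tutorial/
scrapy.cfg deployment config…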
-
I get this error:
```
from scrapy.item import BaseItem
ImportError: cannot import name 'BaseItem' from 'scrapy.item' (/.virtualenvs/pyglobalenv/lib/python3.12/site-packages/scrapy/item.py)
``…
-
### Brand name
ODStore
### Wikidata ID
Q130492509
### Store finder url(s)
https://odstore.it/dove-siamo/
### Sample store page url
_No response_
### Countries?
IT
### Dif…