-
> Versions: lxml 5.2.1.0, libxml2 2.11.7, cssselect 1.2.0, parsel 1.9.1, w3lib 2.1.2, Twisted 24.3.0, Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)], pyOpenSSL…
-
I have an unusual spider that builds its allowed domains like this:
```python
@property
def allowed_domains(self):
    # 2nd domain generated dynamically
    return ['domain1com', self.gener…
-
Scheduled jobs are not run in FIFO+priority order.
Instead, there are multiple queues
that are also arranged in a queue-like fashion
but not round-robin or anything,
just an "arbitrary but constan…
-
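Hello, your project is supposed to use one thread per video for downloading, but for each video an exception is raised after only part of the binary data has been fetched. Is there a good way to download every video completely? Part of the exception output is below: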
```
Exception in thread Thread-47:
Traceback (most recent call last):
File "/System/Library/Frameworks/Pyth…
-
I ran into an interesting failure while unit testing the `process_spider_exception` method of a spider middleware:
In my project, this method returns an iterable (list) of request objects, wh…
-
mabel@MabeldeMacBook-Pro download_git_issues-main % python3 main.py issues
Start crawling issues list: https://github.com/dotnet/runtime/issues
Traceback (most recent call last):
File "/Library/Frameworks/P…
-
Hello.
For some reason this anime cannot be downloaded.
Please look into this bug.
All other anime download without problems.
name: https://www.crunchyroll.com/es/so-im-a-spider-so-what
ERROR:
Booting up...
N…
-
I ran `scrapy crawl fb -a email="barackobama@gmail.com" -a password="10wnyu31" -a page="DonaldTrump" -a date="2018-01-01" -a lang="it" -o Trump.csv` from the command line, but it didn't work.
It fails with this error: " File "/Use…
-
**Describe the problem**
Traceback (most recent call last):
File "test.py", line 8, in <module>
pprint(spider.search_news(query="卡塔尔", pn=2).plain)
File "/usr/local/lib/python3.8/dist-packages/baiduspider/__init…
-
Several Scrapinghub API endpoints accept or return timestamps, currently as UNIX timestamps in milliseconds.
It would be great to have those values as `datetime.datetime` objects in the results so t…
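To illustrate the kind of conversion being requested, here is a minimal sketch using only the standard library. The helper names `ms_to_datetime` and `datetime_to_ms` are hypothetical and are not part of any Scrapinghub client. The sketch assumes the timestamps are milliseconds since the UNIX epoch in UTC.

```python
from datetime import datetime, timedelta, timezone

# Reference point for UNIX timestamps, as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms: int) -> datetime:
    """Convert a UNIX timestamp in milliseconds to an aware UTC datetime.

    Hypothetical helper; timedelta arithmetic avoids float rounding.
    """
    return EPOCH + timedelta(milliseconds=ms)


def datetime_to_ms(dt: datetime) -> int:
    """Convert an aware datetime back to a UNIX timestamp in milliseconds.

    Hypothetical helper; a naive datetime raises TypeError here, which
    makes a missing timezone visible instead of silently guessing one.
    """
    return (dt - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    started = ms_to_datetime(1514764800000)
    print(started.isoformat())
    print(datetime_to_ms(started))
```

Returning timezone-aware values keeps the round trip unambiguous. Whether a client library should return aware or naive datetimes is a design choice this sketch does not settle.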