-
This is a usability issue, though I'm not sure it's a valid one. It is based on a real case we discovered with @whalebot-helmsman.
Consider a spider which crawls a large list of URLs, and in it…
-
Currently the output of a spider log looks like this:
```python
>>> spider.logger.warning("test")
2018-03-10 13:42:56 [spider_name_goes_here] WARNING: test
```
The problem with this is that th…
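For context, the logger name shown in brackets comes from the spider's name (Scrapy's `Spider.logger` wraps `logging.getLogger(self.name)` in a `LoggerAdapter`). A minimal stdlib sketch of the same mechanism — the spider name and format string here are illustrative, not Scrapy's exact defaults:

```python
import logging

# Scrapy derives the spider's logger from the spider name, roughly like this:
spider_name = "spider_name_goes_here"  # hypothetical spider name
logger = logging.getLogger(spider_name)

# The record's %(name)s field is what appears between the timestamp and the
# level, e.g. "2018-03-10 13:42:56 [spider_name_goes_here] WARNING: test"
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s [%(name)s] %(levelname)s: %(message)s")
)
logger.addHandler(handler)
logger.warning("test")
```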
-
**scrapy.core.downloader.handlers.DownloadHandlers** gets the handler via the scheme parsed from `request.url`:
```python
def download_request(self, request, spider):
    scheme = urlparse_cached(request)…
```
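A minimal sketch of the scheme-based dispatch described above, using stdlib `urllib.parse` in place of Scrapy's cached `urlparse_cached`; the handler names in the mapping are placeholders, not the real handler objects:

```python
from urllib.parse import urlparse

# Placeholder handlers keyed by URL scheme, mirroring how DownloadHandlers
# selects a handler from the scheme of request.url.
handlers = {
    "http": "HTTPDownloadHandler",
    "https": "HTTPDownloadHandler",
    "ftp": "FTPDownloadHandler",
    "file": "FileDownloadHandler",
}

def handler_for(url):
    # Parse the scheme out of the URL and look up the matching handler.
    scheme = urlparse(url).scheme
    try:
        return handlers[scheme]
    except KeyError:
        raise ValueError(f"unsupported URL scheme: {scheme!r}")

print(handler_for("https://example.com/page"))  # HTTPDownloadHandler
```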
-
I get a "too many connections" error when running `python3 execute_spider.py -d -site_id xxx`.
The error goes away if I close the MySQL connection and restart.
The error is suspected to come from too many unclosed …
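If the cause is indeed connections opened per request and never closed, the usual fix is to scope each connection with a context manager or try/finally. A sketch of that pattern, using stdlib `sqlite3` as a stand-in for the MySQL client (the real code would have the same shape around e.g. a `pymysql.connect` call; the table and function names are hypothetical):

```python
import sqlite3
from contextlib import closing

def save_item(db_path, site_id, url):
    # closing() guarantees conn.close() runs even if the insert raises,
    # so connections cannot pile up across many crawled pages.
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS pages (site_id TEXT, url TEXT)"
            )
            conn.execute("INSERT INTO pages VALUES (?, ?)", (site_id, url))

save_item(":memory:", "xxx", "https://example.com")
```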
-
Using Scrapy 1.5.0
I took a look at the FAQ section and found nothing relevant.
Same for issues with the keyword `KeyError` on GitHub, Reddit, or Google Groups.
As you can see below, it seems t…
-
I have given the following in my Scrapy settings.py file:
```python
RABBITMQ_CONNECTION_PARAMETERS = {'host': 'amqp://username:password@rabbitmqserver', 'port': 5672}
```
But I am getting the following error:
…
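Note that the `'host'` value above is a full AMQP URL while `'port'` is given separately; that mismatch is a plausible source of the error. A stdlib sketch of splitting such a URL into the discrete fields a connection-parameter dict typically expects (client-library specifics vary, so this covers only the parsing step):

```python
from urllib.parse import urlparse

def split_amqp_url(url):
    # Break an amqp:// URL into the separate pieces most RabbitMQ client
    # libraries expect as individual connection parameters.
    parts = urlparse(url)
    return {
        "host": parts.hostname,
        "port": parts.port or 5672,  # default AMQP port when none is given
        "username": parts.username,
        "password": parts.password,
    }

print(split_amqp_url("amqp://username:password@rabbitmqserver"))
```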
-
Is it possible to crawl results from the current day back to a specified day, via scheduled execution? I want to crawl the latest content.
-
My team is working on a set of scrapy spiders which we want to deploy to a scrapyd server. Our scrapyd server is configured to use an oauth2 proxy to authenticate traffic.
On all of our requests to o…
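A sketch of what attaching the proxy's token to a scrapyd API call might look like, using stdlib `urllib.request`; the server URL, project/spider names, and bearer-token header scheme are assumptions about the setup described above:

```python
from urllib.parse import urlencode
from urllib.request import Request

def schedule_request(base_url, project, spider, token):
    # Build (but do not send) a scrapyd schedule.json request that carries
    # the OAuth2 bearer token the proxy expects on every call.
    data = urlencode({"project": project, "spider": spider}).encode()
    return Request(
        f"{base_url}/schedule.json",
        data=data,
        headers={"Authorization": f"Bearer {token}"},
    )

req = schedule_request(
    "http://scrapyd.example.internal:6800", "myproject", "myspider", "TOKEN"
)
print(req.full_url, req.get_header("Authorization"))
```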
-
After setting HTTPCACHE_ENABLED = True I find that the cached files are stored separately per spider name, so the same web page is still re-crawled by another spider. That makes this feature a dup of duplicate url fi…
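For context on why pages are re-crawled per spider: Scrapy's filesystem cache storage uses the spider name as a path component in the cache layout. A rough sketch of that layout — the fingerprint here is a simplified SHA-1 of the URL, an assumption, not Scrapy's exact request fingerprint:

```python
import hashlib
import os

def cache_path(cachedir, spider_name, url):
    # The spider name is a directory component, so two spiders fetching the
    # same URL get two distinct cache entries.
    fp = hashlib.sha1(url.encode()).hexdigest()
    return os.path.join(cachedir, spider_name, fp[:2], fp)

a = cache_path(".scrapy/httpcache", "spider_a", "https://example.com/")
b = cache_path(".scrapy/httpcache", "spider_b", "https://example.com/")
print(a == b)  # False: same URL, different spiders, different cache entries
```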
-
**Describe the bug**
When the asynchronous TWISTED_REACTOR is enabled, deployment fails with an error.
**Traceback**
```
Traceback (most recent call last):
  File "D:\anaconda\envs\scrapy\lib\site-packages\twisted\web\http.py", line 2369, in …
```
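For reference, the setting presumably being enabled is the asyncio reactor in settings.py; a config sketch (the deployment failure itself is in the truncated traceback above):

```python
# settings.py: opt in to the asyncio-based Twisted reactor
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```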