-
Scrapyd has a basic web interface. It would be useful to have something here too, for easy access.
Granted, you can already inspect everything with `docker` or `kubectl`, but a basic web interface st…
-
We already have `listprojects`, `listspiders`, `schedule`, `addversion`.
https://scrapyd.readthedocs.io/en/stable/api.html
Idempotent:
* [ ] daemonstatus
* [ ] listversions
* [ ] listjobs
…
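For reference, a minimal sketch of calling one of these idempotent endpoints with `requests` (the project name `myproject` is a placeholder):
```
import requests

# listjobs.json is idempotent: a plain GET that only reports state.
resp = requests.get(
    "http://localhost:6800/listjobs.json",
    params={"project": "myproject"},  # placeholder project name
)
resp.raise_for_status()
data = resp.json()
print(data["status"], len(data.get("running", [])), "jobs running")
```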
-
I want to set the priority for my spider.
I can do this when calling scrapyd directly:
curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d priority=1
How can I do the same with python-…
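For reference, the raw HTTP equivalent in Python would be a sketch like this, reusing the placeholder names from the curl command above:
```
import requests

# Same form fields as the curl example; scrapyd's schedule.json accepts
# "priority" and passes it to the spider queue.
resp = requests.post(
    "http://localhost:6800/schedule.json",
    data={"project": "myproject", "spider": "somespider", "priority": 1},
)
print(resp.json())  # e.g. {"status": "ok", "jobid": "..."}
```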
-
Hopefully it's temporary, as we need their data :)
Sentry Issue: [REGISTRY-KINGFISHER-COLLECT-1](https://open-contracting-partnership.sentry.io/issues/2823958328/?referrer=github_integration)
```
Ga…
-
Regarding the distributed setup, here is what I propose. For this setup, we will need scrapyd, RabbitMQ, and a distributed file system (HDFS/SeaweedFS).
(1) Adding nodes: whatever node we want to add, we …
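To make (1) concrete, here is a minimal sketch of a node publishing a crawl job to RabbitMQ with `pika`; the `crawl_jobs` queue name, broker address, and message shape are assumptions, not a settled design:
```
import json
import pika

# Connect to the RabbitMQ broker (address is an assumption).
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="crawl_jobs", durable=True)

# Publish one crawl job; worker nodes would consume these and call
# their local scrapyd's schedule.json.
channel.basic_publish(
    exchange="",
    routing_key="crawl_jobs",
    body=json.dumps({"project": "myproject", "spider": "somespider"}),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```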
-
My team is working on a set of scrapy spiders which we want to deploy to a scrapyd server. Our scrapyd server is configured to use an oauth2 proxy to authenticate traffic.
On all of our requests to o…
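For context, each call looks roughly like the sketch below; the host name, token variable, and bearer scheme are assumptions about our proxy's configuration:
```
import requests

TOKEN = "..."  # access token issued by the oauth2 provider (assumption)

# Every scrapyd API call goes through the oauth2 proxy, so we attach
# the Authorization header to each request.
resp = requests.get(
    "https://scrapyd.example.com/listprojects.json",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
print(resp.status_code, resp.text)
```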
-
Hi all:
since the scrapyd server has added the new daemonstatus.json API
here is the code:
https://github.com/scrapy/scrapyd/blob/master/scrapyd/webservice.py
here is the doc:
https://scrapyd.re…
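Until a client library supports it, a minimal sketch of calling the endpoint directly (localhost:6800 is scrapyd's default address):
```
import requests

# daemonstatus.json reports the node name and job counts.
resp = requests.get("http://localhost:6800/daemonstatus.json")
print(resp.json())
# e.g. {"node_name": "mynode", "status": "ok",
#       "pending": 0, "running": 1, "finished": 5}
```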
-
A small number of error cases are handled (i.e. an error API response is returned).
Add a default error handler that returns what scrapyd would return on an error.
And handle more cases with helpful …
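As a starting point, a sketch of what such a default handler could look like; the decorator and the endpoint it wraps are hypothetical, and only the {"status": "error", "message": ...} payload shape mirrors scrapyd's actual error responses:
```
import functools

def default_error_handler(func):
    # Hypothetical decorator: turn any unhandled exception into the
    # {"status": "error", "message": ...} payload scrapyd's API uses.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            return {"status": "error",
                    "message": f"{type(exc).__name__}: {exc}"}
    return wrapper

@default_error_handler
def schedule_endpoint(params):
    # Hypothetical endpoint handler, used only to demonstrate the wrapper.
    if "project" not in params:
        raise KeyError("the 'project' parameter is required")
    return {"status": "ok"}
```
Calling `schedule_endpoint({})` then returns the error payload instead of raising.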
-
Extracting the main article text with a text-density algorithm: http://www.weixinxi.wang/open/extract.html
Preventing the crawler from being banned: http://www.cnblogs.com/rwxwsblog/p/4575894.html
-
Hi, according to the following links
https://doc.scrapy.org/en/latest/topics/spiders.html#spiderargs
https://scrapyd.readthedocs.io/en/stable/api.html#schedule-json
Params can be …
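For example, a sketch of passing a spider argument through schedule.json; `category` is a hypothetical argument name, and anything beyond scrapyd's own parameters is handed to the spider:
```
import requests

# Fields other than scrapyd's own (project, spider, setting, jobid,
# priority, _version) are passed to the spider as arguments.
resp = requests.post(
    "http://localhost:6800/schedule.json",
    data={
        "project": "myproject",   # placeholder project name
        "spider": "somespider",   # placeholder spider name
        "category": "books",      # hypothetical spider argument
    },
)
print(resp.json())
```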