-
Hello,
Thank you for your fantastic project. We are facing a really hard-to-solve bug while running Scapy inside a Celery task. Sometimes we get this error:
```
Unhandled Error
Traceback (most re…
-
A quick test with llama3 to find out what there is to study in Python.
-
Currently, the crawler keeps all records that are read in, whether or not they get indexed. The crawler should instead keep only the data that indexes to a comid.
When a crawl fin…
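The filtering step described above can be sketched as follows. This is only an illustration of the intended behavior: `index_to_comid` and the `reach_key` field are hypothetical names, since the issue does not specify how a record is matched to a comid.

```python
def index_to_comid(record, comid_index):
    """Return the comid a record indexes to, or None if it does not index.

    `comid_index` is a hypothetical lookup table; the real matching logic
    lives in the crawler.
    """
    return comid_index.get(record.get("reach_key"))

def filter_indexed(records, comid_index):
    """Keep only records that successfully index to a comid; drop the rest."""
    kept = []
    for record in records:
        comid = index_to_comid(record, comid_index)
        if comid is not None:
            record["comid"] = comid
            kept.append(record)
    return kept
```

With this approach, records that fail to resolve are dropped at read time rather than carried through the rest of the crawl.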
-
## Description
The call to `make test` fails within the Docker step at `pip install --upgrade pip`. TravisCI should be modified to utilize the steps in the `Makefile` so the testing environment i…
-
LogCounterHandler increases the crawler's log_count stats for each record, but it should only increase them for logs from the crawler that created it. This is an issue if you're running several Crawlers in…
kmike updated 2 months ago
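One way to express the intended behavior with the standard `logging` module is a handler that ignores records from other crawlers. This is a sketch, not the project's actual implementation: the `crawler` record attribute and `crawler_name` key are hypothetical stand-ins for however the real code identifies the owning crawler.

```python
import logging

class CrawlerLogCounter(logging.Handler):
    """Count log records, but only those emitted for the owning crawler."""

    def __init__(self, crawler_name):
        super().__init__()
        self.crawler_name = crawler_name
        self.counts = {}

    def emit(self, record):
        # Skip records that belong to a different crawler.
        if getattr(record, "crawler", None) != self.crawler_name:
            return
        level = record.levelname
        self.counts[level] = self.counts.get(level, 0) + 1

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)
handler = CrawlerLogCounter("spider_a")
logger.addHandler(handler)

logger.info("counted", extra={"crawler": "spider_a"})
logger.info("ignored", extra={"crawler": "spider_b"})
```

Scoping the count to the creating crawler this way keeps per-crawler stats correct even when several crawlers log through the same logger hierarchy.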
-
It would be possible to use regex to try to find anchors, CSS, and JS, but this could end up being very messy. I'd suggest using an HTML-parsing library but, since Python is super new to me, I don't k…
nwtn updated 10 years ago
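For what it's worth, the standard library's `html.parser` can already do this without any third-party dependency. Below is a minimal sketch that collects anchor hrefs, stylesheet links, and script sources; the class name and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser

class AssetCollector(HTMLParser):
    """Collect anchor hrefs, stylesheet links, and script sources from HTML."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self.stylesheets = []
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.anchors.append(attrs["href"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and "href" in attrs:
            self.stylesheets.append(attrs["href"])
        elif tag == "script" and "src" in attrs:
            self.scripts.append(attrs["src"])

collector = AssetCollector()
collector.feed(
    '<a href="/page">x</a>'
    '<link rel="stylesheet" href="style.css">'
    '<script src="app.js"></script>'
)
```

Unlike a regex approach, the parser handles attribute ordering, quoting, and nesting for free.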
-
### Description
`scrapy.shell.inspect_response` does not work with the `asyncio` reactor when using the `ipython` shell
### Steps to Reproduce
1. Create a spider with the following contents:
…
-
monitor.sh does not properly restart the API fetcher.
Thankfully, this code is quite easy! Based on the explanation from SO (http://stackoverflow.com/questions/696839/how-do-i-write-a-bash-script-to-resta…
-
I tried to add:
```
response = yield from asyncio.wait_for(
self.session.get(url, allow_redirects=False), 20)
```
instead of
```
response = yield from self.…
```
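For reference, here is a minimal self-contained sketch of how `asyncio.wait_for` bounds a coroutine with a timeout, using `asyncio.sleep` as a stand-in for the real `self.session.get` call (and modern `async`/`await` syntax rather than `yield from`):

```python
import asyncio

async def fetch(delay):
    # Stand-in for a network call; sleeps instead of doing real I/O.
    await asyncio.sleep(delay)
    return "ok"

async def main():
    # Completes well within the 20-second budget, like the snippet above.
    fast = await asyncio.wait_for(fetch(0.01), timeout=20)

    # Exceeds a deliberately tiny budget and raises TimeoutError.
    try:
        await asyncio.wait_for(fetch(0.5), timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return fast, timed_out

result = asyncio.run(main())
```

`wait_for` cancels the inner coroutine when the timeout expires, so the caller must be prepared to handle `TimeoutError` where the awaited result would otherwise be used.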
-
There are a few cases where the crawler was not able to take screenshots. We should figure out why and fix any issues we find.
The files under `data/` are in the format `WEBCOMPAT-ID_E…