-
I just noticed in the otto log files that next-page extraction is not working as expected for some pages, and the following error shows up:
```
2023-01-21 04:38:57 [scrapy.core.scraper] ERROR…
-
## Summary
The documentation states that, with the `JOBDIR` setting, each spider should use its own directory, but nothing is currently in place to handle this automatically, as there is fo…
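
A minimal sketch of one way to derive a separate directory per spider, assuming a hypothetical helper `jobdir_for` and a hypothetical base directory `crawls`; neither is part of Scrapy. The resulting mapping could be exposed through a spider's `custom_settings`, or the path passed with `-s JOBDIR=...` on the command line.

```python
import os


def jobdir_for(spider_name, base="crawls"):
    """Return a per-spider job directory path.

    Hypothetical helper, not a Scrapy API: it only joins a base
    directory (assumed here to be "crawls") with the spider name.
    """
    if not spider_name:
        raise ValueError("spider_name must be non-empty")
    return os.path.join(base, spider_name)


def jobdir_settings(spider_name, base="crawls"):
    """Build a settings mapping a spider could use as custom_settings."""
    return {"JOBDIR": jobdir_for(spider_name, base)}
```

Because the path is derived from the spider name, two spiders in the same project never share job state, which is what the documentation asks for.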
-
Currently, Docker / Kubernetes logs are used for logging. This is sometimes good enough, but in many situations it is not. These logs are often truncated at night (and potentially more often when grown to a…
-
When preparing a crawl with either Words, Titles, or Authors, the server returns the following error:
```
--- ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 134, in mayb…
-
I followed the _[Documentation](http://portia.readthedocs.io/en/latest/installation.html#developing-portia-using-docker)_ and chose to install Docker. I can run Portia using
`docker run -i -…
-
### Description
A spider that inherits from `SitemapSpider` parses sites' sitemaps, starting from `robots.txt`, and has `JOBDIR` set.
I run it as a `CentOS 8.x` service with a unit file defined and it runs …
-
Scrapy reads the following environment variables into settings named after them without the `SCRAPY_` prefix:
- `SCRAPY_CHECK`
- `SCRAPY_PICKLED_SETTINGS_TO_OVERRIDE`
- `SCRAPY_PROJECT`
- `SCRAP…
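
The mapping described above can be illustrated with a small sketch. This is not Scrapy's actual implementation: `settings_from_env` and `KNOWN_VARS` are hypothetical names, and only the variables visible in the list above are included.

```python
import os

PREFIX = "SCRAPY_"

# Only the names visible in the list above; the rest of the list is not shown.
KNOWN_VARS = (
    "SCRAPY_CHECK",
    "SCRAPY_PICKLED_SETTINGS_TO_OVERRIDE",
    "SCRAPY_PROJECT",
)


def settings_from_env(environ=None, names=KNOWN_VARS):
    """Map listed environment variables to settings without the prefix.

    Hypothetical helper: it mirrors the behavior described in the text
    and does not call into Scrapy.
    """
    environ = os.environ if environ is None else environ
    return {
        name[len(PREFIX):]: environ[name]
        for name in names
        if name in environ
    }
```

For example, an environment containing `SCRAPY_PROJECT=demo` would yield a setting named `PROJECT` with the value `demo`, while unrelated variables are ignored.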
-
Python 3.6, Scrapy 1.5, Twisted 17.9.0
I'm running multiple spiders in the same process per:
https://doc.scrapy.org/en/latest/topics/practices.html#running-multiple-spiders-in-the-same-process
…
-
There is a good deal of confusion among users when they encounter the following errors due to scrapyd not being installed. See [1](https://github.com/scrapinghub/portia/issues/786),[2](https://www.bo…
-
The ultimate goal is to be able to store, version, and deploy rulesets across development, test, and production environments. This story should implement the basics for this goal -- as soon as we can seria…