-
When sending a [Slack message](https://github.com/scrapinghub/spidermon/blob/master/spidermon/contrib/actions/slack/__init__.py#L111) we don't have an easy way to provide extra arguments to [postMessa…
-
Some [dependency versions](https://github.com/pangaea-data-publisher/fuji/blob/master/pyproject.toml) are quite dated. Let's update those.
## Versions
```console
$ python -m pip list --outdated
…
```
-
I noticed that importing this library is very slow (500-1000 ms on my system).
Given that the library is probably less complex than e.g. `numpy` or similarly large packages: Do you think the import t…
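
For reference, here is a minimal sketch of how such an import time could be measured from Python. The helper name `time_import` is hypothetical, and the standard-library module `json` only stands in for the library in question so that the snippet runs anywhere. Only the target module and its submodules are evicted from `sys.modules`, so dependencies that are already loaded stay cached and the figure likely underestimates a truly cold import.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return wall-clock seconds spent importing `module_name` in this process."""
    # Evict the module and its submodules so the import really runs again.
    prefix = module_name + "."
    for name in list(sys.modules):
        if name == module_name or name.startswith(prefix):
            del sys.modules[name]

    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # `json` is a placeholder; substitute the package being investigated.
    elapsed = time_import("json")
    print(f"import json took {elapsed * 1000:.1f} ms")
```

For a per-module breakdown in a fresh interpreter, CPython 3.7+ also accepts `python -X importtime -c "import json"`, which writes cumulative import timings to stderr.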
-
### What happened?
When I set `SE_NODE_SESSION_TIMEOUT` to 600, the session was still closed after 300 seconds.
### Command used to start Selenium Grid with Docker
My `docker-compose.yaml`:
```yaml
version…
```
-
Because the code now uses features such as f-strings. Examples:
- https://github.com/scrapinghub/hcf-backend/blob/67f066516f02e548ec4a5c8c7dab6cc876304ed5/hcf_backend/utils/hcfpal.py#L…
-
1. Started Portia with Docker: `docker run -v ~/portia_projects:/app/data/projects:rw -p 9001:9001 scrapinghub/portia`
2. Configured a test spider
3. Tried to actually run the spider (from the UI…
-
Following the instructions in the [official documentation](https://frontera.readthedocs.io/en/latest/topics/scrapy-integration.html) on _Using the Frontera_ with Scrapy throws an exception with `CrawlSpider`.
spider…
-
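There are not many proxy websites at the moment, so very few usable proxy IPs are available. I have also tried scanning for proxies, but that approach is rather inefficient.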
-
## Background
HCFCrawlManager's main workflow loop checks running or pending jobs of the same spider to determine which slots are available.
```python
def workflow_loop(self):
avai…
```
-
## Issue
`HCFPalScript` ignores a different project ID passed via `--project-id`.
This is probably happening because the `HCFPal()` instance created inside `HCFPalScript.__init__()` isn't …