-
```
What steps will reproduce the problem?
1.
SLES 11.3 with slightly patched 3.16 kernel
Linux memcached9 3.16.3-4.1.100-default #1 SMP Thu Sep 18 06:32:16 UTC 2014
(d2bbe7f) x86_64 x86_64 x86_64 GN…
-
### Description
It's observed that custom [`DNSCACHE_ENABLED`](https://docs.scrapy.org/en/latest/topics/settings.html#dnscache-enabled) is not respected when specified as part of [`Spider.custom_…
-
### Validations
- [X] I believe this is a way to improve. I'll try to join the [Continue Discord](https://discord.gg/NWtdYexhMs) for questions
- [X] I'm not able to find an [open issue](https://githu…
-
SEO was discussed in issue #101 for the Google crawler, but this issue concerns the Facebook and Twitter crawlers. Neither interprets JavaScript, which makes it impossible to dynamically…
-
It should be possible to enhance the current implementations by parsing the results from the crawler into proper HTML. Right now the crawler only loads whole pages into a large string and extracts p…
-
- Accept an argument from the user, something like `url_list`.
- Crawl only the URLs provided by the user in that argument and nothing else.
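
A minimal sketch of how the two points above could fit together, not the project's actual implementation. Only the name `url_list` comes from the request. The comma-separated format, the helpers `parse_url_list`, `crawl` and `build_parser`, and the injected `fetch` callable are all assumptions made for illustration.

```python
import argparse
from urllib.parse import urlparse


def parse_url_list(raw):
    """Split a comma-separated string into a de-duplicated list of http(s) URLs.

    The comma-separated format is an assumption; the request only names `url_list`.
    """
    urls = []
    for item in raw.split(","):
        url = item.strip()
        if not url:
            continue
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
        if url not in urls:
            urls.append(url)
    return urls


def crawl(urls, fetch):
    """Fetch exactly the given URLs; links found in the responses are not followed.

    `fetch` is a hypothetical callable (url -> page body) standing in for the
    crawler's real download step.
    """
    return {url: fetch(url) for url in urls}


def build_parser():
    """Command-line parser exposing the hypothetical --url_list option."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--url_list", required=True, type=parse_url_list)
    return parser
```

Passing the download step in as `fetch` keeps the "and nothing else" rule easy to check: `crawl` never sees page content, so it has no way to discover or follow additional links.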
-
Server is getting slammed with crawlers trying to find stuff + 404 errors
Gunicorn Server Hooks
http://stackoverflow.com/questions/40951861/how-to-use-variables-created-in-gunicorns-server-hooks
…
-
Implement the following for the Swift SDK.
## Service actions
Service actions can either be pulled out as individual functions or can be incorporated into the scenario, but each service action m…
-
Issue to track improvements/ideas for URL Scraping & Ingestion
- [ ] Add custom cookie support
- [ ] Instructions for adding custom browser-addons to the scraping browser
- [ ] Support for identi…
-
### ZIM(s) location
https://library.kiwix.org/viewer#theworldfactbook_en_all_2023-12/A/www.cia.gov/the-world-factbook/
### Recipe(s) URL
https://farm.openzim.org/recipes/CIAworldfactbook_en_all/edi…