-
- [x] I have checked the existing issues to avoid duplicates
- [x] I have redacted any info hashes and content metadata from any logs or screenshots attached to this issue
### Is your feature requ…
-
Chrome [recently added in v101](https://developer.chrome.com/blog/new-in-devtools-101/#recorder) a new framework-agnostic [JSON user script export](https://developer.chrome.com/docs/devtools/recorder/…
-
I would like to build an app with scheduled data-source Azure Functions that queue up data for later processing. I would like each function to scrape data and then upload the results to a blob wit…
-
```
What steps will reproduce the problem?
1. Run the Basic Crawler with RobotServer enabled
2. Have "addeasy.netfirms.com" as the seed
What is the expected output? What do you see instead?
Expectati…
-
### Describe the bug
I'm seeing this error a lot in the logs when crawling `testphp.vulnweb.com` with the AJAX spider and Chrome.
```
ERROR: 'Namespace for prefix 'xlink' has not been declared…
-
It would be nice if I could parse dynamic endpoints (in SPIDER_SETTINGS), like: 'endpoint': 'crawl/'
-
Sometimes I feel like Scrapy is missing per-request delays. Are there any reasons why they weren't implemented?
Where per-request delays could be used:
- to add exponential backoff for the retry request
- to add…
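
To make the exponential backoff case concrete, here is a minimal sketch of how the delay for each retry might be computed. This is not Scrapy API: `backoff_delay` and its parameters (`base`, `factor`, `max_delay`, `jitter`) are hypothetical names chosen for illustration, and how the delay would be attached to a request is left open.

```python
import random


def backoff_delay(retry_count, base=1.0, factor=2.0, max_delay=60.0, jitter=0.0):
    """Return the delay in seconds to wait before retry number `retry_count` (0-based).

    Hypothetical helper, not part of Scrapy. The delay grows geometrically
    and is capped at `max_delay`; `jitter` optionally adds a random fraction
    of the delay on top, so retries from many requests do not line up.
    """
    delay = min(max_delay, base * factor ** retry_count)
    if jitter:
        delay += random.uniform(0, jitter * delay)
    return delay


# Delays for the first five retries, without jitter.
delays = [backoff_delay(n) for n in range(5)]
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

With a per-request delay, a retry middleware could pass something like `backoff_delay(retry_count)` along with the retried request instead of relying on a single global delay.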
-
Tried searching for a way to stop triggering Google Analytics on every scenario that gets run (having three different viewports also triggers a visitor for each test). This is probably an easy thing…
-
```
What steps will reproduce the problem?
1. Create a web-page with a malformed URL (or a protocol like mailto:)
2. Run the crawler on said website.
3. Crash and burn at line 89 in WebURL.java - this…
-
Hello,
I would like some advice.
I'm building a web crawler for different uses, so I put the generic code in a library. Basically it is a task queue that will fetch web pages and give them to …