-
Hi,
I would like to purchase the warm cache plugin but have some questions.
I have a single Instagram-like page with 1000+ high-quality images. I am actually quite surprised how well this work…
-
## Summary
It should be possible to update some of a job's settings while the job is running. This would be especially useful for the settings related to crawling speed.
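
To make the request concrete, here is a minimal sketch of what a live-updatable speed setting could look like. All names (`LiveSettings`, `get_delay`, `set_delay`, `crawl`) are hypothetical and not taken from any existing crawler; the sketch only illustrates a running job re-reading a setting before each request instead of caching it at start.

```python
import threading


class LiveSettings:
    """Thread-safe holder for a setting that may change mid-job (hypothetical)."""

    def __init__(self, delay_seconds: float) -> None:
        self._lock = threading.Lock()
        self._delay_seconds = delay_seconds

    def get_delay(self) -> float:
        with self._lock:
            return self._delay_seconds

    def set_delay(self, delay_seconds: float) -> None:
        if delay_seconds < 0:
            raise ValueError("delay must be non-negative")
        with self._lock:
            self._delay_seconds = delay_seconds


def crawl(urls, settings, fetch, sleep):
    """Fetch each URL, re-reading the delay before every request.

    `fetch` and `sleep` are passed in as callables so the loop stays
    independent of any particular HTTP client (an assumption of this sketch).
    Returns the delay that was applied before each request.
    """
    delays_used = []
    for url in urls:
        delay = settings.get_delay()  # read live, not cached at job start
        delays_used.append(delay)
        sleep(delay)
        fetch(url)
    return delays_used
```

With this shape, another thread (or an API handler) could call `settings.set_delay(...)` while `crawl` is running, and the next request would use the new value.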
## Motivation
I have e…
-
```
What steps will reproduce the problem?
1. Create a web-page with a malformed URL (or a protocol like mailto:)
2. Run the crawler on said website.
3. Crash and burn at line 89 in WebURL.java - this…
-
all links =
[
"/",
"/mobile/separate_desktop",
"/mobile/desktop_with_AMP_as_mobile",
"/mobile/separate_desktop_with_different_h1",
"/mobile/separate_desktop_with_different_t…
-
```
What steps will reproduce the problem?
1. Create a web-page with a malformed URL (or a protocol like mailto:)
2. Run the crawler on said website.
3. Crash and burn at line 89 in WebURL.java - this…
-
**Feature request**
Hello, would it be possible to add [capsolver](https://www.capsolver.com/) as another option for captcha solving?
Do you think others might benefit from this as well?
Yea, will be…
-
With browsertrix-crawler, a user can use `combineWARC` to write contextual information defined in the `warcinfo` property into the destination warc. When the warc is read, the fields defined in the pr…
-
As discussed in https://github.com/crwlrsoft/crawler/issues/99#issuecomment-1739671602 it would be nice to be able to use the [RetryErrorResponseHandler](https://github.com/crwlrsoft/crawler/blob/main…