-
Provide a way to adjust the currently hardcoded HTTP request timeouts. Corresponding fields are also needed in the custom HTTP profile JSON file definitions, `request_wait` and `crawl_request_…
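
As a rough illustration of the request, here is a minimal Python sketch of reading a timeout from a profile's JSON and falling back to a default when the field is absent. Only `request_wait` comes from the text above; the second field name is cut off in the source, so it is left out. The function name `load_timeouts`, the 30-second default, and the unit (seconds) are assumptions made for the example, not the project's actual behavior.

```python
import json

# Assumed default, in seconds; the real hardcoded value is not given in the issue.
DEFAULT_TIMEOUTS = {"request_wait": 30.0}


def load_timeouts(profile_json: str) -> dict:
    """Return timeouts from a profile JSON string, using defaults for missing fields.

    Hypothetical helper: only the `request_wait` key is taken from the issue text.
    """
    profile = json.loads(profile_json)
    timeouts = dict(DEFAULT_TIMEOUTS)
    for key in DEFAULT_TIMEOUTS:
        if key in profile:
            value = float(profile[key])
            if value <= 0:
                raise ValueError(f"{key} must be positive, got {value}")
            timeouts[key] = value
    return timeouts


if __name__ == "__main__":
    # A profile without the field keeps the default; one with it overrides it.
    print(load_timeouts('{"name": "custom"}'))
    print(load_timeouts('{"name": "custom", "request_wait": 5}'))
```

Keeping the defaults in one table means a profile that omits the field behaves exactly as the hardcoded version does today.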
-
All of the search results are categorized under `docs`, such as https://docs.taichi.graphics/search?q=field or https://docs.taichi.graphics/search?q=high%20performance%20computing, which is confusing.…
-
It would be nice to have an OnFinally handler that is executed after both the OnError and OnScraped handlers.
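
The intended ordering can be sketched in Python with a small stand-in collector. This is not the library's API: `MiniCollector`, `visit`, and the `fetch` callable are hypothetical names used only to show that a finally-style handler would run once per request, after whichever of the error or scraped handlers fired.

```python
from typing import Callable, List


class MiniCollector:
    """Hypothetical stand-in for a crawler collector, used only for illustration."""

    def __init__(self) -> None:
        self._on_scraped: List[Callable[[str], None]] = []
        self._on_error: List[Callable[[str, Exception], None]] = []
        self._on_finally: List[Callable[[str], None]] = []

    def on_scraped(self, fn: Callable[[str], None]) -> None:
        self._on_scraped.append(fn)

    def on_error(self, fn: Callable[[str, Exception], None]) -> None:
        self._on_error.append(fn)

    def on_finally(self, fn: Callable[[str], None]) -> None:
        self._on_finally.append(fn)

    def visit(self, url: str, fetch: Callable[[str], None]) -> None:
        """Run `fetch`, then the matching handlers, then the finally handlers."""
        try:
            fetch(url)
        except Exception as exc:
            for fn in self._on_error:
                fn(url, exc)
        else:
            for fn in self._on_scraped:
                fn(url)
        finally:
            # Runs after either branch above, once per request.
            for fn in self._on_finally:
                fn(url)
```

A typical use would be releasing a per-request resource or decrementing an in-flight counter in the finally handler, instead of duplicating that code in both the error and the scraped handlers.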
-
For example, I was thinking of using this library to crawl a single site for pages.
This library looks great by the way - much higher quality than any of the other existing crawler libraries I've i…
-
Hi.
I'm wondering if it's possible to use the link checker example to just check for valid links, and maybe store them in a JSON or CSV file instead of creating binary files and index.html files i…
-
Hi,
I am trying to set up a new Norconex connector for a page.
I am having trouble reading URLs in the page, such as those in the index bar:
![image](https://user-images.githubusercontent.com/29…
-
Hi,
### Issue details
Fetching articles from nasa.gov fails.
### Environment
* wallabag.it
* f43.me
### Steps to reproduce/test case
* Add [this article](https://www.nasa.gov/featur…
-
Here at the September Unidata Users Committee meeting, Unidata Director Mohan listed "Data Discoverability" as a major potential theme for the 2016 Strategic Plan. I agree this would be a great thin…
-
I'm attempting to use interactive profile creation. I'm using the snippet suggested in the README verbatim:
```
docker run -p 6080:6080 -p 9223:9223 -v $PWD/crawls/profiles:/crawls/profiles/ -it web…
-
Submitted by @stevygee: https://gist.github.com/stevygee/ed18db68a247c2cd4fb7680d39d41502
Consider whether this belongs in the Advanced Detection add-on or in a WPML-specific add-on, if other functionality makes sense to…