-
The status table shows SSLHandshakeExceptions that do not occur in browser requests (e.g. for `https://dspace-clarin-it.ilc.cnr.it/repository/xmlui/bitstream/handle/20.500.11752/OPEN-531/derivationa…
-
ISSUE:
com.digitalpebble.stormcrawler.solr.persistence.SolrSpout
If the query results for a group are empty, an IndexOutOfBoundsException is thrown.
SOLUTION:
To fix, validate the result be…
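
The report above is truncated, so the following is only a minimal sketch of the kind of guard it seems to suggest: check that a group's document list is non-empty before reading its first element. The class `GroupResultGuard` and the method `firstOfGroup` are hypothetical names for illustration; they are not part of `SolrSpout` or SolrJ, and plain `List` values stand in for the Solr group results.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class GroupResultGuard {

    // Hypothetical helper: returns the first document of a group,
    // or an empty Optional when the group is null or has no documents.
    // Calling groupDocs.get(0) without this check is what raises
    // IndexOutOfBoundsException on an empty list.
    public static <T> Optional<T> firstOfGroup(List<T> groupDocs) {
        if (groupDocs == null || groupDocs.isEmpty()) {
            return Optional.empty();
        }
        return Optional.ofNullable(groupDocs.get(0));
    }

    public static void main(String[] args) {
        // Stand-ins for the documents returned for two groups.
        List<String> emptyGroup = Collections.emptyList();
        List<String> filledGroup = Arrays.asList("https://example.com/a", "https://example.com/b");

        // The empty group is skipped instead of throwing.
        System.out.println(firstOfGroup(emptyGroup).isPresent());
        System.out.println(firstOfGroup(filledGroup).orElse("none"));
    }
}
```

Returning an `Optional` lets the caller skip an empty group explicitly rather than relying on a caught exception; whether the actual fix takes this shape is an assumption.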
-
Hello, after the merge of my latest PR, the validation of the code format fails.
This results in a BUILD FAILURE when running `mvn clean install` on the latest SC SNAPSHOT version.
```
[ERROR] Failed to execute …
```
-
Hello all,
the following issue popped up while experimenting with the URL Frontier and a derivative of SC: When the URL Frontier service is killed and restarted, the serving of new URLs to th…
-
Hello all,
while crawling, we ran into a politeness issue, and we suppose that it was caused by a connection timeout when trying to fetch the `robots.txt`. We suppose that as a c…
-
[ ] Bug report
storm-crawler-solr 2.8
Class: com.digitalpebble.stormcrawler.solr.persistence.SolrSpout
Method: populateBuffer()
Solr: Solr 8.8.2 (cloud mode)
**Issue**: Collapse and Expand Res…
-
# WHAT
If I have a set of 5 remote PhantomJS instances running and one of them is not working properly, this line of code will throw an error, and my crawler will misbehave and show exceptions in the log.…
-
The pom files in the archetype should be updated to Java 11 rather than JDK 8.
https://github.com/DigitalPebble/storm-crawler/blob/master/external/elasticsearch/archetype/src/main/resources/arc…
-
Similar to the one using Elasticsearch, but it should save users the trouble of renaming parameters and components when copying the ES one.
-
I described the issue at https://github.com/DigitalPebble/storm-crawler/issues/981. The code revealing the problems (plus a unit test) can be found here:
- https://github.com/FelixEngl/storm-crawler/blob/…