-
The documentation says to configure `URLStatusCrawlerEventListener` as follows:
```xml
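<listener class="com.norconex.collector.http.crawler.event.impl.URLStatusCrawlerEventListener">
  <statusCodes>404</statusCodes>
  <outputDir>/report/path/</outputDir>
  <fileNamePrefix>brokenLinks</fileNamePrefix>
</listener>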
```
This is great if I wanted to output these…
-
Common services such as database, dataverse, and workload managers can be added automatically to the Compose agent as available components.
To use this concept, it is necessary to define a special co…
-
**ISSUE**
We're running ECS clusters in Rails' default environment, `RAILS_ENV=development`. For performance and security reasons, we should be running in Rails production mode.
**ACCEPTANCE**
- [x] Rail…
-
By the end of this year I intend to start writing an integration for Solr's [analytics component](https://lucene.apache.org/solr/guide/7_2/analytics.html).
Ideally this component gets integrated in…
-
Hello,
Our Solr instance employs basic authentication. This means that all clients need to specify valid credentials for communication. Further details can be found at https://lucene.apache.org/sol…
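
To illustrate what "specifying valid credentials" means at the HTTP level, here is a minimal sketch that assumes standard HTTP Basic authentication (RFC 7617). The function name `basic_auth_header` and the credentials in the usage line are hypothetical; this is not how any particular Solr client or committer implements it.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header for HTTP Basic authentication.

    The credentials are joined as "username:password", encoded as UTF-8,
    and then Base64-encoded, as described in RFC 7617.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical usage: pass the result as extra headers on every request.
headers = basic_auth_header("some-user", "some-password")
```

Basic authentication only encodes the credentials and does not encrypt them, so a header like this should only be sent over HTTPS.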
-
After overcoming my previous bad certs issues (https://github.com/Norconex/committer-solr/issues/13) I was still stuck with the fact that I couldn't commit docs to a SolrCloud cluster that lives behin…
-
We have two environments using the HTTP Collector: production and QA. In our production environment everything is working fine, but the same configurations in our QA environment are causing special char…
-
Let's say I crawl my site once, then notice that I crawled a section that I shouldn't have. So I add some exclusion lines to my reference filters.
When I crawl my site again, will the HTTP Collecto…
-
Hi there - does Norconex support incremental web crawls that would:
- Download headers first using the PROPFIND HTTP call to find the last modified date
- Lookup the date / time of the previous cr…
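
The date comparison in the first two steps could be sketched roughly as follows. This is only an illustration of the idea, not how Norconex decides what to re-crawl: the function name `needs_recrawl` is hypothetical, and it assumes the last-modified value has already been fetched (for example from a `Last-Modified` header or a PROPFIND response) as an RFC 1123 date string.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def needs_recrawl(last_modified, previous_crawl):
    """Return True if the document should be downloaded again.

    last_modified  -- RFC 1123 date string, or None if the server sent none
    previous_crawl -- timezone-aware datetime of the previous crawl
    """
    if not last_modified:
        return True  # no date available: fetch to be safe
    try:
        modified = parsedate_to_datetime(last_modified)
    except (TypeError, ValueError):
        return True  # unparseable date: fetch to be safe
    if modified.tzinfo is None:
        # Treat a date without a usable offset as UTC (an assumption).
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > previous_crawl


# Hypothetical usage with a made-up previous crawl time.
previous = datetime(2015, 10, 21, 0, 0, tzinfo=timezone.utc)
print(needs_recrawl("Wed, 21 Oct 2015 07:28:00 GMT", previous))
```

Falling back to a full download whenever the date is missing or malformed keeps the sketch conservative: a document is skipped only when the server clearly reports that it has not changed since the previous crawl.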
-
Hello Pascal,
I am seeing multiple entries for the exact same URLs and each time I re-index the contents the crawler adds the same entry one more time. Please see the examples below:
"_langu…