-
Please add a normalization named **removeTrailingHash**, similar to **removeTrailingQuestionMark** and **removeTrailingSlash**.
It would remove a trailing hash sign ("#").
http://www.example.com/display# → http://www.ex…
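To make the request concrete, here is a minimal sketch of the behaviour being asked for. The class and method names are hypothetical; this is a standalone helper, not part of the existing normalizer API, and it assumes that only a single empty fragment marker at the very end of the URL should be dropped.

```java
public final class TrailingHashSketch {

    private TrailingHashSketch() {
    }

    /**
     * Returns the URL without its last character when that character is "#".
     * URLs with a non-empty fragment (e.g. "page#top") are left untouched.
     */
    public static String removeTrailingHash(String url) {
        if (url != null && url.endsWith("#")) {
            return url.substring(0, url.length() - 1);
        }
        return url;
    }

    public static void main(String[] args) {
        // prints "http://www.example.com/display"
        System.out.println(removeTrailingHash("http://www.example.com/display#"));
    }
}
```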
-
Hello,
I have tried to use the collector in combination with the AWS CloudSearch Committer.
I still have two problems:
1) Is there any chance to commit the crawled results again to AWS in case…
-
Would be great to be able to run an action like:
. ./collector-http.sh -a checkcfg -c test.xml
or -a test or -a dry-run or -a check or -a configtest
That would just (syntactically) validate the con…
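As a rough illustration of what a purely syntactic check could look like, the sketch below tests whether a configuration string is well-formed XML using the JDK's built-in parser. The class and method names are hypothetical, and this is an assumption about what "syntactic" means here: it does not know about collector-specific tags, classes, or any templating applied before the XML is parsed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public final class ConfigSyntaxCheckSketch {

    private ConfigSyntaxCheckSketch() {
    }

    /** Returns true when the given text parses as well-formed XML. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document: not well-formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Could not run the XML check", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<httpcollector><crawlers/></httpcollector>"));
        System.out.println(isWellFormed("<httpcollector><crawlers></httpcollector>"));
    }
}
```

The first call in `main` uses a balanced document and the second leaves a tag unclosed, so the two calls show the accepted and rejected cases.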
-
Some domains may be more important than others, and the link text may indicate that one link may be more important than others. I think the queue would also have to be changed so that it uses either…
-
Using version 3.0.0-SNAPSHOT
When executing a command like this: `GET /index/type/_mapping/field/content`, I see this:
```
{
"index": {
"mappings": {
"type": {
"content": {
…
-
How can I implement an incremental crawler using Norconex HTTP Collector? I implemented a simple crawler with a max # of documents, and when I ran it again, it appeared to start exactly where it lef…
-
Command:
`collector-http.bat -a start -c examples/kbenp-web-elastic/kbenp-web-elastic-config.xml`
[kbenp-web-elastic-config.xml.txt](https://github.com/Norconex/committer-elasticsearch/files/649214…
-
[minimum-config.txt](https://github.com/Norconex/committer-solr/files/705232/minimum-config.txt)
I have created a new collection and modified the committer section in the attached config file to …
-
Hi Norconex team,
This is not an issue at all, but are there any plans to add JCIFS support to the filesystem crawler?
It would be great if the best crawler framework (it's not a joke! I've used many of the…
-
I am attempting to use the httpClientFactory in a complex-config.xml and I am not seeing any attempt by the crawler to authenticate at the server. I have Wireshark running, and it is just doing the …