-
Hi there,
I am wondering if it is possible to call an external application (a .jar file) from the collector while it is crawling. I am trying to use a machine learning model (SVM) to classify my docs.
than…
-
Hi,
I just ran the following bogus configuration (changed RenameTagger to CopyTagger, but forgot to change the internal HTML-tag from "rename" to "copy")
Would …
-
It would be great to be able to run an action like:
. ./collector-http.sh -a checkcfg -c test.xml
or `-a test`, `-a dry-run`, `-a check`, or `-a configtest`.
That would just (syntactically) validate the con…
-
Using version 3.0.0-SNAPSHOT
When executing a request like this: `GET /index/type/_mapping/field/content`, I see this:
```
{
"index": {
"mappings": {
"type": {
"content": {
…
-
The program crashes if the reference does not start with strictly lowercase `http` and the `encodeNonURICharacters` rule is enabled. It does not matter which rules are activated, even if this is only on…
-
Hi Norconex team,
Not an issue at all, but are there any plans to add JCIFS support to the filesystem crawler?
It would be great if the best crawler framework (it's not a joke! I've used many of the…
-
Hi, I've been running a crawler for days now. Apparently, a timeout occurred on one page and the crawler has been stopped for more than 2 hours...
Is that the expected/normal behaviour? Isn't the problematic URL…
-
Hi there,
Could you please provide me with an example of the response Processor? I'm totally lost on this.
Thanks a lot.
-
Hi there,
I am wondering if it is possible to integrate BoilerPipe (https://github.com/kohlschutter/boilerpipe) into this collector. Could you please point me in the right direction?
Thanks,
Angelo
-
Hello :). Thank you.
It seems my crawler rejects 0-length pages at import.
But I want to collect them and show them in the commit result.
How can I handle this?
By using Numer…