-
I tested redirection with https://httpbin.davecheney.com/redirect/3. I added the following to the `` config:
``` xml
10
200,302
```
I get:
> INFO [AbstractC…
-
I want to modify the content of the document with JavaScript code.
For this I use [ScriptTagger](https://norconex.com/collectors/importer/latest/apidocs/index.html?com/norconex/importer/handler/tagger…
-
In the `createTableSQL` tag, I can set the fields that are available for the tagger, i.e. the fields that would go here:
``` xml
title,description,document.reference,google-site-verification
/t…
-
Hello,
I have added a document pre-parser to remove any content that appears between
menus but cannot get the function to work as a regex.
If I add "Skip" "content" then it will "Skip thi…
-
I am trying to split PST files (which contain entire mailboxes) into their elements: emails, attachments, contacts, calendar entries, etc.
Norconex is able to read the PST file, however it returns the enti…
-
Hello,
I wanted to set the title to a specific div, but noticed that using the DOMTagger with overwrite="true" actually appends the new value to the existing value in title field.
```xml
…
-
Hello,
I am trying to use a custom document filter in the Importer pipeline, the Pre-process document stage. My document filter implements only the IDocumentFilter interface.
When the pipeline is …
-
Hi, we have URLs in an XML sitemap where some date data also needs to be crawled along with the URLs from the sitemap.
![image](https://user-images.githubusercontent.com/29800957/34831885-49264…
-
Hi,
Is there a way to add a specific value to each of the `collector.referenced-urls` metadata values?
I want to retrieve the HTML `class` attribute of all referenced links from one document.
the resul…
-
Hi Pascal,
We have a long crawling process that takes approximately 4 days to complete a single crawl run. Today we got the following error message and the crawler stopped crawling: `Crawler stopped.`
…