-
Hello Pascal,
Looks like an issue with deletion: docs don't get deleted from ES when the `sourceReferenceField` parameter is used:
```
id
```
BTW, when "id" is equal to "document.reference", t…
-
Hi,
I am using GKE to run the Norconex crawler with this plugin.
There are about 6 crawler jobs, with MongoDB as the datastore. The jobs are shown below.
```
gcs_user@cloudshell:~/deployment (ai-gcsim…
-
Hi, I am using a Kubernetes CronJob to crawl different websites.
I followed the tutorial [here](https://www.norconex.com/how-to-run-norconex-collector-in-docker/) to create the Docker image and use…
-
When we use the idol-committer, it seems unable to extract information from documents smaller than around 8 KB.
For small documents neither metadata nor content is extracted, only the metadata represent…
-
Hello Pascal,
When crawling pages, on some occasions the .cntnt files are empty and the meta tags are not extracted.
To reproduce the behavior, please have a look at the following files:
* h…
-
Hello, I have set up a new Solr server, configured it to use TLS/SSL, and have successfully run Solr over HTTPS. I am also able to crawl the site using the Norconex crawler. But I am gettin…
-
When trying to run the crawler on an intranet I am getting: `com.norconex.importer.parser.DocumentParserException: org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apach…
-
Dear all,
I am running a connector instance configured with about 40 active crawlers. Sometimes I see the following in the log:
java.io.FileNotFoundException: D:\connettori_norconex\norconex-collector-http-2.8.1\.…
-
Dear Mr. Paul,
I am getting the following error while committing to MySQL 5.7:
> Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x91\x89 I...' for column 'content' at row…
-
This project has been very helpful, but I've hit a roadblock that I can't seem to get around. I've been able to configure the crawler to authenticate against a site and then begin to crawl. However,…