-
ok, I found this: https://opensource.norconex.com/collectors/http/v3/apidocs/com/norconex/collector/http/crawler/HttpCrawlerConfig.html
and was able to get it to pass the config test.
Now I am getti…
-
The Crawler doesn't seem to be pulling content for one of my sites. I can see every other field in the data but not the content. The only material difference between this config and my other working…
-
We've been running a network disk crawling process for a few months. We're using the Norconex library to crawl the disk, filter by metadata, and commit text to Elasticsearch. After a couple of itera…
-
Apologies for [commenting](https://github.com/Norconex/collector-http/issues/62#issuecomment-355258563) on a closed issue earlier.
If I try changing the line in `log4j.properties` to `log4j.rootLog…
-
Hi,
I'm getting a 503 error for most of the URLs while crawling the website, but if I load the URLs myself they work fine. Any idea what the issue might be?
-
I am suddenly getting this error in an application that had been running perfectly fine. Is this because of the Java version?
```
Nov 18, 2020 8:36:51 AM com.google.enterprise.cloudsearch.sdk.indexing.Inde…
-
Hi,
I am using the filesystem crawler with SQL committer. It works great!
Today I tried running two crawlers at the same time.
Consider `crawl-one.variables` which has an associated `crawl-on…
-
Hi,
Thanks again for sharing this crawler.
I feel like this should work
```
${workdir}\logs
${workdir}\progress
${workdir}
…
-
The documentation on the orphanStrategy is very ambiguous at the moment. The statuses appear to be:
IGNORE - do nothing
DELETE - if a path is no longer found then generate a DELETE to the committ…
-
Hi,
I just discovered the Norconex set of libraries and this looks like impressive software. I’d like to use it as a library in a Java program. After an extensive search through the documentation, …