-
I am attempting to use the httpClientFactory in a complex-config.xml, and I am not seeing any attempt by the crawler to authenticate with the server. I have Wireshark running, and it is just doing the …
-
Is it possible for this exception not to result in all orphans being removed?
-
Hello :). Thank you.
It seems my crawler is rejecting 0-length pages instead of importing them.
But I want to collect them and show them in the commit result.
How can I handle this?
By using Numer…
-
Hi, I encountered the following problem:
I have 2 crawlers:
1) The vnexpress crawler commits data into the vnexpress type in ElasticSearch.
2) The dantri crawler commits data into the dantri type in ElasticSe…
-
In the AbstractCrawler.java class, my crawler entered the processNextQueuedCrawlData() method and reached a case that has your TODO message _"Fire an event here? If we get here, the importer did not kick in"_.…
-
I'm intending to use the HTTP collector and have that configured to use the CloudSearch committer. What should I do in order to configure the index fields for CloudSearch?
I can see that CloudSearch …
-
Hello there!
I have been looking for a simple web crawler, and I found this project and liked it very much. The problem is that I can't find any useful tutorials for dummies and don't know how to …
-
Hi,
I am attempting to use this committer, and am getting the following error.
INFO [HttpCrawler] 1 start URLs identified.
INFO [CrawlerEventManager] CRAWLER_STARTED
INFO [Abstrac…
-
I've got the following as my crawler configuration:
```
http://wiki.linaro.org/
#parse("shared/importer-config.xml")
…
```
-
Hi there,
I am trying to configure a committer on a Solr system for the first time, but I did not manage to make it work, i.e. to get new documents added to Solr.
Could someone please help me?
th…