-
Using Norconex HTTP Collector + Elasticsearch committer.
```
[non-job]: 2018-05-22 14:58:15 INFO - Version: Norconex HTTP Collector 2.7.0-SNAPSHOT (Norconex Inc.)
[non-job]: 2018-05-22 14:58:15 INFO…
```
-
Hi,
I would like to extract experts' contact information from a site which dynamically generates a list of available experts.
I saved these dynamically created pages into a webpages-list containing fo…
-
Hi,
If I need a reference to the original input file in my own committer, how can I get that?
The add method contains a reference to the input stream after Tika extraction. What I want is the original input fil…
-
I have a question regarding continuous crawling (or scheduling, for that matter). I've read your post regarding similar topics here: https://github.com/Norconex/collector-http/issues/93. But it doe…
-
Hello! In reference to [#370](https://github.com/Norconex/collector-http/issues/370), I am trying to eliminate the MENU section of my HTML code; however, I am experiencing issues using the example pr…
-
I need to extract only certain types of files from a repository, for example .pdf, .ppt, ... I am using this configuration, but it does not work.
```xml
#set($http = "com.norconex.collect…
```
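One hedged sketch of restricting a crawl by file extension, assuming Norconex HTTP Collector 2.x: a reference filter of class `ExtensionReferenceFilter` (class path taken from Norconex Collector Core; verify it against your version) with `onMatch="include"` and a comma-separated extension list:

```xml
<!-- Sketch only: include-only reference filter for PDF and PowerPoint files.
     Class path and attribute names should be checked against your
     Norconex Collector Core version. -->
<referenceFilters>
  <filter class="com.norconex.collector.core.filter.impl.ExtensionReferenceFilter"
          onMatch="include">pdf,ppt,pptx</filter>
</referenceFilters>
```

Note that an include-only filter can also reject the HTML pages the crawler must traverse to discover those documents, so seemingly missing results may be the filter working as configured rather than a bug.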
-
Tried this:
```java
OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(outputPath), Charset.forName("UTF-8").newEncoder());
```
After the crawler stops, tried saving ...
`FilesystemCollec…
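For reference, a minimal, self-contained sketch of writing and reading a file as UTF-8; the `outputPath` here is a temporary file standing in for the poster's path, and `StandardCharsets.UTF_8` replaces the `Charset.forName("UTF-8")` lookup with a compile-time constant:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8WriteDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for the poster's outputPath (any writable file works).
        Path outputPath = Files.createTempFile("crawl-", ".txt");
        String content = "caf\u00e9 in UTF-8";
        // try-with-resources flushes and closes the writer even on errors,
        // which is a common cause of "empty file after crawler stops" issues.
        try (Writer osw = new OutputStreamWriter(
                new FileOutputStream(outputPath.toFile()),
                StandardCharsets.UTF_8)) {
            osw.write(content);
        }
        // Read back with the same charset to confirm the round trip.
        String back = new String(
                Files.readAllBytes(outputPath), StandardCharsets.UTF_8);
        System.out.println(back.equals(content));
    }
}
```

The try-with-resources block matters here: if the writer is not closed before the JVM exits, buffered bytes may never reach disk.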
-
I have a workflow problem. I want to "resume" my crawler every day, let it run most of the day, and then "stop" it.
However, the collector JVM is no longer executing when it is d…
-
With the following configuration (see below), the crawler handles some URLs just fine. But with a few common URLs it unexpectedly gives an error message (the real URL is intentionally chan…
-
I am having issues isolating different crawlers to different document types so I can commit to Elasticsearch. I want to be able to utilize different crawlers for PDF, XML, HTML, images, etc. What I wou…
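One pattern, sketched under assumptions: define one `<crawler>` per document type inside `<crawlers>`, each with its own extension filter and its own Elasticsearch committer pointing at a separate index. The class paths follow Norconex naming conventions, but the committer element names (`indexName` in particular) and the filter class path are assumptions to verify against your committer and collector versions:

```xml
<!-- Sketch: separate crawlers per document type, each committing to its
     own Elasticsearch index. Element names should be checked against
     your Norconex Elasticsearch Committer version. -->
<crawlers>
  <crawler id="pdf-crawler">
    <referenceFilters>
      <filter class="com.norconex.collector.core.filter.impl.ExtensionReferenceFilter"
              onMatch="include">pdf</filter>
    </referenceFilters>
    <committer class="com.norconex.committer.elasticsearch.ElasticsearchCommitter">
      <indexName>docs-pdf</indexName>
    </committer>
  </crawler>
  <crawler id="html-crawler">
    <referenceFilters>
      <filter class="com.norconex.collector.core.filter.impl.ExtensionReferenceFilter"
              onMatch="include">html,htm</filter>
    </referenceFilters>
    <committer class="com.norconex.committer.elasticsearch.ElasticsearchCommitter">
      <indexName>docs-html</indexName>
    </committer>
  </crawler>
</crawlers>
```

Each crawler keeps its own queue and its own committer, so documents of each type land in their own index without post-filtering on the Elasticsearch side.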