-
I just tried to upgrade from 2.7.1 to 2.8.0 in my test environment. I didn't touch my configuration (which works in 2.7.1) at all.
It looks like I don't get any successful imports. I get two kinds …
-
We need to filter documents without the header Content-Length **and** with the header Transfer-Encoding set to chunked. This is the importer configuration I came up with:
```
.*ch…
```
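Since the snippet above is cut off, here is a sketch of one way such a rule could be expressed in the Importer, assuming the 2.x `RegexMetadataFilter` class; the handler placement and the `onMatch="exclude"` semantics are my assumptions, not taken from the truncated config:

```
<importer>
  <preParseHandlers>
    <!-- Assumption: reject documents whose Transfer-Encoding header is "chunked" -->
    <filter class="com.norconex.importer.handler.filter.impl.RegexMetadataFilter"
        onMatch="exclude" field="Transfer-Encoding">chunked</filter>
  </preParseHandlers>
</importer>
```

The "missing Content-Length" half of the condition would need a second filter (possibly something like an empty-metadata filter) combined with this one; how the two conditions are AND-ed together in a given Importer version is worth verifying against its documentation.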
-
I think this is a Tika issue; I looked into it and it seems it was resolved before. I wonder if you have ever come across this error. The message I get is:
`WARN [Importer] Could not import https://xx…
-
I am receiving many errors on what look like files that contain an embedded file; in my case, .msg files (Exchange messages) containing attachments. Files (.msg) without attachments appear to be impo…
-
I want to extract the text present inside all the `` tags in the page I am crawling.
I have created a field named "pagecontent" with Collection(Edm.String) type and used the setting below to fetch the te…
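The setting itself is cut off above, but a common approach for this kind of extraction in Importer 2.x is a `TextBetweenTagger` run before parsing (so the raw HTML tags are still present). The `<p>` tag below is a hypothetical stand-in for the elided tag name:

```
<importer>
  <preParseHandlers>
    <!-- Hypothetical example: collect text between <p> and </p> into "pagecontent" -->
    <tagger class="com.norconex.importer.handler.tagger.impl.TextBetweenTagger">
      <textBetween name="pagecontent">
        <start>&lt;p&gt;</start>
        <end>&lt;/p&gt;</end>
      </textBetween>
    </tagger>
  </preParseHandlers>
</importer>
```

Multiple matches should each add a value to the field, which lines up with the Collection(Edm.String) type on the Azure Search side.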
-
[MyMetadataFetcher.zip](https://github.com/Norconex/collector-filesystem/files/1655329/MyMetadataFetcher.zip)
Hi,
Is there a way to fetch meta data from external properties file. We have integrate…
-
I am trying to crawl a sitemap XML file that includes bulk URLs and commit the documents to the Azure service. There will be more than 400 documents getting stored in the committer-queue directory.
Norcon…
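For context on the queue behavior: in the 2.x committers, documents accumulate in the committer-queue directory until a queue threshold triggers a commit, and are then sent in batches. A sketch of the relevant knobs, assuming the Azure Search committer and the common 2.x option names (both names are my assumption here, not taken from the truncated report):

```
<committer class="com.norconex.committer.azuresearch.AzureSearchCommitter">
  <!-- Assumed 2.x options: queueSize = how many docs queue on disk
       before a commit is triggered; commitBatchSize = how many docs
       are sent to the service per request -->
  <queueSize>400</queueSize>
  <commitBatchSize>100</commitBatchSize>
</committer>
```

Seeing 400+ files in the queue directory is expected if `queueSize` is at or above that number; they should drain once a commit is triggered.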
-
Hi,
I keep getting the following error when committing documents to AWS CloudSearch:
```
CloudSearch: 2017-11-20 15:15:40 INFO - Sending 10 documents to AWS CloudSearch for addition/deletion.
…
```
-
I am trying to rename a field, but it doesn't seem to work for me: the original "title" field is still in the data.
I tried both
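Since the two attempts are cut off above, here is a sketch of how a rename is typically done in Importer 2.x with `RenameTagger`; the target field name `doc_title` is hypothetical. One common gotcha is placement: renaming in `preParseHandlers` can appear not to work because the parser re-extracts a `title` field afterwards, so `postParseHandlers` is usually the safer spot:

```
<importer>
  <postParseHandlers>
    <!-- Rename "title" after parsing, so the parser cannot re-add it;
         "doc_title" is a hypothetical target field name -->
    <tagger class="com.norconex.importer.handler.tagger.impl.RenameTagger">
      <rename fromField="title" toField="doc_title" overwrite="true"/>
    </tagger>
  </postParseHandlers>
</importer>
```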
-
Hi Pascal,
Can you help me exclude the header and footer data of a page from being crawled?
Please find below the
[htmlfile _l2tm.txt](https://github.com/Norconex/collector-http/files/1208703/htmlf…
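Without seeing the attached page, a common way to drop header/footer content in Importer 2.x is a `StripBetweenTransformer` applied before parsing, keyed on markers that delimit those regions. The `<header>`/`<footer>` markers below are hypothetical; they would need to match whatever actually wraps those sections in the attached HTML:

```
<importer>
  <preParseHandlers>
    <!-- Hypothetical markers: remove everything between (and including)
         the page's header and footer delimiters before parsing -->
    <transformer class="com.norconex.importer.handler.transformer.impl.StripBetweenTransformer"
        inclusive="true">
      <stripBetween>
        <start>&lt;header&gt;</start>
        <end>&lt;/header&gt;</end>
      </stripBetween>
      <stripBetween>
        <start>&lt;footer&gt;</start>
        <end>&lt;/footer&gt;</end>
      </stripBetween>
    </transformer>
  </preParseHandlers>
</importer>
```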