norconex-importer Search Results

413 results for norconex-importer

Norconex/crawlers #132

URISyntaxException: Illegal character in path

Here you can see some exceptions I got: ``` MC (crawler): 2015-08-05 17:57:10 WARN - Could not queue extracted URL "http://www.feccoo-extremadura.org/ensenanzaextremadura/Areas_Comunes:Salud_Laboral_…

csaezl updated 9 years ago
18
Norconex/importer #9

OutOfMemoryError: GC overhead limit exceeded

What is best approach to fix this? Exception in thread "pool-1-thread-4" java.lang.OutOfMemoryError: GC overhead limit exceeded at org.apache.fontbox.ttf.TTFDataStream.readString(TTFDataStre…

OkkeKlein updated 9 years ago
52
Norconex/importer #14

Import RSS Feed

Hi, I want to collect pages from an RSS feed. This is my crawler, but I get no results. Please help me. ``` xml ./examples-output/minimum/progress ./examples-output/minimum/logs 4 1 -1 …

LyesHocine updated 9 years ago
11
Norconex/crawlers #106

java.lang.OutOfMemoryError: Java heap space

_Post from @csaezl, moved from https://github.com/Norconex/collector-http/issues/100#issuecomment-100172544_: I've got an error, perhaps not related to the issue itself: From the log: ``` INFO - Sen…

essiembre updated 9 years ago
7
Norconex/crawlers #77

Recrawling and checksums

Does the crawler only look at HttpMetadataChecksummer or also the documentChecksummer to decide whether to redownload pages? A combination of content and modified date would give better indication wh…

OkkeKlein updated 9 years ago
8
Norconex/crawlers #110

Configuration Question

Hi, I would like to know how to configure the collector to collect only images referenced in a URL. I am writing a custom committer that needs to send to a REST API all URLs in a website that have images in t…

hanabadler updated 9 years ago
2
Norconex/crawlers #138

ignoreExternalLinks="true" processes ExternalLinks

I have a collector with `` that processes the URL `http://www.fexb.es/`. At some time it processes `https://www.facebook.com/r.php?locale=es_ES` web page and others from facebook site. There is anoth…

csaezl updated 9 years ago
12
Norconex/crawlers #109

RegexReferenceFilter

I need the crawler to reject some URLs, those that include `/../`. I use the filter: ``` #set($filterRegexRef = "com.norconex.collector.core.filter.impl.RegexReferenceFilter") ... */\.\./* ```…

csaezl updated 9 years ago
10
Norconex/crawlers #117

How do I open mapdb files?

How do I open mapdb files?

bagheri471 updated 9 years ago
14
Norconex/importer #12

Concatenated first line with certain PDF's

This issue derailed the OOM discussion, so a new issue was created. I added some logging to the ReplaceTransformer and found out that certain PDFs have a concatenated string of the first line (7 tim…

OkkeKlein updated 9 years ago
4
