apache-nutch Search Results

353 results
for apache-nutch


nasa-jpl-memex/memex-explorer #717

dump images and dump common crawl have been disabled

Intentionally? Does this need to be restored? Does the REST API support either of these?

ahmadia updated 8 years ago
17
nasa-jpl-memex/memex-explorer #732

nutch fetched url counts are not being updated

Needs a discussion with Brittain. I don't think this is hard, but it is pretty useful. It would be good to chat with the Nutch folks and ask them what other kinds of things are available and make sens…

ahmadia updated 8 years ago
4
ljhsecret/crawler4j #136

JVM crash when running crawler on Centos 6.2

What steps will reproduce the problem? Running the crawler sometimes crashes the JVM. I crawl around 10 web sites regularly with pages between 1K and 50K. This happens randomly but happens very …

GoogleCodeExporter updated 8 years ago
14
anexplore/crawler-commons #50

Add Fetch Report to FetchedResult

We have loads of fine-grained methods available to us via FetchedResult. I think it would be really cool, however, if we were able to print a report of the FetchedResult including some timing statis…

GoogleCodeExporter updated 9 years ago
8
ztx1491/crawler-commons #50

Add Fetch Report to FetchedResult

We have loads of fine-grained methods available to us via FetchedResult. I think it would be really cool, however, if we were able to print a report of the FetchedResult including some timing statis…

GoogleCodeExporter updated 9 years ago
8
seantanwh/crawler4j #136

JVM crash when running crawler on Centos 6.2

What steps will reproduce the problem? Running the crawler sometimes crashes the JVM. I crawl around 10 web sites regularly with pages between 1K and 50K. This happens randomly but happens very …

GoogleCodeExporter updated 8 years ago
14
chrismattmann/nutch-python #4

Unable to initialize the Nutch object

I used the following command to initialize the Nutch object. ``` nt = Nutch('crawlTest', urlDir='urls/', serverEndpoint='http://localhost:8081') ``` But it gave me the following error ``` nutch.py:…

antrikss updated 8 years ago
15
ContinuumIO/nutchpy #8

Bad error without pom.xml

Bug report from Shadi Saleh propatrio@gmail.com When installing, the following errors occur ``` [INFO] Scanning for projects... [INFO] [INFO] ----------------------------------------------------…

aterrel updated 9 years ago
2
Intel-bigdata/HiBench #100

Error: org.apache.hadoop.ipc.RemoteException

Hi, I am trying to run HiBench for Nutch indexing. When I try to generate the data for 2 million pages, I get the following error after Map 100% and reduce 100%. If anyone has faced a similar issue, ple…

sreelakshmiRajula updated 9 years ago
8
ContinuumIO/nutchpy #18

Unable to use LinkReader

When I tried to use link_reader instead of sequence_reader with the following command, ``` import os import nutchpy path = os.path.dirname(nutchpy.__file__) path = os.path.join(path,"ex_data", "crawld…

antrromet updated 8 years ago
1
