-
```
child URL missing in Printf
```
Original issue reported on code.google.com by `ort...@gmail.com` on 20 Jul 2014 at 4:16
Attachments:
- [webcrawler.go.patch](https://storage.googleapis.com/google…
-
```
It doesn't reduce the depth when calling crawl recursively, so the depth
limit is currently ineffective.
```
Original issue reported on code.google.com by `joelai85` on 26 Oct 2012 at 6:18
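The general shape of the fix (a sketch only, with hypothetical names and a toy link graph, not the project's actual code): decrement the depth on each recursive call so the limit actually takes effect.

```java
import java.util.*;

public class CrawlDepth {
    // Toy link graph standing in for fetched pages (assumption, for illustration).
    static final Map<String, List<String>> links = Map.of(
        "a", List.of("b"),
        "b", List.of("c"),
        "c", List.of("d"),
        "d", List.of());

    public static final Set<String> visited = new LinkedHashSet<>();

    public static void crawl(String url, int depth) {
        if (depth <= 0 || !visited.add(url)) return;
        for (String next : links.getOrDefault(url, List.of())) {
            // The fix: pass depth - 1 here; the reported bug passed depth unchanged,
            // so the recursion never approached the limit.
            crawl(next, depth - 1);
        }
    }

    public static void main(String[] args) {
        crawl("a", 2);
        System.out.println(visited); // prints [a, b]
    }
}
```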
-
```
While crawling the seed http://eventiesagre.it/ I obtain the internal error
reported below.
I guess the issue is that the crawler finds a URL without a trailing / .
Processing page: [http://eventies…
```
-
```
What steps will reproduce the problem?
1. Just start a crawl for any site...
2.
3.
What is the expected output? What do you see instead?
process the page as per the WebCrawler's process method
What ve…
```
-
```
We should add better hooks in the WebCrawler so that we can better handle
the various errors that occur while crawling a given URL.
```
Original issue reported on code.google.com by `avrah...@gmail.com` …
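One possible shape for such hooks (a sketch; the method names and signatures here are hypothetical illustrations, not crawler4j's actual API): protected no-op callbacks on the crawler base class that subclasses override only where they need custom error handling.

```java
// Sketch of error hooks a crawler base class could expose (hypothetical
// names, not crawler4j's real API).
public class WebCrawlerSketch {
    protected void onFetchError(String url, int statusCode) {
        // default: do nothing; subclasses may log, retry, or blacklist the URL
    }

    protected void onParseError(String url, Exception cause) {
        // default: do nothing
    }

    // The fetch loop reports failures through the hooks instead of
    // swallowing them internally.
    public final void visit(String url) {
        int status = fetch(url); // stub fetch below
        if (status >= 400) {
            onFetchError(url, status);
            return;
        }
        // ... parse page, follow links ...
    }

    private int fetch(String url) {
        return url.isEmpty() ? 404 : 200; // stand-in for a real HTTP fetch
    }

    public static void main(String[] args) {
        WebCrawlerSketch c = new WebCrawlerSketch() {
            @Override protected void onFetchError(String url, int status) {
                System.out.println("fetch failed: " + url + " (" + status + ")");
            }
        };
        c.visit("");
    }
}
```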
-
```
What steps will reproduce the problem?
1. set max size to anything reasonable; I'm using 1 MB
2. start crawling from
http://www.ics.uci.edu/~yil8/public_data/PyLOH/?C=S%3BO%3DA
3. watch console
Wh…
```
-
```
What steps will reproduce the problem?
java.lang.NullPointerException
at edu.uci.ics.crawler4j.frontier.DocIDServer.getDocID(DocIDServer.java:70)
at edu.uci.ics.crawler4j.crawler.WebCrawle…
```
-
```
What steps will reproduce the problem?
1. Take the simple crawler example; remove all calls to controller.addSeed()
and replace with this one
controller.addSeed("http://dairymix.com/");
2. This …
```
-
```
PageFetcher.Fetch(Page page) is currently being used by all crawler threads as
a utility class, and it has become a bottleneck. Instead, why don't you put an
instance of PageFetcher as an instance v…
```
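The change being requested, sketched with simplified, hypothetical signatures (not crawler4j's real classes): give each crawler its own `PageFetcher` field, so threads no longer funnel every fetch through one shared utility method.

```java
// Sketch: per-crawler fetcher instance instead of a shared static utility
// (simplified, hypothetical signatures for illustration).
class PageFetcher {
    // Each instance can hold its own connection state, so threads do not
    // contend on a single shared method.
    public String fetch(String url) {
        return "<html>stub for " + url + "</html>"; // stand-in for a real HTTP fetch
    }
}

class Crawler implements Runnable {
    private final PageFetcher fetcher = new PageFetcher(); // instance field, not static
    private final String seed;

    Crawler(String seed) { this.seed = seed; }

    @Override public void run() {
        System.out.println(fetcher.fetch(seed).length());
    }
}

public class PerThreadFetcher {
    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(new Crawler("http://example.com/a"));
        Thread t2 = new Thread(new Crawler("http://example.com/b"));
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}
```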
-
```
What steps will reproduce the problem?
1. Run the crawler for a domain that has a robots.txt file with an 'allow:'
instruction (for example http://www.explido-webmarketing.de/)
What is the expected output?…
```
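For background on why an `Allow:` line matters, here is a simplified sketch of the longest-match rule from the robots.txt convention (illustration only, not crawler4j's implementation): among all rules whose path prefix matches the URL path, the longest match wins, and an Allow rule beats a Disallow rule of equal length.

```java
import java.util.*;

// Simplified robots.txt check (illustration only).
public class RobotsCheck {
    record Rule(boolean allow, String prefix) {}

    public static boolean isAllowed(List<Rule> rules, String path) {
        Rule best = null;
        for (Rule r : rules) {
            if (path.startsWith(r.prefix)) {
                // Longest matching prefix wins; Allow wins ties.
                if (best == null
                        || r.prefix.length() > best.prefix.length()
                        || (r.prefix.length() == best.prefix.length() && r.allow)) {
                    best = r;
                }
            }
        }
        return best == null || best.allow; // no matching rule => allowed
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
            new Rule(false, "/private/"),
            new Rule(true,  "/private/public/"));
        System.out.println(isAllowed(rules, "/private/public/page.html")); // true
        System.out.println(isAllowed(rules, "/private/secret.html"));      // false
    }
}
```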