seantanwh / crawler4j

Automatically exported from code.google.com/p/crawler4j

java.lang.NullPointerException #71

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
java.lang.NullPointerException
    at edu.uci.ics.crawler4j.frontier.DocIDServer.getDocID(DocIDServer.java:70)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:141)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:108)
    at java.lang.Thread.run(Unknown Source)
Exception in thread "Crawler 2" java.lang.NullPointerException
    at edu.uci.ics.crawler4j.example.simple.MyCrawler.shouldVisit(MyCrawler.java:41)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:150)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:108)
    at java.lang.Thread.run(Unknown Source)
java.lang.NullPointerException
    at edu.uci.ics.crawler4j.frontier.DocIDServer.getDocID(DocIDServer.java:70)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:141)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:108)
    at java.lang.Thread.run(Unknown Source)
Exception in thread "Crawler 1" java.lang.NullPointerException
    at edu.uci.ics.crawler4j.example.simple.MyCrawler.shouldVisit(MyCrawler.java:41)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:150)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:108)
    at java.lang.Thread.run(Unknown Source)
What is the expected output? What do you see instead?

What version of the product are you using? On what operating system?
svn

Please provide any additional information below.

Original issue reported on code.google.com by contactm...@gmail.com on 15 Aug 2011 at 3:25

GoogleCodeExporter commented 8 years ago
I've noticed this same problem. I am using OpenJDK Debian 6.0 :-)

Original comment by jost...@gmail.com on 16 Sep 2011 at 4:02

GoogleCodeExporter commented 8 years ago
I have the same problem.

Original comment by dkuang1...@gmail.com on 31 Oct 2011 at 2:56

GoogleCodeExporter commented 8 years ago
In my case this problem is caused by pages that are Temporarily or 
Permanently Moved and redirect to a relative URL.
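
To illustrate the redirect case: a minimal sketch, assuming the crawler has the URL it requested and the raw Location header from a 301/302 response in hand. The class and method names (RedirectResolver, resolve) are hypothetical and not part of crawler4j; only java.net.URI from the JDK is used. Resolving the header against the requested URL yields an absolute URL, so a relative Location never reaches the later lookup as is.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not crawler4j code.
final class RedirectResolver {

    /**
     * Resolves a Location header value against the URL that was requested.
     * Returns null when either value is missing or is not a valid URI,
     * so the caller can skip the page instead of dereferencing null later.
     */
    static String resolve(String requestedUrl, String location) {
        if (requestedUrl == null || location == null) {
            return null;
        }
        try {
            URI base = new URI(requestedUrl);
            // URI.resolve leaves absolute targets unchanged and completes
            // relative ones with the scheme, host and path of the base.
            return base.resolve(new URI(location.trim())).toString();
        } catch (URISyntaxException e) {
            return null;
        }
    }
}
```

With this sketch, resolve("http://example.com/a/b.html", "/moved/c.html") gives "http://example.com/moved/c.html", while an already absolute Location is returned unchanged.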

Original comment by vundic...@gmail.com on 18 Nov 2011 at 3:38

GoogleCodeExporter commented 8 years ago
This issue is resolved in version 3.0

-Yasser

Original comment by ganjisaffar@gmail.com on 2 Jan 2012 at 7:30

GoogleCodeExporter commented 8 years ago
Hi Yasser,

We're still getting this issue with version 3.3, which was released on Feb 17th.

Here's a snapshot of the call stack:

java.lang.NullPointerException: charsetName
    at java.lang.String.<init>(String.java:441)
    at java.lang.String.<init>(String.java:515)
    at edu.uci.ics.crawler4j.parser.Parser.parse(Parser.java:66)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:276)
    at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:189)
    at java.lang.Thread.run(Thread.java:680)
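
The message "charsetName" is what the JDK's String(byte[], String) constructor throws when it is handed a null charset name, so the trace suggests the page content was decoded without a known charset. What Parser.java:66 actually does is an assumption here. A minimal sketch of a defensive decode, using a hypothetical helper (CharsetFallback.decode is not crawler4j code) that falls back to UTF-8:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not crawler4j code.
final class CharsetFallback {

    /**
     * Decodes page bytes with the given charset name, falling back to UTF-8
     * when the name is null, malformed, or not supported by this JVM.
     */
    static String decode(byte[] content, String charsetName) {
        Charset charset = StandardCharsets.UTF_8;
        if (charsetName != null) {
            try {
                charset = Charset.forName(charsetName.trim());
            } catch (IllegalArgumentException e) {
                // IllegalCharsetNameException and UnsupportedCharsetException
                // both extend IllegalArgumentException: keep the UTF-8 default.
            }
        }
        return new String(content, charset);
    }
}
```

UTF-8 is only a guess for the default; a page served without a charset could just as well be in another encoding.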

Any help you can provide is much appreciated! :)

Original comment by jaredot...@gmail.com on 26 Feb 2012 at 8:06