Closed GoogleCodeExporter closed 9 years ago
Can you attach the source code of your controller class?
Original comment by ganjisaffar@gmail.com
on 19 Sep 2011 at 6:43
Thanks for looking into this. I've attached the controller and crawler code.
The controller is taken from your sample. FYI, I was playing with the settings
and decided to change the number of crawlers from 1 to 2. The error seems to
have gone away after that.
I'm not sure how the number-of-crawlers option works; I probably need to look at
your source code to understand it. What I observed was that crawler 1 only crawled
the seed page, which references 1.html and 2.html, while Crawler-2 crawled the
1000 links inside each of 1.html and 2.html. Is there any way to assign a separate
crawler thread to each of 1.html and 2.html to parse the links inside it?
Original comment by sham...@gmail.com
on 19 Sep 2011 at 7:20
Your code seems fine to me. The only way I can imagine this problem
happening is if the controller decides there are no more URLs and closes the
database while Crawler 2 is still asking for new URLs. The expected behaviour is
that all of the crawlers are terminated before the controller closes the database.
Anyway, if you can check out the source code and debug it, you might find the
exact problem.
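To illustrate the expected shutdown order, here is a minimal, self-contained sketch. It is not the crawler4j implementation: `UrlStore`, `ShutdownOrderSketch`, and `run` are hypothetical names, and the in-memory queue only stands in for the real URL database. The point is simply that the controller joins every crawler thread before it closes the store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownOrderSketch {

    // Hypothetical stand-in for the URL database shared by all crawlers.
    static class UrlStore {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        private final AtomicBoolean closed = new AtomicBoolean(false);

        void add(String url) {
            queue.add(url);
        }

        // Returns the next URL, or null when none are left.
        // Fails loudly if a crawler asks after the store was closed.
        String next() {
            if (closed.get()) {
                throw new IllegalStateException("store already closed");
            }
            return queue.poll();
        }

        void close() {
            closed.set(true);
        }
    }

    // Starts the crawlers, waits for all of them, then closes the store.
    // Returns how many URLs were processed in total.
    static int run(int numberOfCrawlers, int urlCount) {
        UrlStore store = new UrlStore();
        for (int i = 0; i < urlCount; i++) {
            store.add("http://example.com/" + i + ".html");
        }

        AtomicInteger processed = new AtomicInteger();
        List<Thread> crawlers = new ArrayList<>();
        for (int i = 0; i < numberOfCrawlers; i++) {
            Thread t = new Thread(() -> {
                // Each crawler pulls URLs until the shared store is empty.
                while (store.next() != null) {
                    processed.incrementAndGet();
                }
            }, "Crawler-" + (i + 1));
            crawlers.add(t);
            t.start();
        }

        try {
            // Wait for every crawler to finish BEFORE closing the store.
            for (Thread t : crawlers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }

        store.close();
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(2, 2000));
    }
}
```

If `store.close()` were moved above the `join()` loop, a crawler that is still running could call `next()` on a closed store, which is the kind of race described above.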
-Yasser
Original comment by ganjisaffar@gmail.com
on 22 Sep 2011 at 4:14
This issue should be resolved in version 3.0.
-Yasser
Original comment by ganjisaffar@gmail.com
on 2 Jan 2012 at 7:26
Original issue reported on code.google.com by
sham...@gmail.com
on 19 Sep 2011 at 5:51