mohankreddy / crawler4j

Automatically exported from code.google.com/p/crawler4j

1 thread is working, the rest are just waiting at getNextURLs #52

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. run crawler with multiple threads:
controller.start(MyCrawler.class, 10);

What is the expected output? What do you see instead?
I expect to see multiple threads working; instead I see only 1 thread working, while the rest are
waiting at getNextURLs.

What version of the product are you using? On what operating system?
Latest version, on Windows Vista.

Please provide any additional information below.

Original issue reported on code.google.com by haggai.s...@gmail.com on 8 Jun 2011 at 2:42

GoogleCodeExporter commented 9 years ago
If you have only a few seeds, then initially the first thread is assigned the URLs of 
those seeds, and the other threads wait for new URLs to be added to the URL 
queue. So this is the expected behavior.
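The behavior described above can be illustrated with a small self-contained sketch (this is not crawler4j's actual implementation, just an assumed model of its frontier): several worker threads block on a shared queue that starts with a single seed, so only the thread that takes the seed has work until it discovers new URLs for the others.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class FrontierDemo {
    public static void main(String[] args) throws Exception {
        // Shared URL frontier, seeded with a single URL.
        BlockingQueue<String> frontier = new LinkedBlockingQueue<>();
        frontier.add("http://example.com/seed");

        AtomicInteger processed = new AtomicInteger();
        int numThreads = 3;
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);

        for (int i = 0; i < numThreads; i++) {
            pool.submit(() -> {
                while (true) {
                    // Analogous to getNextURLs: blocks until the frontier
                    // has work; gives up once it stays empty.
                    String url = frontier.poll(200, TimeUnit.MILLISECONDS);
                    if (url == null) return null; // frontier drained
                    // "Crawling" the seed discovers two more URLs, which
                    // finally gives the idle threads something to do.
                    if (url.endsWith("/seed")) {
                        frontier.add(url + "/a");
                        frontier.add(url + "/b");
                    }
                    processed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("processed=" + processed.get());
    }
}
```

With one seed, only one thread is busy at startup; the pool only fans out after that thread enqueues the links it found, which matches the single-working-thread symptom reported above.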

Original comment by ganjisaffar@gmail.com on 8 Jun 2011 at 10:53

GoogleCodeExporter commented 9 years ago
What exactly does "a few seeds" mean? Also, if the crawl is started with max 
depth 0, how will it utilize the created threads? Is there a workaround 
for making it use all the created threads for the seed URLs?

Original comment by mvi...@owler.com on 6 Jun 2013 at 12:25