Roshdy23 / Playmaker

Playmaker is a crawler-based search engine that demonstrates the main features of a search engine (web crawling, indexing, and ranking) along with a friendly user interface for interacting with it.

Crawler Stops Working Without Throwing an Exception #13

Closed: AbdallahSalah003 closed this issue 2 months ago

AbdallahSalah003 commented 2 months ago

When I set the maxDepth variable to 100 and update Seed.txt with the following list:

https://www.goal.com/en
https://www.sofascore.com/
https://www.marca.com/en/?intcmp=BOTONPORTADA&s_kw=portada
https://www.thesun.co.uk/sport/football/
https://www.realmadrid.com/en-US
https://www.manutd.com/
https://www.liverpoolfc.com/
https://www.bbc.com/sport/football
https://www.skysports.com/
https://www.bleacherreport.com/world-football
https://www.football365.com/
https://www.fourfourtwo.com/
https://www.theguardian.com/football
https://fcbayern.com/en
https://www.mancity.com/
https://www.fcbarcelona.com/en/
https://www.fifa.com
https://www.uefa.com
https://www.premierleague.com
https://www.laliga.com
https://www.bundesliga.com
https://www.transfermarkt.com
https://www.ligue1.com

The crawler stops at a random point each time without throwing an exception. It logs "Error in robots.txt:" followed by a message to the console, then stops working without logging anything else. I waited around 15 minutes and nothing was logged to the console, and no exception was thrown. I changed Seed.txt and it seems to work fine, so I think some websites make the crawler stall or stop doing its job, and it is related to their robots.txt files.
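My guess is that the robots.txt fetch for some of these hosts blocks forever because no timeout is set on the connection. The following is only a rough sketch of the idea, assuming the crawler is written in Java and fetches robots.txt over plain HttpURLConnection; the class name, method name, and timeout values are hypothetical, not the project's actual code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RobotsFetcher {

    // Hypothetical helper: fetch robots.txt with hard timeouts so that a
    // non-responsive host fails fast instead of silently blocking the crawl.
    public static String fetchRobots(String host) {
        try {
            URL url = new URL("https://" + host + "/robots.txt");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(5_000);  // give up connecting after 5 seconds
            conn.setReadTimeout(5_000);     // give up after 5 seconds without data
            conn.setRequestProperty("User-Agent", "PlaymakerCrawler");
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            // Log and fall back to a default policy (e.g. skip the site)
            // instead of hanging with no further output.
            System.err.println("Error in robots.txt for " + host + ": " + e.getMessage());
            return "";
        }
    }
}
```

With timeouts like these, a site with an unreachable or slow robots.txt would produce a logged error and the crawl would move on to the next URL instead of freezing.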

Also, when I increase the maxDepth variable to 100, the crawler stops working without throwing any exception.

Another thing to consider (I don't know if it's already handled) is that there must be stop conditions on crawling. When the crawler starts crawling Goal.com, it crawls without stopping and none of the other links in Seed.txt get crawled, so I had to stop it manually to prevent all crawled websites from coming from the same original link (Goal.com).

Our goal is to crawl around 6000 pages, so we need to solve this issue so that Seed.txt can be as large as possible. A sketch of one possible stop condition is shown below.
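For the stop condition, one option is a breadth-first frontier that enforces both a global page budget and a per-domain cap, so that a single seed like Goal.com cannot monopolize the crawl. This is just a sketch of the idea, not the project's actual code; the class name and the limits (6000 total, 300 per domain) are illustrative:

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class Frontier {
    private static final int MAX_PAGES = 6000;            // overall crawl budget
    private static final int MAX_PAGES_PER_DOMAIN = 300;  // fairness cap per site

    private final Queue<String> queue = new ArrayDeque<>();     // FIFO => breadth-first
    private final Set<String> seen = new HashSet<>();           // avoid re-queuing URLs
    private final Map<String, Integer> perDomain = new HashMap<>();
    private int crawled = 0;

    public boolean hasNext() {
        return crawled < MAX_PAGES && !queue.isEmpty();
    }

    public String next() {
        crawled++;
        return queue.poll();
    }

    public void add(String url) {
        String domain;
        try {
            domain = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return; // malformed URL, skip it
        }
        if (domain == null || seen.contains(url)) return;
        int count = perDomain.getOrDefault(domain, 0);
        if (count >= MAX_PAGES_PER_DOMAIN) return; // don't let one site dominate
        perDomain.put(domain, count + 1);
        seen.add(url);
        queue.add(url);
    }
}
```

Processing the frontier in FIFO order (instead of following Goal.com links depth-first) would also ensure every seed URL gets visited before the crawler goes deep into any single site.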

I may have misunderstood something, so if anyone has a comment, please let me know.

Thank you.

Roshdy23 commented 2 months ago

Thanks for your cooperation. Issue fixed.