Open frankskywalker opened 9 years ago
Hasn't Amazon banned you? I worked on an Amazon crawling project (not with crawler4j) 2-3 weeks ago at my last workplace. You should query the API with 1-second delays; otherwise it will ban you. Querying amazon.com directly is even worse. We used 50+ proxies and 30-second delays before we were finally able to crawl everything. It took a long time, but without the delay the whole proxy list got banned after 30-40 minutes, and we had to wait 2 days to be unbanned.
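For illustration, here is a minimal sketch of the kind of fixed delay between requests described above, using only the JDK. The class name `RequestThrottle` and its method are hypothetical and not part of crawler4j or any Amazon client. As far as I know, crawler4j has a similar knob in `CrawlConfig.setPolitenessDelay`, so treat this as a standalone example rather than what that project used.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: enforces a minimum gap between consecutive requests. */
final class RequestThrottle {
    private final long minGapNanos;
    private long nextAllowedNanos = System.nanoTime();

    RequestThrottle(long minGapMillis) {
        this.minGapNanos = TimeUnit.MILLISECONDS.toNanos(minGapMillis);
    }

    /** Blocks until the next request slot is free, then reserves it. */
    synchronized void acquire() {
        long now = System.nanoTime();
        long wait = nextAllowedNanos - now;
        if (wait > 0) {
            try {
                // Sleeping while holding the lock is deliberate: callers are serialized.
                TimeUnit.NANOSECONDS.sleep(wait);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while throttling", e);
            }
            now = System.nanoTime();
        }
        nextAllowedNanos = now + minGapNanos;
    }
}
```

Every thread would call `acquire()` on a shared instance right before sending a request, for example `new RequestThrottle(1000)` for the 1-second API delay. With proxies, one throttle per proxy would give each of them its own delay.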
https://affiliate-program.amazon.com/gp/advertising/api/detail/faq.html
Hi yasserg,
crawler4j helps me get data from the Amazon API. There are a great many URLs to query, so first I add 2*numberOfCrawlers seeds and call Controller.startNonBlocking(); then I add another URL each time MyCrawler.visit() is called. But every time, crawler4j stops at a random task, maybe the 180th, maybe the 1000th. There are still other seeds left, but it seems the threads have all died.
So, what happened? Can you help me?
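The cause is not established in this thread, so the following is only a sketch of the general pattern being attempted (each processed URL enqueues follow-up URLs), written with plain JDK classes and not with crawler4j's internals. The class `FollowUpQueue` and its `drain` method are hypothetical. The point it illustrates is that workers should decide when to exit from a count of queued plus in-progress items, not from a momentarily empty queue, and that a failure in one task should be logged instead of silently killing the worker thread.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch: a work queue whose tasks may enqueue follow-up tasks. */
final class FollowUpQueue {
    /** Processes all seeds and every follow-up URL the handler returns; returns the success count. */
    static int drain(List<String> seeds, int workers, Function<String, List<String>> handler) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(seeds);
        AtomicInteger pending = new AtomicInteger(seeds.size()); // queued + in progress
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (pending.get() > 0) {
                        String url = queue.poll(100, TimeUnit.MILLISECONDS);
                        if (url == null) continue; // empty right now; re-check pending
                        try {
                            List<String> next = handler.apply(url);
                            pending.addAndGet(next.size()); // count follow-ups before queueing them
                            queue.addAll(next);
                            processed.incrementAndGet();
                        } catch (RuntimeException e) {
                            // Log and keep going so one bad URL does not end this worker.
                            System.err.println("failed: " + url + " (" + e + ")");
                        } finally {
                            pending.decrementAndGet(); // this URL is finished either way
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES); // arbitrary cap for the sketch
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }
}
```

In a real crawler the handler would fetch the URL and return the next URLs to query. Wrapping the body of `visit()` in a similar try/catch and logging the exception may also help show whether the threads are actually dying or just finishing early.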