Open GoogleCodeExporter opened 9 years ago
This could have a lot of causes. You will need to look through the logs to
see which pages are actually being found.
Steven
Original comment by sjdir...@gmail.com
on 29 Sep 2014 at 1:55
Hello,
I've solved the issue. It seems that if you use the configuration object
instead of the configuration file, you need to set all the parameters in code.
I took the example from your quick start guide.
I needed to add these lines to get the crawler rolling:
// max_threads and agent are variables defined elsewhere in the calling code.
CrawlConfiguration crawlConfig = new CrawlConfiguration();
crawlConfig.CrawlTimeoutSeconds = 0;
crawlConfig.MaxConcurrentThreads = max_threads;
crawlConfig.MaxPagesToCrawl = 0;
crawlConfig.MaxPagesToCrawlPerDomain = 0;
crawlConfig.MaxCrawlDepth = 1000;
crawlConfig.UserAgentString = agent;
crawlConfig.MaxPageSizeInBytes = 0;
crawlConfig.DownloadableContentTypes = "text/html, text/plain";
crawlConfig.IsUriRecrawlingEnabled = false;
crawlConfig.IsExternalPageCrawlingEnabled = false;
crawlConfig.IsExternalPageLinksCrawlingEnabled = false;
crawlConfig.HttpServicePointConnectionLimit = 200;
crawlConfig.HttpRequestTimeoutInSeconds = 15;
crawlConfig.HttpRequestMaxAutoRedirects = 7;
crawlConfig.IsHttpRequestAutoRedirectsEnabled = true;
crawlConfig.IsHttpRequestAutomaticDecompressionEnabled = false;
crawlConfig.MinAvailableMemoryRequiredInMb = 0;
crawlConfig.MaxMemoryUsageInMb = 256;
crawlConfig.MaxMemoryUsageCacheTimeInSeconds = 0;
crawlConfig.IsForcedLinkParsingEnabled = true;
Congratulations on your crawler, it's an awesome piece of code. Thank you very
much.
Original comment by DLopezGo...@gmail.com
on 30 Sep 2014 at 9:23
Original issue reported on code.google.com by
DLopezGo...@gmail.com
on 29 Sep 2014 at 10:58