[Test]
public async Task Crawl_Synchronous_CancellationTokenCancelled_StopsCrawl()
{
    var cancellationTokenSource = new CancellationTokenSource();

    // Cancel after 800 ms, which is before the crawler has started processing pages.
    var timer = new System.Timers.Timer(800);
    timer.Elapsed += (o, e) =>
    {
        cancellationTokenSource.Cancel();
        timer.Stop();
        timer.Dispose();
    };
    timer.Start();

    var crawler = new PoliteWebCrawler();
    var result = await crawler.CrawlAsync(new Uri("https://github.com/"), cancellationTokenSource);

    Assert.IsTrue(result.ErrorOccurred);
    Assert.IsTrue(result.ErrorException is OperationCanceledException);
}
But if we increase the timer delay (from 800 ms to 3 s) so that the crawler has actually started working before the cancellation fires:
[Test]
public async Task Crawl_Synchronous_CancellationTokenCancelled_StopsCrawl()
{
    var cancellationTokenSource = new CancellationTokenSource();

    // Cancel after 3 s, by which time the crawler is already processing pages.
    var timer = new System.Timers.Timer(3000);
    timer.Elapsed += (o, e) =>
    {
        cancellationTokenSource.Cancel();
        timer.Stop();
        timer.Dispose();
    };
    timer.Start();

    var crawler = new PoliteWebCrawler();
    var result = await crawler.CrawlAsync(new Uri("https://github.com/"), cancellationTokenSource);

    Assert.IsTrue(result.ErrorOccurred);
    Assert.IsTrue(result.ErrorException is OperationCanceledException);
}
The test fails, and the exception is unhandled, so it would crash an application:
Exit code is -532462766 (Output is too long. Showing the last 100 lines:
at System.Threading.CancellationToken.ThrowIfCancellationRequested()
at Abot2.Crawler.WebCrawler.ThrowIfCancellationRequested()
at Abot2.Crawler.WebCrawler.ProcessPage(PageToCrawl pageToCrawl)
at Abot2.Crawler.WebCrawler.<CrawlSite>b__64_0()
at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
at System.Threading.QueueUserWorkItemCallback.Execute()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
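A `Task.ThrowAsync` frame sitting directly on top of thread-pool frames is what typically shows up when an exception escapes an `async void` method, because there is no `Task` to store it in. Assuming that is what happens inside `CrawlSite` (this is a guess from the trace, not verified against the Abot source), the standalone sketch below shows the difference between an async lambda typed as `Func<Task>` and one typed as `Action`. `AsyncVoidDemo` and `ProcessPageAsync` are hypothetical stand-ins, not Abot code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncVoidDemo
{
    // Hypothetical stand-in for per-page work that checks the token.
    static async Task ProcessPageAsync(CancellationToken token)
    {
        await Task.Delay(10);
        token.ThrowIfCancellationRequested();
    }

    public static async Task Main()
    {
        var cts = new CancellationTokenSource();
        cts.Cancel();

        // Func<Task>: the exception is captured by the returned Task,
        // so the caller can await it and catch the cancellation.
        Func<Task> work = async () => await ProcessPageAsync(cts.Token);
        try
        {
            await work();
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("Cancellation observed by the caller");
        }

        // Action: the same lambda compiles to async void. With no
        // SynchronizationContext, the exception is rethrown on a thread-pool
        // thread and terminates the process. Left commented out on purpose.
        // Action unsafeWork = async () => await ProcessPageAsync(cts.Token);
        // unsafeWork();
    }
}
```

If the crawler does schedule its page work this way, a `try`/`catch` around `CrawlAsync` in user code cannot help, which would match the crash described here.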
I believe this is the same issue as https://github.com/sjdirect/abot/issues/206, which was closed on the grounds that "integration unittest is passing".
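The timer in the tests above can be replaced by `CancellationTokenSource.CancelAfter`, a standard .NET API, which makes the reproduction shorter. A minimal sketch, assuming NUnit and the same `CrawlAsync(Uri, CancellationTokenSource)` overload used above; the fixture and test names are made up for illustration:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Abot2.Crawler;
using NUnit.Framework;

[TestFixture]
public class CancellationReproTests
{
    [Test]
    public async Task Crawl_CancelledAfterThreeSeconds_StopsCrawl()
    {
        var cancellationTokenSource = new CancellationTokenSource();

        // Schedule cancellation 3 s from now; no timer or event handler needed.
        cancellationTokenSource.CancelAfter(TimeSpan.FromSeconds(3));

        var crawler = new PoliteWebCrawler();
        var result = await crawler.CrawlAsync(new Uri("https://github.com/"), cancellationTokenSource);

        Assert.IsTrue(result.ErrorOccurred);
        Assert.IsTrue(result.ErrorException is OperationCanceledException);
    }
}
```

Changing the delay to 800 ms should correspond to the passing case, since the only difference from the tests above is how the cancellation is scheduled.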
Note that the first test above, with the 800 ms timer, passes; only the 3 s variant fails with the unhandled exception.
Issue: there is no way to cancel the crawler.