laurentprudhon / nlptextdoc

Suite of tools to extract and annotate language resources for NLP applications

Fatal exception while crawling: An item with the same key has already been added #1

Closed by laurentprudhon 5 years ago

laurentprudhon commented 5 years ago

http://www.creditmutuel.fr/

Time    | Pages | Errors | Download | Disk   | Parsing | Convert |
0:35:33 | 933   | 54     | 851,6 Mb | 7,3 Mb | 0:07:56 | 0:07:51 |

fatal: Error occurred during processing of page [http://www.creditmutuel.fr/groupe/fr/banques/professionnels/infos-pratiques/index.html]
fatal: System.ArgumentException: An item with the same key has already been added. Key: https://www.creditmutuel.fr/fr/professionnels.html
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at nlptextdoc.extract.html.WebsiteTextExtractor.WebCrawler_ShouldCrawlPageLinks(CrawledPage crawledPage, CrawlContext crawlContext) in C:\Users\laure\OneDrive\Dev\C#\nlptextdoc\nlptextdoc.extract\html\WebsiteTextExtractor.cs:line 157
   at Abot.Crawler.WebCrawler.ShouldCrawlPageLinks(CrawledPage crawledPage) in C:\Users\laure\OneDrive\Dev\C#\nlptextdoc\nlptextdoc.extract.dependencies\Abot\Crawler\WebCrawler.cs:line 796
   at Abot.Crawler.WebCrawler.ProcessPage(PageToCrawl pageToCrawl) in C:\Users\laure\OneDrive\Dev\C#\nlptextdoc\nlptextdoc.extract.dependencies\Abot\Crawler\WebCrawler.cs:line 689

Crawl of http://www.creditmutuel.fr/ completed with error: An item with the same key has already been added. Key: https://www.creditmutuel.fr/fr/professionnels.html
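
For reference, the exception in the trace is the one `Dictionary<TKey, TValue>.Add` throws when a key is inserted twice. The snippet below is only a minimal sketch of that failure mode and two common ways to avoid it. It is not the code from `WebsiteTextExtractor.cs`, and the dictionary name `seenLinks` and its string values are hypothetical, chosen just for illustration.

```csharp
using System;
using System.Collections.Generic;

class DuplicateKeyDemo
{
    static void Main()
    {
        // Hypothetical map keyed by page URL (not the project's actual data structure).
        var seenLinks = new Dictionary<string, string>();
        string key = "https://www.creditmutuel.fr/fr/professionnels.html";

        seenLinks.Add(key, "first-page");

        try
        {
            // A second Add with the same key throws System.ArgumentException.
            seenLinks.Add(key, "second-page");
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Add failed: " + ex.Message);
        }

        // Option 1: check first and keep the value that is already stored.
        if (!seenLinks.ContainsKey(key))
        {
            seenLinks.Add(key, "second-page");
        }

        // Option 2: the indexer overwrites the existing value instead of throwing.
        seenLinks[key] = "second-page";

        // Only one entry exists for the key, so Count is 1.
        Console.WriteLine(seenLinks.Count);
    }
}
```

If the callback that fills the dictionary can run on several crawler threads at once, the check-then-add pattern in option 1 is not atomic; in that case `ConcurrentDictionary<TKey, TValue>.TryAdd` would be a safer choice.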

laurentprudhon commented 5 years ago

Fixed