laurentprudhon / nlptextdoc

Suite of tools to extract and annotate language resources for NLP applications

Unspecified exception in System.IO.FileStream.ValidateFileHandle #12

Closed: laurentprudhon closed this issue 5 years ago

laurentprudhon commented 5 years ago

An exception is sometimes thrown with the following stack trace:

   at System.IO.FileStream.ValidateFileHandle(SafeFileHandle fileHandle)
   at System.IO.FileStream.CreateFileOpenHandle(FileMode mode, FileShare share, FileOptions options)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at System.IO.StreamWriter..ctor(String path, Boolean append, Encoding encoding, Int32 bufferSize)
   at nlptextdoc.text.document.NLPTextDocumentWriter.WriteToFile(NLPTextDocument doc, String path)
   at nlptextdoc.extract.html.WebsiteTextExtractor.WebCrawler_PageCrawlCompletedAsync(Object sender, PageCrawlCompletedArgs e)

Example websites that reproduce this bug:

https://www.montepaschi-banque.fr/fr/ - after 216 pages
https://www.arkea.com/ - after 586 pages, 858 pages

laurentprudhon commented 5 years ago

Failed to reproduce. If the problem shows up again in further tests, this issue will be replaced by more useful ones with a detailed stack trace, thanks to the new exceptions file.
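For illustration only, here is a minimal sketch of how a failing file write could be recorded with enough detail to diagnose it later. It is not the project's actual code: the `SafeFileWriter` class, the `TryWrite` method and the `exceptionsLogPath` parameter are hypothetical names, and the real exceptions file may use a different layout. The sketch assumes that the target path, the `HResult` and the full exception text are the most useful details to capture when the `StreamWriter` constructor fails.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of nlptextdoc: wraps a file write and appends
// full exception details to a log file when opening or writing the file fails.
public static class SafeFileWriter
{
    // Serializes log writes, since crawl callbacks may run concurrently.
    private static readonly object LogLock = new object();

    public static bool TryWrite(string path, string content, string exceptionsLogPath)
    {
        try
        {
            // Same constructor family as in the stack trace (path, append, encoding).
            using (var writer = new StreamWriter(path, false, Encoding.UTF8))
            {
                writer.Write(content);
            }
            return true;
        }
        catch (Exception ex) when (ex is IOException
                                   || ex is UnauthorizedAccessException
                                   || ex is ArgumentException
                                   || ex is NotSupportedException)
        {
            // Record the path and HResult next to the full exception text,
            // because the exception message alone may not identify the file.
            string entry =
                $"{DateTime.UtcNow:o}\tpath={path ?? "<null>"}\tHResult=0x{ex.HResult:X8}"
                + Environment.NewLine + ex + Environment.NewLine;

            lock (LogLock)
            {
                File.AppendAllText(exceptionsLogPath, entry, Encoding.UTF8);
            }
            return false;
        }
    }
}
```

With a helper of this kind, a page that cannot be written is skipped and logged instead of interrupting the crawl, and the logged path shows which output file triggered the failure.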