Norconex / collector-filesystem

Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data from sources ranging from local hard drives to network locations, and for sending it to data repositories such as search engines.
http://www.norconex.com/collectors/collector-filesystem/

REJECTED_NOTFOUND for files with umlauts like ä #65

Closed: schudoku closed this issue 1 year ago

schudoku commented 2 years ago

Files with umlauts in their names are not fetched:

INFO [CrawlerEventManager] Crawler: REJECTED_NOTFOUND: file:///C:/dev/test-with-ä.txt

original file name: test-with-ä.txt
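
The thread does not state the root cause, so the following is only a general illustration of how a non-ASCII character such as ä behaves in a Java file URI, not the collector's actual code. It uses the standard `java.net.URI` class; the class name `UmlautUriDemo` and its helper methods are hypothetical. The sketch shows that the same file can be referenced in a raw form and in a percent-encoded form, and code that mixes the two may fail to find the file.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of Norconex Filesystem Collector.
public class UmlautUriDemo {

    // Builds a file URI from an absolute path and returns its ASCII-only form,
    // in which non-ASCII characters are percent-encoded as UTF-8 bytes.
    public static String toAsciiFileUri(String absolutePath) throws URISyntaxException {
        URI uri = new URI("file", null, absolutePath, null);
        return uri.toASCIIString();
    }

    // Returns the decoded path component of a URI string.
    public static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) throws URISyntaxException {
        // \u00e4 is the letter ä, written as an escape so the source file
        // encoding does not matter.
        String path = "/C:/dev/test-with-\u00e4.txt";

        String ascii = toAsciiFileUri(path);
        System.out.println(ascii);                           // file:/C:/dev/test-with-%C3%A4.txt
        System.out.println(decodedPath(ascii).equals(path)); // true
    }
}
```

Decoding with `getPath()` before touching the file system is one way to get back the original file name, assuming the URI was encoded as UTF-8.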

Czaniolo commented 1 year ago

Hi, Pascal

I have the same problem with special characters in the Portuguese language.

essiembre commented 1 year ago

I was able to reproduce it (on Windows). Marking as a bug.

essiembre commented 1 year ago

A new 2.9.2-SNAPSHOT release was made with a fix. Please give it a try and confirm.

Czaniolo commented 1 year ago

Hi, Pascal

It worked perfectly for me. It solved the Portuguese accentuation problem.

Thanks for the prompt reply.

Best Regards!