hound-search / hound

Lightning fast code searching made easy
MIT License

Issue with UTF16 files #78

Open mattdurham opened 9 years ago

mattdurham commented 9 years ago

We have a mixture of UTF8 and UTF16 files. Hound seems to search them correctly, in that it reports 16 files searched, but it only shows 4. When I manually re-save the files as UTF8 and check them in, they show up in the UI.

jklein commented 9 years ago

Interesting, thanks for the report. I haven't had a chance to reproduce yet, but it seems like this should be easy to confirm and potentially easy to fix.

kellegous commented 9 years ago

I think we can just sniff for a BOM in the first two bytes and re-encode the file to utf-8 on the fly. It will add some overhead to the indexing for repos with utf-16 but that's unavoidable anyway because Go's strings are utf-8 internally.
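
A minimal sketch of the BOM-sniffing idea described above, using only the Go standard library. This is not Hound's actual code: the function name `toUTF8` is hypothetical, and the sketch assumes the whole file is already in memory and that any file starting with FF FE or FE FF is UTF-16 (a UTF-32 little-endian BOM also starts with FF FE and is not handled here).

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// toUTF8 (hypothetical helper) re-encodes data as UTF-8 when it starts
// with a UTF-16 byte order mark. Data without such a BOM is returned
// unchanged, so plain UTF-8 files pay only the two-byte check.
func toUTF8(data []byte) []byte {
	if len(data) < 2 {
		return data
	}

	// Sniff the first two bytes for a UTF-16 BOM.
	var order binary.ByteOrder
	switch {
	case data[0] == 0xFF && data[1] == 0xFE:
		order = binary.LittleEndian
	case data[0] == 0xFE && data[1] == 0xFF:
		order = binary.BigEndian
	default:
		return data
	}

	// Read 16-bit code units after the BOM; a trailing odd byte is dropped.
	body := data[2:]
	units := make([]uint16, 0, len(body)/2)
	for i := 0; i+1 < len(body); i += 2 {
		units = append(units, order.Uint16(body[i:]))
	}

	// utf16.Decode handles surrogate pairs; converting the resulting
	// runes to a string produces UTF-8 bytes.
	return []byte(string(utf16.Decode(units)))
}

func main() {
	le := []byte{0xFF, 0xFE, 'h', 0x00, 'i', 0x00} // "hi" in UTF-16LE with BOM
	fmt.Printf("%q\n", toUTF8(le))                 // "hi"
}
```

Files without a BOM pass through untouched, so the extra cost falls only on UTF-16 files, which matches the overhead trade-off mentioned in the comment.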