dandavison / delta

A syntax-highlighting pager for git, diff, grep, and blame output
https://dandavison.github.io/delta/
MIT License

🐛 Treats UTF-16 text files as binary files #1614

Open nozwock opened 5 months ago

nozwock commented 5 months ago

Is this intended? Converting UTF-16 files to UTF-8 just to do a diff would be a little cumbersome...

dandavison commented 5 months ago

Hi @nozwock, I believe the codebase is well set up to attempt decoding of non-UTF-8 encodings (see https://github.com/dandavison/delta/blob/e208f4ed52759fc018ee14808717c977df57b56f/src/delta.rs#L186), but this doesn't seem to get asked for often, and no one has actually added a fallback that tries other encodings.

As you can see, it attempts to decode as UTF-8 and, if that fails, uses Rust's `from_utf8_lossy` function. I can imagine the results aren't terribly helpful for UTF-16.
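To illustrate why the lossy fallback is unhelpful for UTF-16, here is a minimal standalone sketch of the "try UTF-8, then fall back to `String::from_utf8_lossy`" pattern described above. It is not delta's actual code; the function name `decode_utf8_or_lossy` and the sample bytes are assumptions made for the example.

```rust
use std::borrow::Cow;

/// Hypothetical helper: decode bytes as UTF-8, and if that fails,
/// fall back to lossy decoding (invalid sequences become U+FFFD).
fn decode_utf8_or_lossy(bytes: &[u8]) -> Cow<'_, str> {
    match std::str::from_utf8(bytes) {
        Ok(s) => Cow::Borrowed(s),
        Err(_) => String::from_utf8_lossy(bytes),
    }
}

fn main() {
    // "hi" encoded as UTF-16LE, preceded by a byte-order mark (FF FE).
    let utf16le: [u8; 6] = [0xFF, 0xFE, b'h', 0x00, b'i', 0x00];

    // The BOM bytes are not valid UTF-8, so each becomes U+FFFD, and
    // every ASCII character is followed by a stray NUL byte.
    let decoded = decode_utf8_or_lossy(&utf16le);
    println!("{:?}", decoded);
}
```

The decoded string contains two replacement characters followed by `h`, NUL, `i`, NUL. Those embedded NUL bytes are the kind of content that typically makes tools classify a file as binary.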

nozwock commented 2 months ago

What do you think about incorporating an encoding option into the CLI, and then decoding based on that?
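As a rough sketch of what decoding based on a user-supplied encoding could look like, the following uses only the Rust standard library. Everything here is hypothetical: delta has no such option as of this thread, and the `Encoding` enum, the accepted names, and the `decode` function are assumptions for illustration only.

```rust
use std::str::FromStr;

/// Hypothetical set of encodings an option might accept.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Encoding {
    Utf8,
    Utf16Le,
    Utf16Be,
}

impl FromStr for Encoding {
    type Err = String;

    /// Parse the (assumed) option value, ignoring ASCII case.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "utf-8" | "utf8" => Ok(Encoding::Utf8),
            "utf-16le" | "utf16le" => Ok(Encoding::Utf16Le),
            "utf-16be" | "utf16be" => Ok(Encoding::Utf16Be),
            other => Err(format!("unsupported encoding: {}", other)),
        }
    }
}

/// Decode raw bytes according to the chosen encoding, lossily.
fn decode(bytes: &[u8], encoding: Encoding) -> String {
    match encoding {
        Encoding::Utf8 => String::from_utf8_lossy(bytes).into_owned(),
        Encoding::Utf16Le => decode_utf16(bytes, u16::from_le_bytes),
        Encoding::Utf16Be => decode_utf16(bytes, u16::from_be_bytes),
    }
}

/// Combine byte pairs into UTF-16 code units and decode them.
/// A trailing odd byte, if any, is ignored by `chunks_exact`.
fn decode_utf16(bytes: &[u8], to_u16: fn([u8; 2]) -> u16) -> String {
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| to_u16([pair[0], pair[1]]))
        .collect();
    let text = String::from_utf16_lossy(&units);
    // Drop a leading byte-order mark if present.
    text.strip_prefix('\u{FEFF}').unwrap_or(&text).to_string()
}

fn main() {
    let encoding: Encoding = "utf-16le".parse().expect("known encoding");
    let bytes = [0xFF, 0xFE, b'h', 0x00, b'i', 0x00];
    println!("{}", decode(&bytes, encoding)); // prints "hi"
}
```

Supporting encodings beyond UTF-8 and UTF-16 would likely need a dedicated crate rather than the standard library, and an explicit option avoids guessing the encoding from the bytes alone.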