whitfin / runiq

An efficient way to filter duplicate lines from input, à la uniq.
MIT License

Special characters crash the program #1

Closed kasem123 closed 6 years ago

kasem123 commented 6 years ago

thread 'main' panicked at 'called Result::unwrap() on an Err value: Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }', libcore\result.rs:945:5
note: Run with RUST_BACKTRACE=1 for a backtrace.

Caused by a "ü" character on the next line of the input.

whitfin commented 6 years ago

@kasem123 yeah, that's understandable because Rust operates in UTF-8 in the standard library.

I can add UTF-16 support, if that would help you? Not sure whether to add a UTF-16 flag, or simply try to parse as UTF-16 if the UTF-8 parse fails (a lazy retry). What do you think?

Edit: I actually can't reproduce this; which system are you using? Perhaps you could share a log file with a sample that causes the crash?
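
For readers following along, the lazy-retry idea floated above could look roughly like the sketch below. It is only an illustration of the approach, not code from runiq: the function name `decode_lazy` is hypothetical, and little-endian UTF-16 is assumed because the reporter turns out to be on Windows.

```rust
/// Decodes one line of input: UTF-8 first, UTF-16LE as a fallback.
/// Returns `None` when neither interpretation is valid.
/// (Hypothetical helper; not part of runiq.)
fn decode_lazy(bytes: &[u8]) -> Option<String> {
    // First attempt: the bytes are already valid UTF-8.
    if let Ok(text) = std::str::from_utf8(bytes) {
        return Some(text.to_owned());
    }

    // UTF-16 code units are two bytes each, so an odd length cannot match.
    if bytes.len() % 2 != 0 {
        return None;
    }

    // Lazy retry: reinterpret the bytes as little-endian UTF-16 code units.
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();

    String::from_utf16(&units).ok()
}
```

One caveat with the retry approach: ASCII-only UTF-16 text is also valid UTF-8 (with embedded NUL bytes), so it would never reach the fallback. That is one argument for an explicit flag over the retry.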

kasem123 commented 6 years ago

Sorry, been away. I'm on Windows.

E:\filter>runiq debug.txt
abcdefg
thread 'main' panicked at 'called Result::unwrap() on an Err value: Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }', libcore\result.rs:945:5
note: Run with RUST_BACKTRACE=1 for a backtrace.

debug.txt contents:

abcdefg
abcdefgü
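
The thread never states the file's encoding. Assuming it was saved in a single-byte Windows code page, where "ü" is the byte 0xFC (an assumption), the panic can be reproduced without the attachment. In UTF-8 the same character is the two bytes 0xC3 0xBC, and 0xFC never appears in valid UTF-8. The sketch below uses a hypothetical helper, `first_line_error`, to show that `BufRead::lines()` reports `InvalidData` for such input; calling `unwrap()` on that result would produce the panic shown above.

```rust
use std::io::{BufRead, Cursor, ErrorKind};

/// Reads `bytes` line by line with `BufRead::lines()` and returns the kind
/// of the first error encountered, or `None` if every line decodes cleanly.
/// (Hypothetical helper for illustration only.)
fn first_line_error(bytes: &[u8]) -> Option<ErrorKind> {
    for line in Cursor::new(bytes).lines() {
        // `lines()` yields `io::Result<String>`; invalid UTF-8 is an `Err`.
        if let Err(err) = line {
            return Some(err.kind());
        }
    }
    None
}
```

Passing `b"abcdefg\nabcdefg\xFC\n"` yields `Some(ErrorKind::InvalidData)`, while the same text encoded as UTF-8 yields `None`. That would also explain why copying and pasting the text did not reproduce the crash.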

whitfin commented 6 years ago

Can you attach a file? If I copy/paste, I guess the encoding gets changed and it doesn't reproduce it.

kasem123 commented 6 years ago

debug.txt

whitfin commented 6 years ago

@kasem123 perfect, I can reproduce now. I'll play around with it and see what I can do.

whitfin commented 6 years ago

Hi @kasem123! Given the changes put in place based on #2, this should now also be fixed. I can no longer reproduce the issue and am going to close this (although note you need to check using v1.1.0). If you validate, can you please let me know either way? I can reopen as needed!
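
The thread does not say what the changes from #2 were. As a general illustration only, and not runiq's actual implementation, one way a line filter can avoid this class of panic is to compare raw bytes instead of decoded strings, so the input encoding never matters. The function name `unique_lines` is hypothetical.

```rust
use std::collections::HashSet;
use std::io::{self, BufRead, Write};

/// Copies each distinct line from `input` to `output`, comparing raw bytes
/// so that input which is not valid UTF-8 is passed through unchanged.
/// (Hypothetical sketch; not taken from runiq.)
fn unique_lines<R: BufRead, W: Write>(mut input: R, mut output: W) -> io::Result<()> {
    let mut seen: HashSet<Vec<u8>> = HashSet::new();
    let mut line: Vec<u8> = Vec::new();

    loop {
        line.clear();
        // `read_until` works on bytes and performs no UTF-8 validation.
        if input.read_until(b'\n', &mut line)? == 0 {
            break; // end of input
        }

        // Strip the line terminator ("\n" or "\r\n") before comparing.
        if line.last() == Some(&b'\n') {
            line.pop();
        }
        if line.last() == Some(&b'\r') {
            line.pop();
        }

        // `insert` returns true only the first time a value is seen.
        if seen.insert(line.clone()) {
            output.write_all(&line)?;
            output.write_all(b"\n")?;
        }
    }
    Ok(())
}
```

Storing every distinct line in a `HashSet` is the simplest choice here; it trades memory for simplicity and is not meant to reflect how runiq tracks duplicates.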