s5bug / utf8string

UTF-8 and Codepoint Strings for Scala

Handling malformed input #1

Open s5bug opened 2 years ago

s5bug commented 2 years ago

What should happen when invalid UTF-8 is received? Should there be an option to throw an exception, or should all invalid sequences fail silently with s? Should it be an option?

https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt implies that recovering is a better option than crashing.

s5bug commented 2 years ago

I should look at Java's built-ins for this. CharsetDecoder lets you choose whether malformed input throws, is ignored, or is replaced (via CodingErrorAction), and what the replacement string is. But I'll most likely want more than Java allows, namely reporting where each error is and why it happened instead of just failing on the first one.
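One possible way to get "every error, with its position" out of the JDK decoder is to drive its lower-level `decode(in, out, endOfInput)` loop by hand. The following is a rough sketch under that assumption, not a proposed API for this library: `DecodeError` and `LenientDecode` are hypothetical names, and the sketch records only where each malformed sequence is and how long it is, not why it is malformed.

```scala
import java.nio.{ByteBuffer, CharBuffer}
import java.nio.charset.{CodingErrorAction, StandardCharsets}
import scala.collection.mutable.ListBuffer

// Hypothetical error record; not part of utf8string or the JDK.
final case class DecodeError(offset: Int, length: Int)

object LenientDecode {
  // Decodes UTF-8, substituting U+FFFD for each malformed sequence and
  // returning the byte offset and length of every sequence it replaced.
  def decode(bytes: Array[Byte]): (String, List[DecodeError]) = {
    val decoder = StandardCharsets.UTF_8
      .newDecoder()
      .onMalformedInput(CodingErrorAction.REPORT)
      .onUnmappableCharacter(CodingErrorAction.REPORT)
    val in = ByteBuffer.wrap(bytes)
    // UTF-8 never yields more chars than input bytes, and each error
    // consumes at least one byte, so this capacity avoids overflow.
    val out = CharBuffer.allocate(bytes.length + 1)
    val errors = ListBuffer.empty[DecodeError]
    var done = false
    while (!done) {
      val result = decoder.decode(in, out, true)
      if (result.isError) {
        // With REPORT, `in` is left positioned at the first bad byte.
        errors += DecodeError(in.position(), result.length())
        in.position(in.position() + result.length()) // skip the bad bytes
        out.put('\uFFFD')
      } else {
        // Underflow with endOfInput = true: all input has been consumed.
        decoder.flush(out)
        done = true
      }
    }
    out.flip()
    (out.toString, errors.toList)
  }
}
```

For example, `LenientDecode.decode(Array(0x61, 0x80, 0x62).map(_.toByte))` should return the recovered string together with a single `DecodeError(1, 1)`. Capturing the reason for each error would need a custom decoder, since the JDK's `CoderResult` only distinguishes malformed from unmappable input.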