kirelagin closed this issue 1 year ago
The simplest possible change would be to s/an adjacent file/an adjacent UTF-8-encoded text file/; however, it might be worth making this even more explicit and splitting it into a separate sentence or something.
A problem I see with making this a requirement is places where UTF is not an option.
For example, already if you look at the Linux kernel source code, it is encoded in ASCII. I suspect other software that is intended to be embedded in hardware will have a similar limitation.
I do think we should at least encourage UTF. How about adding "That file SHOULD be UTF-8-encoded." instead? So a (strong) suggestion, but not a requirement.
> For example, already if you look at the Linux kernel source code, it is encoded in ASCII
UTF-8 is an ASCII-compatible encoding (a superset of ASCII where every byte value that is allowed in ASCII means the same thing in UTF-8), so every ASCII text file is automatically a valid UTF-8 text file.
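This is easy to check. Here is a minimal Python sketch (the sample text is made up for illustration) showing that pure-ASCII bytes decode to the same text under both codecs and round-trip unchanged through UTF-8:

```python
# Hypothetical sample: any pure-ASCII content behaves the same way.
ascii_bytes = "Copyright 2019 Example Author\n".encode("ascii")

# The same bytes decode to identical text under ASCII and UTF-8.
as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")
same_text = as_ascii == as_utf8

# Re-encoding that text as UTF-8 yields the original bytes again.
round_trip = as_utf8.encode("utf-8") == ascii_bytes

print(same_text, round_trip)
```

So a tool that reads such a file as UTF-8 should see exactly what an ASCII-only tool sees.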
> UTF-8 is an ASCII-compatible encoding (a superset of ASCII where every byte value that is allowed in ASCII means the same thing in UTF-8), so every ASCII text file is automatically a valid UTF-8 text file.
True, and that is a great feature of Unicode encodings :)
> True, and that is a great feature of Unicode encodings :)
Not Unicode encodings in general, just UTF-8. UTF-16, UCS-2 and UCS-4 are not ASCII supersets.
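To illustrate the difference, a small Python sketch (the sample text is arbitrary, and the little-endian variants are chosen only to avoid a byte-order mark) compares the bytes each encoding produces for the same ASCII text. Python has no separate UCS-2 codec; for characters like these, UCS-2 uses the same 16-bit units as UTF-16.

```python
text = "ASCII"  # hypothetical sample text

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")  # 2 bytes per character here
utf32_bytes = text.encode("utf-32-le")  # 4 bytes per character (UCS-4)

# Only UTF-8 reproduces the ASCII byte sequence.
utf8_matches = utf8_bytes == ascii_bytes
utf16_matches = utf16_bytes == ascii_bytes
utf32_matches = utf32_bytes == ascii_bytes

print(utf8_matches, utf16_matches, utf32_matches)
print(len(ascii_bytes), len(utf16_bytes), len(utf32_bytes))
```

The wider encodings pad every ASCII character with zero bytes, so an ASCII-only reader would not interpret those files correctly.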
Currently the spec only says:
In other words, it only specifies the name of the file, but does not clearly specify the contents.
Therefore, I propose to: