Closed: xfq closed this issue 3 years ago
I can't find evidence of an intention one way or the other. The bullet goes back to OCF 2.0.1 where it also says the same thing with the only difference being a reference to TR21.
It says in charmod that simple folding isn't appropriate for the web, so is that effectively a recommendation we should specify full case folding?
A likely source of the need for case folding is that some file systems (FAT32 classically) are not case-sensitive. In these file systems, names that are distinguished solely by case can overwrite one another, causing problems.
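To illustrate the overwrite problem, here is a minimal sketch that simulates unpacking a container onto a case-insensitive file system. The function name `unpack_case_insensitive` and the sample file names are hypothetical, and `str.casefold()` is only a stand-in for whatever matching rule a real file system applies (FAT32's own rules differ).

```python
def unpack_case_insensitive(entries):
    """Simulate writing (name, content) pairs to a case-insensitive store.

    Assumption: the store matches names by their case-folded form.
    Real file systems use their own matching tables.
    """
    stored = {}
    for name, content in entries:
        key = name.casefold()
        # A later entry whose name differs only by case lands on the
        # same key and silently replaces the earlier content.
        stored[key] = content
    return stored


entries = [("Chapter1.xhtml", "first"), ("chapter1.xhtml", "second")]
stored = unpack_case_insensitive(entries)
```

Two entries go in, but only one survives, which is the failure a uniqueness requirement is meant to prevent.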
You might want to specify Unicode canonical case-fold matching from charmod-norm rather than specifying case folding and normalization separately. I note that you currently have only a "should" for normalization but a "must" for case folding. Is there a reason why you don't require uniqueness across normalization? That would mean choosing a normalization form (we recommend NFC). Note too that this is a question of uniqueness checking, not a requirement that the normalized form actually be stored.
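As a rough sketch of what such a uniqueness check could look like, the following uses the Unicode definition of a canonical caseless match, NFD(toCasefold(NFD(X))), built from Python's `unicodedata.normalize` and `str.casefold` (which performs full case folding). The helper names `match_key` and `find_collisions` are hypothetical, and this is not the wording or algorithm of the EPUB specification. The key is used only for comparison; the names themselves are never rewritten.

```python
import unicodedata
from collections import defaultdict


def match_key(name: str) -> str:
    """Comparison key for a canonical caseless match: NFD(casefold(NFD(name)))."""
    decomposed = unicodedata.normalize("NFD", name)
    return unicodedata.normalize("NFD", decomposed.casefold())


def find_collisions(names):
    """Return groups of names that are not unique under match_key."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


collisions = find_collisions(["Straße.xhtml", "STRASSE.xhtml", "cover.png"])
```

Here `collisions` contains one group holding the first two names, because full case folding maps "ß" to "ss". Precomposed and decomposed spellings of the same accented letter also land on the same key.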
> so is that effectively a recommendation we should specify full case folding?
Yes. Full case folds lose less information than simple case folds, at the cost of potentially altering the length of the string in code points. Among other things, this probably means that the case fold (and normalization, if applied) needs to happen before the length limit is checked, although I note that the case-folded and normalized forms are not required to be stored. The names just need to be unique across the operations.
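A small sketch of the length effect, using Python's `str.casefold` (full case folding). The helper `within_limit` and its `limit` parameter, counted here in code points, are hypothetical and only illustrate checking a limit against the folded form; they are not taken from the specification.

```python
def folded_length(name: str) -> int:
    """Length in code points after full case folding."""
    return len(name.casefold())


def within_limit(name: str, limit: int) -> bool:
    """Check a hypothetical code point limit against the folded form.

    The folded string is used only for the check and is not stored.
    """
    return folded_length(name) <= limit


original = "Maße"                       # 4 code points
folded = original.casefold()            # "ß" folds to "ss"
fits_before_fold = len(original) <= 4   # the unfolded name fits a limit of 4
fits_after_fold = within_limit(original, 4)
```

The name fits a four code point limit as written but not once folded, which is why the order of folding and length checking matters.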
> Is there a reason why you don't require uniqueness across normalization
I don't have an answer to that, unfortunately, but maybe someone else in the group can chime in (that particular change goes back over a decade and I can't find a discussion about it).
The issue was discussed in a meeting on 2021-04-23
https://w3c.github.io/epub-specs/epub33/core/#sec-container-filenames
This sentence does not seem to be very clear. Would you please clarify what algorithm of "case normalization" is used here? Is it Unicode full or simple casefolding?
Please refer to the "Case Mapping and Case Folding" and "Additional Considerations for Case Folding" sections of charmod-norm.