Open mhsr21 opened 1 month ago
Some entries have the same source and destination text, e.g.
@mtmail Some of the duplicates were present before my contribution--should I remove them altogether?
@mhsr21 Would be great if you can remove the other duplicates, too. I see 42, and 41 of those are in the variants-en.yaml file.
cat settings/icu-rules/variants-* | perl -ne '/^\s+-\s+(.+?)\s+->\s+(.+)/ && $1 eq $2 && print' | wc -l
42
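For anyone who prefers to see which entries are affected rather than just the count, here is a minimal Python sketch of the same check. It reuses the regular expression from the Perl one-liner above. The function name `identity_rules` and the sample entries are made up for illustration and are not taken from the actual variants files.

```python
import re

# Same pattern as the Perl one-liner: an indented list item "- source -> destination".
RULE = re.compile(r"^\s+-\s+(.+?)\s+->\s+(.+)")


def identity_rules(lines):
    """Return (line_number, text) pairs where source and destination are identical."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = RULE.match(line.rstrip("\n"))
        if match and match.group(1) == match.group(2):
            hits.append((number, match.group(1)))
    return hits


# Hypothetical sample entries, not copied from variants-en.yaml.
sample = [
    "    - Avenue -> Ave\n",
    "    - Alley -> Alley\n",
    "    - Street -> St\n",
]
print(identity_rules(sample))  # [(2, 'Alley')]
```

To run it against a real file, something like `identity_rules(open(path, encoding="utf-8"))` should work, with `path` pointing at one of the variants files.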
@mtmail Got rid of all duplicates in variants-en.yaml, plus the other one in variants-fr.yaml
Thanks for going through this. I agree that we should have official abbreviations in this list.
On a more general note, the US abbreviation list has always had the problem that it is far too long. In particular, it sometimes proposes 3 or 4 variants for the same word, which has a negative effect on the size of the index. Would it make sense to restrict ourselves to the official abbreviations only, or are the other ones just as frequently used?
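To put a number on the "3 or 4 variants" concern, a rough Python sketch like the following could list the words with many variants. It assumes that a rule may carry several comma-separated destinations and that the same source word may appear on more than one line. Both are assumptions about the file layout, and the function names and sample entries are hypothetical.

```python
import re
from collections import defaultdict

# An indented list item "- source -> destination[,destination...]" (assumed layout).
RULE = re.compile(r"^\s+-\s+(.+?)\s+->\s+(.+)")


def variants_per_word(lines):
    """Map each source word to the set of variants proposed for it."""
    variants = defaultdict(set)
    for line in lines:
        match = RULE.match(line.rstrip("\n"))
        if not match:
            continue
        source, targets = match.groups()
        for target in targets.split(","):
            target = target.strip()
            if target:
                variants[source].add(target)
    return dict(variants)


def words_with_many_variants(lines, threshold=3):
    """Return the source words that have at least `threshold` distinct variants."""
    counts = variants_per_word(lines)
    return sorted(word for word, found in counts.items() if len(found) >= threshold)


# Hypothetical sample entries for illustration only.
sample = [
    "    - Boulevard -> Blvd,Boul,Boulv\n",
    "    - Avenue -> Ave\n",
    "    - Avenue -> Av\n",
]
print(words_with_many_variants(sample))  # ['Boulevard']
```

The threshold of 3 is arbitrary; it only mirrors the figure mentioned above.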
I only added the official abbreviations (the rightmost column on the linked website). Edit: I also fixed the typo.
Added USPS's Standard Suffix Abbreviation for postal addressing (https://pe.usps.com/text/pub28/28apc_002.htm)