seguid / seguid-tests

Unit tests for any SEGUID implementations
https://www.seguid.org

Run tests in different locales #9

Closed HenrikBengtsson closed 6 months ago

HenrikBengtsson commented 6 months ago

Issue

Collation order is locale-dependent: not every language orders the alphabet the same way, so the output of a sort depends on the LC_COLLATE setting.

Examples

$ (export LC_COLLATE="C"; printf "%s\n" {A..Z} {a..z} {0..9} | sort | tr $'\n' ' ')
0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z

$ (export LC_COLLATE="en_US.utf8"; printf "%s\n" {A..Z} {a..z} {0..9} | sort | tr $'\n' ' ')
0 1 2 3 4 5 6 7 8 9 a A b B c C d D e E f F g G h H i I j J k K l L m M n N o O p P q Q r R s S t T u U v V w W x X y Y z Z

$ (export LC_COLLATE="et_EE.utf8"; printf "%s\n" {A..Z} {a..z} {0..9} | sort | tr $'\n' ' ')
0 1 2 3 4 5 6 7 8 9 A a B b C c D d E e F f G g H h I i J j K k L l M m N n O o P p Q q R r S s Z z T t U u V v W w X x Y y

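Notice how Z sorts before T in the last case (et_EE.utf8), and how the placement of upper- and lower-case letters differs between all three locales.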
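
For comparison, a minimal sketch of how a sort can be made independent of the caller's locale, which is one way an implementation could avoid the differences shown above. The helper name `sort_c` is hypothetical and not part of this repository. It relies on the POSIX rule that a non-empty LC_ALL takes precedence over LC_COLLATE.

```shell
# Hypothetical helper: sort stdin in byte (C locale) order,
# whatever collation the caller has requested.
# LC_ALL overrides LC_COLLATE and LANG for this one command only.
sort_c() {
  LC_ALL=C sort
}

# Request Estonian collation in a subshell, as in the examples above,
# then sort through the helper. The shell may print a warning on stderr
# if et_EE.utf8 is not installed; the result is unaffected.
result=$(export LC_COLLATE="et_EE.utf8"; printf '%s\n' T Z t z 0 | sort_c | tr '\n' ' ')
echo "$result"
```

With the collation pinned this way, T sorts before Z again and all upper-case letters precede the lower-case ones, matching the LC_COLLATE="C" output above.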

Action

See also

In https://github.com/seguid/seguid-dont-use/issues/37, it was shown that R respects the LC_COLLATE setting when sorting, whereas Python orders as in the C locale regardless of that setting.