uutils / coreutils

Cross-platform Rust rewrite of the GNU coreutils
https://uutils.github.io/
MIT License

tr cannot match metaclasses #2920

Closed by kevinburke 2 years ago

kevinburke commented 2 years ago

I ran the following command:

uname -s | tr "[:upper:]" "[:lower:]"

I expected tr to replace "Darwin" with "darwin" (or "Linux" with "linux"), per the manpage for tr:

     [:class:]  Represents all characters belonging to the defined character class.  Class names are:

                alnum        <alphanumeric characters>
                alpha        <alphabetic characters>
                blank        <whitespace characters>
                cntrl        <control characters>
                digit        <numeric characters>
                graph        <graphic characters>
                ideogram     <ideographic characters>
                lower        <lower-case alphabetic characters>
                phonogram    <phonographic characters>
                print        <printable characters>
                punct        <punctuation characters>
                rune         <valid characters>
                space        <space characters>
                special      <special characters>
                upper        <upper-case characters>
                xdigit       <hexadecimal characters>

                When “[:lower:]” appears in string1 and “[:upper:]” appears in the same relative position in string2, it represents the characters pairs from the toupper mapping in the LC_CTYPE category of the current
                locale.  When “[:upper:]” appears in string1 and “[:lower:]” appears in the same relative position in string2, it represents the characters pairs from the tolower mapping in the LC_CTYPE category of the
                current locale.

                With the exception of case conversion, characters in the classes are in unspecified order.

                For specific information as to which ASCII characters are included in these classes, see ctype(3) and related manual pages.

However, no substitution appeared.
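For reference, the expected behaviour can be checked against whichever tr is first on PATH. This is only a minimal sketch, assuming a POSIX shell; the helper name lowercase is hypothetical and not part of any tool.

```shell
# Hypothetical helper: lower-case stdin using the POSIX character classes
# quoted from the manpage above.
lowercase() {
    tr '[:upper:]' '[:lower:]'
}

# With a conforming tr, both lines should come back in lower case.
printf 'Linux\n' | lowercase
printf 'Darwin\n' | lowercase
```

If either line comes back unchanged, the tr being run does not handle the class pair.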

I also tried looking for the manpage for coreutils tr, but I didn't see anything.

tertsdiepraam commented 2 years ago

This was indeed an issue, but it should be solved on main. I just ran this:

❯ echo "Linux" | cargo run --quiet -- "[:upper:]" "[:lower:]"
linux

Does that fix your issue? Note that not all the classes you listed are implemented: we don't have "ideogram", "phonogram", or "special". I don't think GNU coreutils has them either.

> I also tried looking for the manpage for coreutils tr, but I didn't see anything.

We're not generating manpages yet, but there is some work in progress there. The best we have right now is the --help flag, which admittedly does not provide a lot of info for most utils.

kevinburke commented 2 years ago

Oh, bizarre, you're right.

I'm wondering now if make clean was not cleaning artifacts properly and I was continually reinstalling an older version of tr.
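One way to test that suspicion is to look at which binary the shell actually resolves and when it was last written. This is a rough sketch, assuming a POSIX shell and that tr is not shadowed by an alias or shell function; which_binary is a hypothetical helper name.

```shell
# Hypothetical helper: print the path the shell would run for a command
# name, followed by its long listing (which includes the modification time).
which_binary() {
    path=$(command -v "$1") || return 1
    printf '%s\n' "$path"
    ls -l "$path"
}

which_binary tr
```

A modification time older than the latest build would suggest a stale install.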

Here's the install script I am using - I added rm -rf target/release explicitly after I noticed make clean was not removing it. https://gist.github.com/kevinburke/34f7f309968eebf637366be78a1a2d62

tertsdiepraam commented 2 years ago

Interesting, maybe it has something to do with the profile? make clean seems to remove $(BASEDIR)/target/${PROFILE}, which is coreutils/target/debug if you don't specify the profile. So if you built with --release, it might not have removed the release artifacts?

Edit: looking at the script, you're using make install, which indeed implies the release profile. I think this could be considered a bug in the make configuration: make clean should probably remove everything in target/.
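The difference between the two behaviours can be sketched in shell. This only illustrates the description above and is not the project's actual Makefile; clean_profile and clean_all are hypothetical helpers, and the debug default is taken from the explanation in this comment.

```shell
# Removes only one profile's output directory; the profile defaults to
# "debug", so a release build under target/release is left in place.
clean_profile() {
    target_dir=${1:?target directory required}
    profile=${2:-debug}
    rm -rf "$target_dir/$profile"
}

# Removes the whole target directory, whatever profile was built.
clean_all() {
    target_dir=${1:?target directory required}
    rm -rf "$target_dir"
}
```

Called as clean_profile target after a release build, the first helper leaves target/release untouched, which matches the stale-binary symptom; clean_all target removes it.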

kevinburke commented 2 years ago

Yeah, I guess I'd expect it to do the reverse of whatever make install does.