rjbs / App-Uni

command-line utility to find or display Unicode characters

<control-000A> (awk RS) not handled #4

Closed Veraellyunjie closed 7 months ago

Veraellyunjie commented 1 year ago
uni -s "$(awk 'BEGIN { ORS = ""; print RS }')"

prints nothing

uni "$(awk 'BEGIN { ORS = ""; print RS }')"

prints scads of irrelevant characters

raku -e "qx[awk 'BEGIN { ORS = \"\"; print RS }'].uninames>>.print"

prints <control-000A>

rjbs commented 1 year ago

You're not doing a fair comparison. You're using qx in the Raku example, which skips the shell's interpretation. $(...) munges things. For example:

$(awk 'BEGIN { print RS }') | xxd

No output. It's really not clear what you think the problem or solution or desired outcome is.
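
For reference, a minimal sketch of the munging in question: POSIX command substitution removes every trailing newline from the captured output, so a lone line feed never survives `$(...)`. The variable names are illustrative only.

```shell
# Command substitution removes every trailing newline from the captured output,
# so a lone LF (awk's default RS) never survives $(...).
lf=$(printf '\n')
two=$(printf 'a\n\n')
printf 'lf length:  %s\n' "${#lf}"    # 0: the newline is gone
printf 'two length: %s\n' "${#two}"   # 1: only "a" is left
```

This happens before any program sees the value, regardless of which program receives the result.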

Veraellyunjie commented 1 year ago

Sorry for not being clear enough.
The Raku example shows that awk's output contains a <control-000A> character, which I couldn't manage to "catch" with uni.
Since uni is a tool for displaying characters and it doesn't display this one, I thought it might need attention or clarification.

My intended question is whether uni can "catch" and display special characters like <control-000A>. Like when you feed awk 'BEGIN { ORS = ""; print RS }' into uni and get <control-000A> from it.
If yes, how? And maybe add the explanation to the manual... If no, can it be implemented?

As for uni "$(awk 'BEGIN { ORS = ""; print RS }')", which prints and prints and prints: that doesn't look DWIM to me. Is it the desired behavior?

rjbs commented 7 months ago

I did not dig into this too far, but it seems this is a shell issue.

snowdrop:~$ perl -MData::Dumper -E 'say Dumper(\@ARGV)' $(echo -e '\x65')
$VAR1 = [
          'e'
        ];

snowdrop:~$ perl -MData::Dumper -E 'say Dumper(\@ARGV)' $(echo -e '\x0a')
$VAR1 = [];

You're not asking the shell to deal with the arg string in your Raku example, but you are in Perl.
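
The two Data::Dumper results above can be reproduced in plain shell. This is a sketch, and `count_args` is a hypothetical helper that only reports how many arguments it received.

```shell
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args $(printf '\n'))   # empty expansion, removed entirely when unquoted
quoted=$(count_args "$(printf '\n')")   # quoted: one argument, but it is the empty string
letter=$(count_args $(printf 'e'))      # ordinary text survives as one argument
echo "unquoted=$unquoted quoted=$quoted letter=$letter"
```

In the quoted form used in the original report, the program receives one empty-string argument rather than a line feed. That may be why `uni -s` printed nothing and the plain name search matched everything, though this was not verified against uni itself.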

If you remove the shell, it all works.

snowdrop:~$ perl -MIPC::Run3 -E 'my @output; run3([ "uni", chr(0x0a) ], undef, \@output); use Data::Dumper; print Dumper(\@output)'
$VAR1 = [
          '  - U+0000A - LINE FEED
'
        ];
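
A line feed can also be passed from the shell itself, as long as it is not the trailing character of a command substitution. The sketch below uses a sentinel character for that; the `uni` call is guarded and assumes only what the run3 example above shows, namely that uni accepts the character as a single argument.

```shell
# Append a sentinel so the newline is no longer trailing, then strip the sentinel.
nl=$(printf '\nx')
nl=${nl%x}
printf 'length: %s\n' "${#nl}"      # 1: the LF survived
printf '%s' "$nl" | od -An -tx1     # hex dump of the single byte

# Only runs if uni is installed; passes the LF as one quoted argument.
if command -v uni >/dev/null 2>&1; then
  uni "$nl"
fi
```

In bash, zsh, or ksh, `uni $'\n'` should be an equivalent shorter spelling.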

Veraellyunjie commented 7 months ago

I agree, this is not an issue with App::Uni per se; the shell interferes here. But since App::Uni is an app to be used via the shell, it is something to be aware of. If it can't be fixed or worked around (you can't expect users to run it like this: perl -MIPC::Run3 -E 'my @output; run3([ "uni", chr(0x0a) ], undef, \@output); use Data::Dumper; print Dumper(\@output)'), perhaps just mention it? Some projects have a CAVEATS section in their manpage, besides BUGS, for such things...

rjbs commented 7 months ago

There is no reason for uni to add this note that doesn't also apply to basically every other program you could run in the shell. I don't expect users to run it with IPC::Run3 as shown, but neither do I expect them to run it with the strange awk invocation. It's just misuse of the shell, with the program itself doing nothing special or unusual.

You never stated your use case, but I wonder whether xxd might not solve it. Or maybe what you want is a means to pipe input for the -c mode. But there is no bug nor program-specific problem here.
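
A sketch of the pipe-based inspection suggested above: a pipe delivers awk's output byte for byte, with no command substitution in between to strip the newline. `od` is used here as a POSIX stand-in for `xxd`, on the assumption that `xxd` may not be installed everywhere; piping into `xxd` works the same way where it is available.

```shell
# Pipe awk's record separator straight into a hex dumper; nothing is stripped
# on the way. tr only removes od's padding so the value is easy to compare.
hex=$(awk 'BEGIN { ORS = ""; print RS }' | od -An -tx1 | tr -d ' \n')
echo "RS bytes: $hex"
```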