Open hlein opened 1 month ago
On Sat, Mar 30, 2024 at 06:22:11PM -0700, hlein wrote:
If perl -n processes an argument of a filename containing a trailing space, the space will be eaten before the file is opened.
This is documented behaviour.
-n and -p are documented (in perlrun) to do
while (<>) { ... }
'while (<>)' is documented (in perlop, "I/O Operators") to do
while ($ARGV = shift) {
    open(ARGV, $ARGV);
    while (<ARGV>) {
        ... # code for each line
    }
}
and 2-arg open is documented (in perlfunc, "Whitespace and special characters in the filename argument") to strip leading and trailing whitespace.
It's not ideal behaviour, but it's been documented that way for 30+ years.
I wonder whether we should add a command-line switch to make <> act like <<>> ?
On Sat, Mar 30, 2024 at 06:22:11PM -0700, hlein wrote:
If perl -n processes an argument of a filename containing a trailing space, the space will be eaten before the file is opened.
This is documented behaviour.
while ($ARGV = shift) { open(ARGV, $ARGV);
and 2-arg open is documented (in perlfunc, "Whitespace and special characters in the filename argument") to strip leading and trailing whitespace.
Aha! Yes, you are right. I purged 2-argument open() from my own muscle memory years ago, and was not thinking about that being the method -n uses, or its implications for implied whitespace stripping.
I wonder whether we should add a command-line switch to make <> act like <<>> ?
That, or any kind of pragma that could be called in BEGIN to alter the type of open performed by -n? (I'd be afraid of unintended consequences elsewhere in scripts, unless you meant only for -n's processing.) Maybe it's possible to hook the 2-arg open performed by -n and, iff the file is not found, its name ends in a space, and such a file exists (the subsequent newfstatat finds it, after all), have it retry with a 3-argument open? Well, that's ugly.
Hm, it's slightly worse though. In a directory with "foo" and "foo " (trailing space), doing find ... -print0 | xargs -0 perl -ne ... will end up silently processing "foo" twice and "foo " not at all?
Filenames that end in spaces are silly. The only reason I encountered this was writing some tools to iterate through arbitrary code trees / repositories and do some calculations... but some projects have test-case files that end in spaces on purpose, which tripped me up.
It might be nice to have -N and -P that do 3-arg opens without the trimming...
Description
If perl -n processes an argument of a filename containing a trailing space, the space will be eaten before the file is opened. Ironically, the openat() with the mangled name, which gets ENOENT, is followed by a newfstatat() with the correct name, which returns 0 (success).
Steps to Reproduce
Under strace, we can observe:
Attempting to doctor @ARGV in BEGIN by, say, \-escaping trailing spaces does not work: we get open() with a literal \ but no trailing space, and then a newfstatat() with both the \ and the space:
Expected behavior
Filenames provided as arguments should be preserved; test-case output should be:
Perl configuration