ksh93 / ksh

ksh 93u+m: KornShell lives! | Latest release: https://github.com/ksh93/ksh/releases
Eclipse Public License 2.0

Filename expansion of * includes '.' and '..' if FIGNORE is set #58

Closed (aweeraman closed this issue 4 years ago)

aweeraman commented 4 years ago

Filename expansion using a bare asterisk expands to . and .. as well.

$ print *
. .. bar foo

The expected behavior is that names with a leading period are excluded unless the pattern itself specifies the period.

Ref: 2.13.3.2 Patterns Used for Filename Expansion

Reproduced on: 93u+m 2020-07-02
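
For reference, the expected behavior can be checked in a scratch directory. This is a minimal sketch, not part of the original report: glob_demo is a hypothetical helper name, it assumes mktemp -d is available, and it unsets FIGNORE so that the default dot-file rule applies.

```shell
# Hypothetical helper: shows what two patterns expand to in a scratch directory.
# The ( ... ) body runs in a subshell, so the cd and unset do not leak out.
glob_demo() (
  unset FIGNORE                  # assumption: test the default rule, without FIGNORE
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  touch foo bar .hidden

  echo "star: $(echo *)"         # dot files need an explicit leading period
  echo "dot: $(echo .[!.]*)"     # .[!.]* cannot match . or ..

  cd / && rm -rf "$dir"
)

glob_demo
```

On a shell that follows the rule quoted above, this should print "star: bar foo" followed by "dot: .hidden".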

McDutchie commented 4 years ago

I can't reproduce this, except after setting FIGNORE. Is your FIGNORE variable set?

aweeraman commented 4 years ago

Yes, this is what I have:

$ set | grep -i FIGNORE
FIGNORE='@(*.o|~*)'

McDutchie commented 4 years ago

sh.1 states for FIGNORE:

If FIGNORE is set, then each file name component that matches the pattern defined by the value of FIGNORE is ignored when generating the matching filenames. The names . and .. are also ignored.

So this looks like a bug, even if it can only be reproduced with FIGNORE set, as is the case for me. The manual page states clearly that . and .. are ignored regardless.

I can reproduce this on 93u+ 2012-08-01 as well as s+ 1993-12-28. So if this is a bug, it's a longstanding one. The question is, is the bug in ksh or in the manual page?

JohnoKing commented 4 years ago

This bug was first reported in att/ast#11, with a fix being available in att/ast#1068.

aweeraman commented 4 years ago

I can reproduce this on 93u+ 2012-08-01 as well as s+ 1993-12-28. So if this is a bug, it's a longstanding one. The question is, is the bug in ksh or in the manual page?

Appears to be both.

McDutchie commented 4 years ago

I looked it up in the 1995 KornShell book. The "Pathname Expansion" chapter on page 40 says:

When patterns are used to match pathnames, a . (dot) as the first character of each filename must match explicitly. Each filename must be matched. If the FIGNORE variable is set, then filenames matching the pattern defined by the value of the FIGNORE variable are excluded from the match rather than filenames that contain a leading . (dot).

Not a word about . or .. – it doesn't say that pathname expansion with FIGNORE set excludes . and .., but it also doesn't say that regular pathname expansion without FIGNORE set excludes them, and it does.

So, that didn't help.

McDutchie commented 4 years ago

@aweeraman, how do you figure it's both?

McDutchie commented 4 years ago

Thanks for linking to prior discussion, @JohnoKing. In there, Stéphane Chazelas made a good argument that returning . or .. for any pathname expansion under any circumstances (FIGNORE or not) is inherently broken. I think that overrides backwards compatibility concerns. Plus, the KornShell book is non-committal about this and the man page documents that they should be ignored even if FIGNORE is set. So I'll go ahead and backport the fix from ksh2020.

McDutchie commented 4 years ago

The ksh2020 fix struck me as a bit of a kludge, so I did some digging. Sure enough, the fix is wrong:

$ echo .*
. .#NEWS .. .git .github .gitignore
$ FIGNORE=
$ echo .*
.#NEWS .git .github .gitignore

That's inconsistent. The two should give identical results.

POSIX currently specifies that the expansion of .* includes . and .. (if they physically exist, which is itself not mandatory). However, as Stéphane Chazelas has argued, this is inherently broken. You can't really do any meaningful operations on those special reserved names; they are only useful as pathname components to refer to the current or parent directory.

Currently, to copy all files including hidden .files from the current directory to another one, you need:

cp -pr .[!.]* .??* * somewhere_else/

to avoid some rather disastrous recursion, which is quite absurd really. If . and .. were always excluded from being matched by pathname expansion, you could do it in a much more sensible way:

cp -pr .* * somewhere_else/

Currently, pdksh/mksh (ksh clones, which differ from ksh on this) and zsh already work in this more sensible way. Bash, dash, and yash strictly follow POSIX.
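
Until shells agree on this, a copy that behaves the same under either rule can be sketched as below. This is an illustration under stated assumptions, not something proposed in the thread: copy_all is a hypothetical helper name, the destination directory is assumed to exist already, and the pattern ..?* is swapped in for .??* so that no name is matched twice.

```shell
# Hypothetical helper: copy every entry of directory $1, hidden or not,
# into the existing directory $2. Runs in a subshell so the cd stays local.
copy_all() (
  dst=$(cd "$2" && pwd) || exit 1   # absolute path, because we cd next
  cd "$1" || exit 1
  # .[!.]*  names starting with exactly one dot
  # ..?*    names starting with two dots plus at least one more character
  # *       everything else
  for entry in .[!.]* ..?* *; do
    # Guard for shells or settings where a glob can still yield . or ..
    case $entry in
      .|..) continue ;;
    esac
    # An unmatched pattern is left as literal text; skip it.
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    # The ./ prefix keeps names starting with - from being read as options.
    cp -pR "./$entry" "$dst"/ || exit 1
  done
)
```

Usage would be along the lines of copy_all . somewhere_else, with somewhere_else created beforehand.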

But there are past indications that there wouldn't be much objection to changing that and either allowing or mandating the sensible behaviour. The Austin Group are currently discussing a new draft, so I'll have to give them a nudge about this.

Meanwhile I'm of a mind to just make ksh act like mksh, and always skip . and .. when globbing. What do others think?

JohnoKing commented 4 years ago

I'm not opposed to ksh having the same behavior as mksh and zsh for .*; the current behavior has given me nothing but trouble when handling dot files. Including . and .. during the expansion of .* could be placed behind set -o posix if necessary (see #20).

hyenias commented 4 years ago

As a workaround, one can set the FIGNORE variable to the pattern .?(.) so that the . and .. relative directory entries are removed from the results of a glob while all other hidden files are still listed.

Note: One may also wish to apply the ~(N) (nullglob) option on a per-glob basis, so that the glob pattern itself is not returned when nothing matches, such as in empty directories.

~/empty$ unset FIGNORE
~/empty$ ls
~/empty$ ls -A
~/empty$ ls -a
.  ..
~/empty$ echo *
*
~/empty$ FIGNORE='.?(.)'
~/empty$ echo *
*
~/empty$ echo ~(N)*

~/empty$
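
For scripts that also have to run in shells without ~(N), the same "no match means no results" effect can be approximated with an existence test. This is a minimal sketch under that assumption; count_visible is a hypothetical helper name and is not a ksh feature.

```shell
# Hypothetical helper: prints how many entries in directory $1 match *.
# The ( ... ) body runs in a subshell, so n and entry stay local.
count_visible() (
  n=0
  for entry in "$1"/*; do
    # With no matches the loop sees the literal pattern once;
    # the existence test filters it out.
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    n=$((n + 1))
  done
  echo "$n"
)
```

The -L test is there so that dangling symbolic links, for which -e is false, are still counted as matches.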

Comment: I wonder whether a new glob option such as ~(A), modeled on the -A option of the ls command, would be a useful way to drop the implied dot directory entries from the output while lessening the impact of such a change to long-standing behavior on existing scripts. New set -o option(s) could be created to apply this as the default behavior [reference ls: -a/--all or -A/--almost-all]. In turn, if a set -o posix option were enabled, it would set the respective file globbing option to include the relative directories, if applicable.

octurite commented 4 years ago

The ksh2020 fix struck me as a bit of a kludge, so I did some digging. Sure enough, the fix is wrong:

$ echo .*
. .#NEWS .. .git .github .gitignore
$ FIGNORE=
$ echo .*
.#NEWS .git .github .gitignore

That's inconsistent. The two should give identical results.

They should give different results. A variable being assigned an empty value is distinct from a variable being unset. For clarity, "If FIGNORE is not set" is different from "If FIGNORE is set to the empty string." The ksh2020 fix is correct and the results match the intention. Setting FIGNORE to the empty string sets the expander specifically to not ignore/exclude any entry including "." and "..". There are scenarios where this actually has a purpose. It could be worth denoting the difference in the docs rather than changing the behaviour. Using unset FIGNORE will cause the output to do what you were expecting.
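
The unset versus empty distinction itself can be checked with standard parameter expansion. This is a minimal sketch; fignore_state is a hypothetical helper name, and it reports only the state of the variable, not how any given ksh version globs in that state.

```shell
# Hypothetical helper: reports whether FIGNORE is unset, empty, or has a value.
fignore_state() {
  # ${FIGNORE+x} expands to x when FIGNORE is set (even to the empty string)
  # and to nothing when it is unset.
  if [ -z "${FIGNORE+x}" ]; then
    echo "unset"
  elif [ -z "$FIGNORE" ]; then
    echo "empty"
  else
    echo "set to: $FIGNORE"
  fi
}

unset FIGNORE
fignore_state
FIGNORE=
fignore_state
unset FIGNORE
```

Run as written, this prints "unset" and then "empty", confirming that the two states are distinguishable to a script.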

McDutchie commented 4 years ago

Setting FIGNORE to the empty string sets the expander specifically to not ignore/exclude any entry including "." and "..".

Read the reproducer again, and try it yourself if you like. ksh2020 does the opposite of what you say is correct: . and .. are ignored if FIGNORE is set to the empty string.

McDutchie commented 4 years ago

I was about to flag this up with the Austin Group but then found that @stephane-chazelas already filed a bug about this, and it makes for informative reading. See: https://www.austingroupbugs.net/view.php?id=1228