Open neodyne opened 4 days ago
EDIT: same applies to paths with trailing spaces like
cd '~/dir '
Can you clarify what "same" means?
Same as all the above but e.g. (quoted) '~/abc '
instead of (unquoted) ...
?
Or same only as the "further changing into subdirs..." part?
Or something else?
~ $ cd 'Downloads '
~/Downloads $ echo "x${PWD}x"
xC:/Users/abc/Downloads x
same as all above, including applet behavior and subdirs.
doesn't happen with other characters afaict ([a-zA-Z0-9] + tested a few other ascii chars). For completion's sake, happens also with a trailing string of both spaces and dots:
~ $ cd 'Downloads .... ... ..'
~/Downloads .... ... .. $ echo $PWD
C:/Users/abc/Downloads .... ... ..
Searching further, this seems like intended Win32 API behavior, to "strip trailing spaces and dots from the final component". However, consider in windows command prompt:
Microsoft Windows [Version 10.0.19045.5131]
C:\Users>cd "abc ... ... .. "
C:\Users\abc>
C:\Users\abc>echo %CD%
C:\Users\abc
C:\Users\abc>cd Downloads
C:\Users\abc\Downloads>
So even that cd
in busybox-w32 sh succeeds may be expected behavior (implemented in winapi after all), $PWD should probably end up with a path with trimmed spaces/dots, so that changing into subdirs works.
But if a script expect to have 2 separate folders "abc" and "abc.", then it will mess up things. I think it is safer to make commands (cd / mkdir / mv for folder) fails with these chars at the end.
Yes, it's been broken forever. It also affects files, not just directories. So:
~ $ touch empty
~ $ ls -l empty.
-rw-rw-r-- 1 rmy rmy 0 Nov 18 16:52 empty.
~ $ ls -l empty..
-rw-rw-r-- 1 rmy rmy 0 Nov 18 16:52 empty..
~ $ ls -l empty...
-rw-rw-r-- 1 rmy rmy 0 Nov 18 16:52 empty...
The Windows API really does ignore all trailing periods and spaces.
But if a script expect to have 2 separate folders "abc" and "abc.", then it will mess up things.
The script has no right to expect any such thing on Windows. You can't have separate folders called abc
and abc.
on Windows.
As they say:
Do not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface does not.
@rmyorston could code responsible for $PWD strip trailing dots/spaces, so that operations requiring the current working directory would work after the first cd
, the same way it works in windows command prompt?
to explain the difference:
in cmd.exe:
C:\Users\abc>cd "Downloads ... ... .. ."
C:\Users\abc\Downloads>cd testdir
C:\Users\abc\Downloads\testdir>
in busybox sh:
~$ cd "Downloads ... ... .. ."
# here `PWD=~/Downloads` would be nice, so the next `cd` doesn't fail
~/Downloads ... ... .. . $ cd testdir
sh: cd: can't cd to testdir: No such file or directory
@neodyne Sure, I'll give it some thought.
The script has no right to expect any such thing on Windows. You can't have separate folders called
abc
andabc.
on Windows.As they say:
Do not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface does not.
The script might not be specifically for Windows but a general universal POSIX script; so it may override a file from a folder not intended if the author don't know, so fails is safer (in my opinion).
I also think it should fail if trying to use a file or path name which windows would strip of trailing junk.
But trying to reject some inputs without first checking whether windows is OK with them can be a slippery slope of bad things.
This is a complex subject.
In some cases bb can detect that after the fact, like after a successful "cd", it can read the CWD from windows itself, and updates PWD accordingly (and display it correctly at the prompt etc).
This may have precedent, for instance cd x
and cd x/
both result in the same PWD without the trailing /
, but I don't know at which stage the slash is removed.
But in other cases I don't think bb can know, like in cat 'foo '
, so it probably have to reject it in advance, and if it doesn't reject it in advance then it has no idea which actual file name was written read (but also no idea if it wants to).
@avih @ale5000-git not that I disagree, I'd like to see to see it fail on nonexistent paths as well, I was just trying to be realistic, considering busybox-w32 is built on top of win32api.
seems that UNC extended paths ( \\?\C:\
) don't do path normalization[*] (so dot/space trimming as well), can only use "\" as path separator and don't resolve "." and ".." components; this may be useful but creates additional challenges. As was mentioned above, it would still not allow coexisting "abc" and "abc." directories/files.
for reference, see e.g. https://github.com/python/cpython/issues/84419 , apparently os.stat() in python succeeds in a situation similar to this as well.
[*] An extended path only skips normalization in an open or create context.
Since the shell already makes changes to the path in the cd
builtin it isn't too much of a stretch to have it strip the trailing dots and spaces too. With this change cd
should behave more like cmd.exe.
There are new prerelease binaries with this change (PRE-5532 or above).
@rmyorston small quirk, in some cases the trimmed path now ends with a slash:
~ $ cd .........
~/ $ echo $PWD
C:/Users/abc/ <-- trailing slash
@avih @ale5000-git not that I disagree, I'd like to see to see it fail on nonexistent paths as well, I was just trying to be realistic, considering busybox-w32 is built on top of win32api.
I'm not sure if busybox just blindly call the Win API but in cases it check if the file/dir path is valid before calling the Win API it could simply return invalid path if the last char is space or dot, so it will works globally with a simple change.
Also, in busybox sh {pre-,}release, single trailing dot in any path component succeeds...
~ $ cd 'Downloads/testdir /testdir2' # fails, ok
~ $ cd 'Downloads/testdir /testdir2' # fails, ok
~ $ cd Downloads/testdir....../testdir2 # fails, ok
~ $ cd Downloads/testdir./testdir2 # succeeds, PWD not overwritten; this should probably fail too
~/Downloads/testdir./testdir2 $
(All of the above succeeds in cmd.exe, with an overwritten CWD, but hey, that's Windows for ya.)
@ale5000-git my understanding is that it checks if the file/dir path exists using winapi, so it returns that the path is valid. Some winapi functions can take extended UNC paths, which skip path normalization, but not all.
If I'm not wrong the function call stat
of busybox, so can't stat
be changed to say that these pathes do not exist?
For example:
stat ...
in some cases the trimmed path now ends with a slash
Fair point. This further patch should handle that case:
diff --git a/win32/mingw.c b/win32/mingw.c
index 6842dba48..aab09b8e9 100644
--- a/win32/mingw.c
+++ b/win32/mingw.c
@@ -2189,6 +2189,9 @@ void FAST_FUNC strip_dot_space(char *p)
while (end > start && (end[-1] == '.' || end[-1] == ' ')) {
*--end = '\0';
}
+
+ if (--end != start && (*end == '/' || *end == '\\'))
+ *end = '\0';
}
size_t FAST_FUNC remove_cr(char *p, size_t len)
I'm inclined not to bother with the trailing dot on intermediate directories. That would require more code. Plus, it sorts itself out when the intermediate directory becomes the last:
~ $ cd Downloads/testdir./testdir2
~/Downloads/testdir./testdir2 $ cd ..
~/Downloads/testdir $
so can't
stat
be changed to say that these pathes do not exist?
Sure, but there are calls to scores, if not hundreds, of C runtime and WIN32 functions. It's impractical to intercept them all. Intercepting only a few is sure to lead to inconsistencies.
I thought they first call a function inside busybox codebase and only after the C runtime / WIN32 functions so I'm wrong.
There is also the inconsistency that this works:
stat ...
but this no:
ls ...
version: BusyBox v1.37.0-FRP-5467-g9376eebd8 busybox64u.exe sh -l
(some?) applets behave like being run from the directory without the trailing dots:
further changing into subdirs, existing or not, fails:
EDIT: same applies to paths with trailing spaces like
cd '~/dir '