Closed StefanKarpinski closed 1 year ago
Base: 97.31% // Head: 97.37% // Increases project coverage by +0.06%
:tada:
Coverage data is based on head (
d7f8b3d
) compared to base (951955b
). Patch coverage: 100.00% of modified lines in pull request are covered.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
The rules are based on https://stackoverflow.com/a/31976060/659248. @vtjnash, do you have an opinion about whether we should actively prevent people form making paths with these names on Windows?
Some of them seem pretty much like we have to: if you have a file named foo\bar
in a tarball, if you try to extract it on Windows you will not get a file with that name, you'll get a directory named foo
with a file inside of it named bar
, which is not what should happen. There's simply no way to get a file named foo\bar
since \
is a path separator on Windows. With some of the other ones, I'm not sure—the docs say it's a bad idea to create files with some of these names but also says it can be done programmatically. Not sure where to draw the line.
I think it makes sense to check. It is a bit of a risk to have files that could actually control physical hardware and cause the printer to operate. Similar to the (hypothetically legal) operation to specify an absolute path for the extraction destination (or equivalently to create a symlink to somewhere else like /usr, then extract files through that symlink, or through the ..
symlink–all of which are normally rejected by bsdtar
and gnutar
on extract, even though they are legal)
It may also make sense for portable
to check for case-clashes within a set of paths? But that can be a separate issue opened after merging this.
It may also make sense for
portable
to check for case-clashes within a set of paths? But that can be a separate issue opened after merging this.
Oh, that's a good idea! We have had issues with people generating artifacts on Linux with case clashes that then causes problems on MacOS and Windows.
I think it makes sense to check. It is a bit of a risk to have files that could actually control physical hardware and cause the printer to operate. Similar to the (hypothetically legal) operation to specify an absolute path for the extraction destination (or equivalently to create a symlink to somewhere else like /usr, then extract files through that symlink, or through the
..
symlink–all of which are normally rejected bybsdtar
andgnutar
on extract, even though they are legal)
That makes sense. We do already prevent attacks that can overwrite files outside of the extraction root and preventing this kind of thing seems like a good idea as well. I think we'd want to turn the portable
option on when we uses Tar.rewrite
to normalize package and artifact tarballs (cc @staticfloat), which would prevent people from creating tarballs that could cause problems on Windows machines. We could potentially not check for portability for platform-specific tarballs that won't be installed on Windows machines.
A couple more things I'm left wondering about...
LPT.
ok? Previously the trailing dot rule excluded it even if the LPT.*
exclusion didn't apply. If we're ok with names ending in dots though I'm not sure if LPT.
is a problem or not. Do you know if Windows parses that as an LPT file with an empty extension?"*:<>?\\|
are actually that bad? Clearly \
can't be allowed as discussed before and :
seems like it has similar issues, at least in the first part of a path (is it ok in later parts where it can't be interpreted a part of a drive name?). But *
and ?
seem to just be excluded because Windows made the poor choice of handling wildcard expansion in the kernel instead of the shell, but it seems like you can programmatically create and work with files with those characters in their names, it just might be hard to interact with them from a shell. Not sure about the remaining <>|
.I think those are all disallowed at the Win32 library layer and require \? paths to bypass it. I didn’t check though
Canonical reference for these rules: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
Do not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface does not.
That page doesn't really spell out what bad stuff (if any) happens if you try to name files using the "reserved characters". If it's not a security hazard to have files with this characters in their name, it might be better to just let the operating/file system object if it really wants to and otherwise let the files be created. The main use case for this, after all, is to create artifacts and package trees and it's unlikely that a user will directly interact with the contents. On the other hand, they are weird characters to use in file names.
We could just use \?\\
paths for all access here (arguably we could do that everywhere to better match how UNIX systems behave, or just do it when a path has weird characters in it).
@staticfloat What do you think of this? Can we scan existing packages and artifacts to see how many would be affected if we started erroring on these things on Windows? Hopefully none, but you next know. Some of these names are awfully basic...
The documented list of malicious paths is apparently missing many things, according to https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
Check for bad Windows paths prior to extraction; allow check on creation Previously, we didn't bat an eye at extracting paths that are considered bad or illegal on Windows. This explicitly checks for such paths before attempting extraction on Windows. On all platforms it introduces the
portable
flag for tarball creating operations, which does the same checks and errors if you would have created a tarball that would error upon extraction on Windows.