Closed sebres closed 1 year ago
Thank you, pull request rebased and integrated.
Thx for review and merge. May I ask about the reason of rebase? (was something wrong in the commit comments?)
The comments were okay, but the line lengths of them are 65 ...
I also add signed-off ... I should maybe put some skeleton things into .github
OK, understand.
By the way, by fixing of that here I thought have been finding similar vulni on ReactOS and wine, just...
It looks like in case of ReactOS and wine (for the function CommandLineToArgvW
), one can say there is nothing and it works as expected, but in case of 7-zip, I'm unsure a bit.
Script langs like python
, tcl
and co follow the same mechanisms like it is currently implemented here, but other programs using CommandLineToArgvW
or main
would do that differently.
command line (arguments after exe) | CommandLineToArgvW / main | python, tcl, etc | ||||||
---|---|---|---|---|---|---|---|---|
C | A1 | A2 | A3 | C | A1 | A2 | A3 | |
three"""quotes next | 2 | three"quotes | next | 1 | three"quotes next | |||
four"""" quotes" next 4%3=1 | 3 | four" quotes | next | 4%3=1 | 2 | four" | quotes next 4%3=1 | |
five"""""quotes next | 2 | five"quotes | next | 1 | five""quotes next | |||
seven""""""" quotes" next 7%3=1 | 3 | seven"" quotes | next | 7%3=1 | 3 | seven""" quotes | next | 7%3=1 |
twelve""""""""""""quotes next | 2 | twelve""""quotes | next | 2 | twelve"""""quotes | next | ||
thirteen""""""""""""" quotes" next 13%3=1 | 3 | thirteen"""" quotes | next | 13%3=1 | 3 | thirteen"""""" quotes | next | 13%3=1 |
"two""quotes next | 2 | two"quotes | next | 1 | two"quotes next | |||
"two"" next | 2 | two" | next | 1 | two" next | |||
"three""" quotes" next 4%3=1 | 3 | three" quotes | next | 4%3=1 | 2 | three" | quotes next 4%3=1 | |
"four""""quotes next | 2 | four"quotes | next | 1 | four""quotes next | |||
"six"""""" quotes" next 7%3=1 | 3 | six"" quotes | next | 7%3=1 | 3 | six""" quotes | next | 7%3=1 |
"eleven"""""""""""quotes next | 2 | eleven""""quotes | next | 2 | eleven"""""quotes | next | | ||
"twelve"""""""""""" quotes" next 13%3=1 | 3 | twelve"""" quotes | next | 13%3=1 | 3 | twelve"""""" quotes | next | 13%3=1 |
"the crazy \\"""\\" quotes | 2 | the crazy \"\ | quotes | 1 | the crazy \"\ quotes |
The way how the parsing is implemented in this PR and merged showing the right part of table (like script langs doing it). If it should be rather like a left part of table (like native windows API / std libs doing it) one'd need to apply this patch:
CPP/Common/CommandLineParser.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/CPP/Common/CommandLineParser.cpp b/CPP/Common/CommandLineParser.cpp
index d07f0f1..6bea75b 100644
--- a/CPP/Common/CommandLineParser.cpp
+++ b/CPP/Common/CommandLineParser.cpp
@@ -51,7 +51,7 @@ static const wchar_t * _SplitCommandLine(const wchar_t* s, UString &dest)
if (++qcount == 3)
{
dest += L'"';
- qcount = 1;
+ qcount = 0;
}
}
f = s;
Just I'm still unsure which strategy is better/safer, probably must investigate in source code history of python/tcl (at least this two) or interview those community (or code originators).
@sebres - should I release a new version with that fix? What do you think about this?
I'm still unsure about the further amend (with qcount = 0
):
Although it is very improbable that someone would use triple-quotes as escape sequence for whatever invocation (it is rather \"
sequence, which would work correct now), it could be used anyway (for example to avoid insertion of unclosed quote-char by unpaired quotes in the argument). And this remains a certain inconsistence between default windows notation (if used for validation) and current implementation... Anyway it's important to have proper solution from scratch (first release implementing new parsing rules).
I had no time yet to inspect all the langs (source code history, tests, etc), why those parsing is implemented differently to windows, but at the moment it looks rather like this were historic inconsistence, not an intention. But I'd like to fulfill my research.
Can you wait few days with the release?
OK, regarding different behavior of script-langs, it looks indeed like a historic mistake.
In tcl
it was already fixed in newest versions. For python
I'd provide a PR later (as well as check whether some other langs are also affected in trunk/mainline).
Thus the correct decision would be to amend the fix as described above (provided in #310).
This PR provides a fix for certain vulnerability by command line parsing (considers valid windows escape sequences), like backslashes followed by quote-char as well as quote triples).
Windows command line processor (as well as programs and standard libraries constructing command line from arguments) use special escape and quoting sequences, so...
the proper scenario to parse command line looks like that...
- arguments are separated by spaces or tabs - quotes serve as optional argument delimiters `"a b"` --> `a b` - escaped quotes must be converted back to `"` `\"` --> `"` - consecutive backslashes preceding a quote (inclusive closing quote for a parameter with spaces) see their number halved with the remainder escaping the quote: 2n backslashes + quote --> n backslashes + quote as an argument delimiter 2n+1 backslashes + quote --> n backslashes + literal quote - backslashes that are not followed by a quote are copied literally: `a\b` --> `a\b` `a\\b` --> `a\\b` - in quoted strings, consecutive quotes see their number divided by three with the remainder modulo 3 deciding whether to close the string or not. Note that the opening quote must be counted in the consecutive quotes, that's the (1+) below: (1+) 3n quotes -> n quotes (1+) 3n+1 quotes -> n quotes plus closes the quoted string (1+) 3n+2 quotes -> n+1 quotes plus closes the quoted string - in unquoted strings, the first quote opens the quoted string and the remaining consecutive quotes follow the above rule.this causes following behavior by parsing of command line...
cmd line|arg1|arg2 ---|---|--- abc" "def|abc def| abc\\" \\"def|abc"|"def "abc\\" \\"def"|abc" "def| "abc"" ""def"|abc" "def| abc"" ""def|abc|def abc""" """def|abc" "def| abc\\""" \\"""def|abc"|"def abc\\\\""" \\\\"""def|abc\\" \\"def "abc"""def" ghi"|abc"def ghi| "abc"""def" ghi* missing close qoute at end|abc"def ghi| "abc""def" ghi|abc"def|ghi "abc\\"""def" ghi|abc""def|ghi "abc\\\\"""def" ghi"|abc\\"def ghi| "abc\\\\\\"""def" ghi|abc\\""def|ghi "abc\\\\\\\\"""def" ghi"|abc\\\\"def ghi|
I know Windows doesn't allow quote-char in file or folder names, but command line could be built by some automation with a foreign user input and if it becomes validated using nominal condition of Windows parser, but hereafter processed in 7zip, as a consequence it could cause:
|
, ampersand&
or redirecting tokens like>
that normally included in quoted string (and validated as a string) could abrupt get different meaning and used for piping, redirecting etc, that beside the injection possibility opens still worse attacking vector that can even cause RCE or used to create persistent exploits.Alternatively one could use windows API CommandLineToArgvW, but this would imply a modifying of
SplitCommandLine
interface as well as new dependency to shell32.dll.With this fix
SplitCommandLine
works now very similar toCommandLineToArgvW
(but doesn't require dependency toshell32
library).