Open rmunn opened 4 years ago
--help
and --version
to be based on Tokens instead of strings. (This allows us to ignore cases where --help
was a string value to a different option). And almost no tests needed changing except for the ones whose semantics explicitly change in #607. So this is probably not a breaking change after all.
Lovely - this would help me a lot! I have an adjacent gotcha, which is the use of - in some utilities to mean stdin or stdout. For example, (cd /src/dir; tar -cf - .) | (cd /dest/dir; tar -xvf -)
emits the tarfile to stdout in the first command, and consumes it from stdin in the second.
The release version of this tool presently appears to parse -
as an empty option prefix. It shouldn't, in the same situation that #607 fixes: a lone -
after an option that takes an argument should be the argument.
@Ozzard #607 also treats a bare -
as a value — see https://github.com/commandlineparser/commandline/pull/607/files#diff-c55127e12f4102753e3927ba25bfba42R59 — for precisely that reason. It's up to you to convert the -
into stdin or stdout as appropriate, but it will be a value and not an empty option.
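Since the parser hands a bare - back as an ordinary value, mapping it to stdin or stdout is left to the application. A minimal sketch of that step in Python, where the helper names open_input and open_output are hypothetical and not part of CommandLineParser:

```python
import sys
from typing import TextIO


def open_input(path: str) -> TextIO:
    """Return a readable text stream; a bare "-" means standard input."""
    if path == "-":
        return sys.stdin
    return open(path, "r")


def open_output(path: str) -> TextIO:
    """Return a writable text stream; a bare "-" means standard output."""
    if path == "-":
        return sys.stdout
    return open(path, "w")
```

The caller should take care not to close the stream when it is one of the standard streams.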
Win! Any feel as to when we might see a new release with this in?
@moh-hassan Could I get a code review of #608, which solves this issue as well as #600?
@rmunn
Discussion 1.0:
getopt processes the command line one option at a time, in order. getopt is not aware that -h or --help
are used for displaying help, and can't force the calling program to use -h/--help
for displaying help.
The calling program (gzip in our case) calls the function getopt
in a loop and processes every option as a switch or as a scalar with a value (only one value). Multiple values are not supported.
The output of getopt is a vector with the options first and the values after them (values can also be mixed in between, depending on the scanning mode).
getopt stops processing options and treats everything that follows as non-option values (even if it starts with -/--) a) when it finds -- b) when it finds a free value !!!
Let us show these corner cases:
# Example1:
$ gzip -S -- file --help
gzip output:
gzip: file: No such file or directory
gzip: --help: No such file or directory
gzip didn't display help because it didn't receive --help from getopt.
Why: gzip uses -- as the value for -S, finds the value 'file', stops processing options, and treats everything that follows as values, including --help. It didn't display help and didn't apply rule 4.2 for help.
# Example2:
$ gzip file -S -- --help
gzip output:
gzip: file: No such file or directory
gzip: -S: No such file or directory
gzip: --: No such file or directory
gzip: --help: No such file or directory
getopt finds a value at the start and treats everything that follows as values, including -- and --help
# example 3:
gzip -S -a --help
gzip output:
gzip uses -a (although it's an option) as the value for -S, finds --help and displays help. Note: CLP displays an error message about missing values for both -S and -a, and displays help with errors.
getopt allows -- to be a value for a scalar option, although the GNU standard doesn't mention that -- can be used as a value.
The question: Is CLP required to do this?
getopt didn't return --help; it used it as a value and didn't apply GNU standard 4.2. The question: Is CLP required to do this?
getopt allows an option (-a) to be the value of another option, -S. The question: Is CLP required to do this?
Also, getopt has three modes of scanning and they are completely different:
REQUIRE_ORDER, PERMUTE, RETURN_IN_ORDER
These modes are controlled by an environment variable POSIXLY_CORRECT
or + or - passed in front of the Short Option string.
-- and --help can be handled differently based on the active scanning mode that is left to the caller program.
It's wise to be careful when resolving the corner cases around -- and --help, and to follow the GNU standard with an open mind.
Notes:
What are your suggestions, based on the above behavior of getopt as used by gzip?
References:
@moh-hassan - What version of gzip were you using when you ran those command lines in your comment? From which Linux distribution? (Or was it FreeBSD/OpenBSD/some other Unix that's not Linux?) Because the examples of gzip behavior that you're showing are not the same results that I get when I run it. Here's the output of gzip --version
on my system:
gzip 1.6
Copyright (C) 2007, 2010, 2011 Free Software Foundation, Inc.
Copyright (C) 1993 Jean-loup Gailly.
This is free software. You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.
Written by Jean-loup Gailly.
This is from the gzip
package, version 1.6-5ubuntu1, in Ubuntu Linux.
Below, in between my comments on what you wrote, I've also taken the examples of gzip's behavior that you give in your comment above, and run them myself. You'll see that the results I get when I run the exact same command are different from what you see when you run gzip, and are the same results that I designed PR #607 to achieve.
@rmunn Discussion 1.0: getopt process commandline one option at a time in order. getopt is not aware if
-h or --help
are used for displaying help, and can't enforce the caller program to use -h/--help
for displaying help.
Correct. The -h
/--help
is a convention, though a strongly recommended one in the GNU standard. Nevertheless, programs are free not to use -h
for --help
, which is why in #608 I made it optional (and defaulting to off) to have -h
as a shortname for --help
. I would recommend defaulting it to on at some point, because I personally find it annoying when someprog -h
doesn't display the help text, but I don't know how many people feel the same as me here.
The caller program (gzip in our case) call the func
getopt
in a loop and process every option as a switch or scalar with value (only one value). Multi values are not supported. The output of getopt is a vector with options at the first and values are next (also values can be mixed in between based on the mode of scanning).
Correct. Multi-values the way CLP does them (--someopt one two three --someotheropt
) where someopt
is an IEnumerable<string>
are an extension to the GNU standard, and I actually don't like them. But I'm not going to remove them from CLP; I'm trying to make a non-breaking change here. (More on that below).
getopt stop the processing of options and consider all the next are non-option values (even if start with -/--) a) when find -- b) when find a free value !!!
This isn't exactly right. getopt has three modes of operation; one of them does do what you describe (stop when it finds a free value), but that's only the default in three cases:
* When it's called as getopt rather than getopt_long, so that long options (the --foo ones) are not allowed. Note that almost every real example I could find uses getopt_long, so this is not at all widely used in practice.
* When the code calling getopt_long requests this mode by putting a + as the first character of the options string.
* When the environment variable POSIXLY_CORRECT is set (which allows the end user to select this mode of operations if they want).
In all the real-world Linux software I've ever experienced as a user, however, getopt
is being called as getopt_long
, so that long options are allowed, and also options can be placed after values. I.e., your point b) here is correct in theory, but in practice nobody asks getopt to do that, and everybody wants options and values to be interspersed freely.
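The difference between the two scanning modes can be tried out without gzip. The sketch below uses Python's standard getopt module as a stand-in for the C getopt_long (an assumption: it follows the same conventions, including the + prefix and POSIXLY_CORRECT, but it is not the code gzip runs). The option table is a cut-down guess at gzip's, with only -S/--suffix and -h/--help.

```python
import getopt

SHORT = "S:h"                # -S takes a value, -h does not
LONG = ["suffix=", "help"]   # --suffix takes a value, --help does not


def scan_permute(args):
    """GNU default: options and free values may be interleaved."""
    return getopt.gnu_getopt(args, SHORT, LONG)


def scan_require_order(args):
    """A leading '+' requests POSIX ordering: stop at the first free value."""
    return getopt.gnu_getopt(args, "+" + SHORT, LONG)
```

With POSIXLY_CORRECT unset, scan_permute(["file", "-S", "--", "--help"]) reports -S with the value -- and then --help, leaving only file as a free value, which corresponds to the help text being printed. scan_require_order on the same arguments reports no options at all and returns all four arguments as values, which corresponds to the four "No such file or directory" lines.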
Let is show these corner cases:
# Example1:
$ gzip -S -- file --help
gzip output:
gzip: file: No such file or directory
gzip: --help: No such file or directory
gzip didn't display help because it didn't receive --help from getopt.
What I get when I run gzip -S -- file --help
is:
Usage: gzip [OPTION]... [FILE]...
Compress or uncompress FILEs (by default, compress FILES in-place).
Mandatory arguments to long options are mandatory for short options too.
-c, --stdout write on standard output, keep original files unchanged
-d, --decompress decompress
-f, --force force overwrite of output file and compress links
-h, --help give this help
-k, --keep keep (don't delete) input files
-l, --list list compressed file contents
-L, --license display software license
-n, --no-name do not save or restore the original name and time stamp
-N, --name save or restore the original name and time stamp
-q, --quiet suppress all warnings
-r, --recursive operate recursively on directories
-S, --suffix=SUF use suffix SUF on compressed files
-t, --test test compressed file integrity
-v, --verbose verbose mode
-V, --version display version number
-1, --fast compress faster
-9, --best compress better
--rsyncable Make rsync-friendly archive
With no FILE, or when FILE is -, read standard input.
Report bugs to <bug-gzip@gnu.org>.
And the exit code is 0. The above is the same text that gzip prints as a result of gzip --help
. For the sake of keeping this comment as short as I can, I'm going to summarize this text as (help text)
in all future examples.
Why: gzip use -- as value for -S. find value 'file' and stop processing options, and consider all the next as values including --help. It didn't display help and didn't apply rule 4.2 for help.
In the gzip version on my system, gzip uses --
as the value for -S
, finds the bare value file
and does not stop processing options, so it then finds --help
and prints the help text, exiting with a 0 exit code since printing the help text is a non-error situation.
# Example2:
$ gzip file -S -- --help
gzip output:
gzip: file: No such file or directory
gzip: -S: No such file or directory
gzip: --: No such file or directory
gzip: --help: No such file or directory
What I got when I ran gzip file -S -- --help
:
(help text)
and exit code 0.
getopt find value at start and consider all the followed as values including -- and --help
Again, the version of gzip that I ran did not stop at the first value. So -S
was handled as an option and then --
was treated as the argument to -S
, so that --help
was still processed as an option and printed the help text.
# example 3: gzip -S -a --help
gzip output:
gzip use -a (although it's option ) as a value for -S, find --help and display help
Here I get the same behavior as you: gzip does not consider -a
to be an option because it immediately follows -S
, which means that -a
is never processed. But then it encounters --help
, so according to the GNU coding standards it ignores all other arguments and prints the help text.
note: CLP display error message missing values for both -S and -a and display help with errors.
This is against the GNU coding standards I just linked to one paragraph above, which say: "Other options and arguments should be ignored once this" (that is, the --help
option) "is seen, and the program should not perform its normal function."
getopt allow -- to be a value for scalar option although gnu standard didn't mention that -- can be used as a value.
The question: Is CLP Required to do this?
I believe it is, because we are trying to mimic the behavior of getopt. The GNU coding standards don't mention anything about what valid values can be for an option, because that's not something that the programmer using the getopt
library (the person for whom the GNU coding standards document was written) needs to care about. The GNU coding standards just say "use getopt_long
", which means that your program will get all of getopt
's normal behavior. Including the fact that the next argument after a value-taking option like -S
, whatever it is, should be swallowed whole and not interpreted. That's what getopt does, and that's the behavior that CLP should mimic.
getopt didn't return --help and use it as a value and didn't apply gnu standard 4.2 The question: Is CLP Required to do this?
Same response as the paragraph above. Yes, CLP should follow this behavior, because that's what getopt is expected to do. After a value-taking option, the next argument should be treated as the value, no matter what it is.
getopt allow an option (-a) to be a value for another option -S. The question: Is CLP Required to do this?
Same response as the paragraph above. Yes, CLP should follow this behavior, because that's what getopt is expected to do. After a value-taking option, the next argument should be treated as the value, no matter what it is.
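The rule in the three answers above (after a value-taking option, the next argument is the value, no matter what it is) fits in a few lines. This is a simplified Python sketch, not CLP's tokenizer and not getopt's source: the function name scan and the takes_value set are made up for illustration, and it ignores --name=value and clustered short options.

```python
from typing import Dict, List, Optional, Set, Tuple


def scan(args: List[str], takes_value: Set[str]) -> Tuple[Dict[str, Optional[str]], List[str]]:
    """Scan args left to right, interleaving options and free values.

    takes_value holds the option names (e.g. "-S") that need a value.
    Returns (options, free_values); switches map to None.
    """
    options: Dict[str, Optional[str]] = {}
    values: List[str] = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # Terminator: everything after it is a free value.
            values.extend(args[i + 1:])
            break
        if arg.startswith("-") and arg != "-":
            if arg in takes_value:
                if i + 1 >= len(args):
                    raise ValueError(f"option {arg} requires a value")
                # Swallow the next argument verbatim, whatever it looks like.
                options[arg] = args[i + 1]
                i += 2
                continue
            options[arg] = None  # plain switch
        else:
            values.append(arg)
        i += 1
    return options, values
```

With takes_value = {"-S"}, the arguments -S -- --help give -S the value -- and still see --help as an option, while -S --help gives -S the value --help and sees no help option at all.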
Also, getopt has three modes of scanning and they are completely different:
REQUIRE_ORDER, PERMUTE, RETURN_IN_ORDER
These modes are controlled by an environment variable
POSIXLY_CORRECT
or + or - passed in front of the Short Option string. -- and --help can be handled differently based on the active scanning mode that is left to the caller program.
It's a wisdom to be care in resolving the corner boundaries in using -- and --help and also following GNU standard with open minded.
Almost every program I've seen uses the PERMUTE option (which is the default of getopt_long
), and the GNU standards say "Use getopt_long
to decode arguments, unless the argument syntax makes this unreasonable." So we should definitely default to this behavior, allowing options and values to be mixed just like the default behavior of getopt_long
does.
Notes:
1. Considering - (single dash) as a value was one of the missed feature needed by developers and can be implemented and didn't conflict with GNU standard or getopt corner cases.
Yes, that could be implemented separately from my PR, quite easily. I fixed it in my PR because it was very easy to do, but if you want to reject my PR, then allowing -
as a value should still be done.
2. AutoHelp=false, give the freedom for developer to provide his helptext and not/use -h/--help for displaying help.
My PR honors AutoHelp=false. Or at least, it should; if there's any part of my code that fails to honor AutoHelp=false, that's a bug and I'll fix it.
3. If it's allowed to use -- as a value it should be declared as a setting in ParserSetting although it can passed as """--""" without change.
I don't understand what you mean by "it can passed as """--""" without change", so I'll have to skip commenting on that part of this point. As for the rest, using --
as a value after an option (or after one occurrence of --
treated specially) is the normal behavior of getopt_long
, just as with any other value that starts with -
, or indeed any text whatsoever. Since the normal behavior of CLP is intended to mimic getopt as closely as possible, I don't think it should be a setting in ParserSetting to follow getopt's normal behavior. (I'd actually like to make the EnableDashDash option the default, so that getopt is mimicked by default, but that would be a breaking change so it should be reserved for a release that does a major-version bump, e.g. version 3 of CLP).
4. If it's allowed to make an option to be a value like example 3, it should be avoided.
I would VERY strongly disagree. The getopt behavior is that any text (no matter what it is) that follows after a value-consuming option should be consumed. If it didn't work that way, then there are two scenarios that would be impossible or very difficult:
I'm writing a program that runs another program, and I have an --extra-args
option that the user can pass to tell me extra arguments for the other program. E.g., outerprog --verbose
would call innerprog --foo
, but outerprog --extra-args --bar --verbose
would call innerprog --foo --bar
. (Note that in this example, --bar
is not a valid option to outerprog
). Without the ability to use any text as the value to a string-consuming option, the end user calling outerprog
would be puzzled why the --extra-args
option wasn't working right. If your note 3 was implemented, the user could add a --
before the --bar
like outerprog --extra-args -- --bar --verbose
(or, wait, then --verbose
would be a value so the user would have to rewrite that as outerprog --verbose --extra-args -- --bar
-- which shows another problem with your note 3, because the whole point of getopt_long
's default behavior of interspersing values and options is that users should not have to rewrite the order of their command-line options to satisfy the demands of the program). But what would you expect that particular command, outerprog --verbose --extra-args -- --bar
, to do? What I think you'd expect that to do is that --bar
would become a value to --extra-args
. But that's not what I (as someone who's used Linux since 1998) would have come to expect. I'd expect that the --
would be swallowed as the value of --extra-args
. And even if it wasn't, I'd expect that --bar
would be treated not as a value of an option, but as an extra argument to the program (what CLP calls a Value). And I'm not the only one who would expect --
to stop processing option values; #605 is based on that expectation as well.
The other scenario is as follows. Let's say that --bar
was not a valid option to outerprog
, and we're following the suggestion in your note 4. So outerprog --extra-args --bar
sees --bar
, sees that it's not a valid option, and treats it as the value of --extra-args
. But now the developer of outerprog
adds a --bar
option in a new release. Suddenly the same outerprog --extra-args --bar
command line that used to work (and pass --bar
to innerprog
) is now failing with an error, saying that --extra-args
needs a value. The end user will be baffled by this change in behavior. "What?" they'll say. "I'm already passing it a value: --bar
is the value!" The fact that --bar
became a valid option to outerprog
will not make them expect that it would no longer be treated as a valid value after --extra-args
. So by following getopt_long's normal behavior, i.e. swallowing the next argument as an option value no matter what it is, we achieve consistency between versions 1 and 2 of outerprog
, because the treatment of --extra-args --bar
will be the same no matter whether --bar
is now a valid option to outerprog
or not.
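The version-stability argument can be made concrete with two toy strategies. This Python sketch is illustrative only: both functions and the V1/V2 option sets are hypothetical, and the "guarded" strategy is one reading of note 4 (refuse a value that is itself a known option name).

```python
from typing import List, Optional, Set


def extra_args_swallow(args: List[str], known: Set[str]) -> Optional[str]:
    """getopt_long style: whatever follows --extra-args is its value.

    The set of known options is deliberately ignored.
    """
    i = args.index("--extra-args")
    return args[i + 1] if i + 1 < len(args) else None


def extra_args_guarded(args: List[str], known: Set[str]) -> Optional[str]:
    """Note-4 style: a known option name is never accepted as a value."""
    i = args.index("--extra-args")
    if i + 1 >= len(args) or args[i + 1] in known:
        return None  # would be reported as a missing value
    return args[i + 1]


V1 = {"--extra-args", "--verbose"}   # outerprog release 1
V2 = V1 | {"--bar"}                  # release 2 adds a --bar option
CMD = ["--extra-args", "--bar"]
```

extra_args_swallow returns --bar for both V1 and V2, whereas extra_args_guarded returns --bar for V1 and None for V2, so the same command line changes meaning between releases.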
What is your suggestions based on the above behavior of getopt used by gzip?
My suggestion is to mimic the default behavior of getopt_long
exactly, because that's the one that's used in every Linux command-line program I've ever seen. (Except, of course, for programs like msbuild
and dotnet
which follow Windows command-line behavior and not Linux command-line behavior, but for that reason I don't consider them to be "Linux command-line programs"; they are Windows command-line programs that were ported to Linux).
Note that the examples you gave above do not, except for example 3, follow the behavior of getopt_long, so I'm very, VERY curious to know which Linux distro those examples came from, and what the output of gzip --version
is on your system.
That means that:
* -S -- --help should have -- be the value of the -S option, and print help text unless AutoHelp is false.
* -S -a --help should have -a be the value of the -S option, and print help text unless AutoHelp is false.
* -S --help should have --help be the value of the -S option, and NOT print help text.
References:
* The Open Group Base Specifications Issue 7, 2018 edition, IEEE Std 1003.1-2017 [Revision of IEEE Std 1003.1-2008](https://pubs.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap12.html)
This is the POSIX standard, which is what getopt
follows if you use the POSIXLY_CORRECT environment variable. Modern Linux programs all tend to follow the GNU standard, though, which takes the POSIX standard and expands on it.
* Linux Programmer's Manual [GETOPT(3)](https://www.man7.org/linux/man-pages/man3/getopt.3.html)
This actually says that the default behavior of getopt
, not just getopt_long
, is to "permute[] the contents of argv
as it scans, so that eventually all the nonoptions are at the end." This is the behavior I was expecting, where options and values can be mixed and processing doesn't stop at the first non-option value. So that's actually different than the getopt
source code here, which is what I had been looking at as I wrote the earlier parts of this.
* gzip source in using getopt[L433](https://github.com/Distrotech/gzip/blob/distrotech-gzip/gzip.c#L443)
I see this is calling getopt_long
and not passing a +
as the first character, so the default behavior should have been what I was expecting, and NOT the behavior you got in your examples 1 and 2 above. Which, again, makes me wonder what version of gzip (and what version of Linux) you were running when you got those results.
Sorry about how long this was. The summary is: AFAICT, the version of gzip that you used to produce those examples is buggy, and the way getopt
is supposed to work is to allow mixed options and values, and to "swallow" anything after a string-taking argument (even another option or the text --
). I.e., exactly the way I wrote PR #607 to behave.
@moh-hassan - I really do want to know what version of gzip you were using when you ran those command lines in your comment, and which Linux distribution it came from. Because if there are Linux distros out there whose standard tools do not allow interleaving options and values the way my gzip examples do (i.e., they stop processing options after the first value is encountered), then I should change the defaults on my PR. So if you got those examples from running a real gzip command, please let me know how to reproduce those results. (If you got them from reading the gzip source code and thinking that's how it would work, then I suspect you made a mistake about the defaults and that if you ran a real gzip command you would get the same results as me.) If I can reproduce the gzip results you got, I'll be better placed to judge whether my PR needs to change.
@rmunn
What version of gzip were you using when you ran those command lines in your comment?
This is from the gzip package, version 1.6, in Ubuntu 18.04.2 Linux.
With POSIXLY_CORRECT enabled:
user1@ubuntu:~$ gzip --version
gzip 1.6
Copyright (C) 2007, 2010, 2011 Free Software Foundation, Inc.
Copyright (C) 1993 Jean-loup Gailly.
This is free software. You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.
Written by Jean-loup Gailly.
The summary is: AFAICT, the version of gzip that you used to produce those examples is buggy,
NO, it's absolutely correct, but gzip has no control at all to disable POSIXLY_CORRECT
getopt (getopt_long) IGNORES VALIDATION of the next token, but CLP applies guards and validation (more than 16 validation rules) based on the Option class (including value and data type).
It's not logical to take the primitive behavior of getopt_xxx and force CLP to take the next token blindly.
See this example
gzip -S -b file.txt
It will generate a file: file.txt-b
.
Is it logical to take the command-line input as is, without validation?
getoptxxx treats the command line in good faith and assumes that the user will not make a mistake; if he does make one, he has to live with it, even if the file ends up named file.txt-b
as in the example above.
CLP validates the token and raises an error, and this is controlled by the parser settings.
For help, getoptxxx is not aware of verbs or of how to call help in a verb scenario.
If you want to trigger help with --help or -h from any position (as getopt does), this can be done with a minor change in the HelpText class using Regex, but again the syntax of help for verbs has to be taken into account.
(If you got them from reading the gzip source code and thinking that's how it would work, then I suspect you made a mistake about the defaults and that if you ran a real gzip command you would get the same results as me.)
Can you imagine that I would make this mistake and imagine output based on reading source code? If you enable POSIXLY_CORRECT and run a real gzip command, you will get the same results as me.
Summary: CLP can mimic getopt for help with a minor change in the HelpText class, and getopt is not aware of help for verbs or of verbs at all. For --, CLP controls it with the EnableDashDash = true/false option in the parser settings, plus other options that control what to do with the next token, validate it, and apply guard rules to every token.
The goal of this library is not to mimic the behavior of getopt, but to apply the GNU standard for short/long options (vs. forward slash), with control over parser behavior, validation rules on tokens, and extra features.
Ah, so you were using POSIXLY_CORRECT in those options. I didn't understand that, since you didn't show it in the gzip
command lines you posted, and I don't know anyone who has it set by default in their .bashrc
because the default getopt behavior is so much more useful than the POSIX standard behavior.
And you're arguing that CLP should mimic the POSIX standard by default, whereas I'm arguing that it should mimic getopt's default (non-POSIX) behavior by default.
Actually, it will be pretty easy to allow both; I'll tweak PR #607 to add a ParserSettings option called PosixlyCorrect that turns on the POSIX behavior (stop processing options after first non-option argument), and I'll also make it honor the POSIXLY_CORRECT environment variable so that end users who expect that behavior can make it happen. (And after doing a bit of Googling on the subject myself, I've come to the conclusion that sometimes POSIXLY_CORRECT is what you want, but most of the time it's not, since most people write Bash scripts with the assumption that getopt's default mixed-options-and-values behavior is what they're going to get. So allowing for both behaviors is definitely the right thing to do here. I'll leave it defaulting to mixed, since it seems that that's what most people expect, but there will be a ParserSettings option to change that, like putting a +
in front of the options string of getopt.)
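One way the setting and the environment variable could be combined is sketched below in Python. Everything here is an assumption for illustration: the function name options_first is made up, the precedence (an explicit setting beats the environment) is a design guess rather than something decided in this thread, and treating only a non-empty POSIXLY_CORRECT as "set" is a simplification.

```python
import os
from typing import Mapping, Optional


def options_first(posixly_correct: Optional[bool] = None,
                  environ: Mapping[str, str] = os.environ) -> bool:
    """Decide whether scanning stops at the first free value.

    posixly_correct: explicit parser setting, or None when left unset.
    environ: environment to consult (injectable so it can be tested).
    """
    if posixly_correct is not None:
        return posixly_correct          # explicit setting wins
    return bool(environ.get("POSIXLY_CORRECT"))
```

With no setting and no environment variable this returns False, which keeps the mixed options-and-values behavior as the default.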
As for the question of validation of option values, I am firmly convinced that CLP should do exactly as much validation as is needed to validate the types of the options, and nothing more. I.e., if -s
is a string option and -n
is a number (say an int) option, then -n foo
should be rejected, but -n -1
should be accepted and put the value -1 (negative one) into the Number property. And -s foo
should be accepted, and so should -s -1
, because CLP cannot know the end user's intent. What if the end user preferred having tarballs with a .tar-gz
extension instead of .tar.gz
? If getopt worked the way CLP currently does, gzip -S -gz file.tar
would throw an error, instead of producing the file.tar-gz
file that the user wanted. But since opinion clearly does differ on this subject, I'll put in another ParserSettings option to change that, and forbid string values starting with a -
(except for the bare -
value which means "stdin/stdout", and should always be allowable as a string value). I have a feeling that most people will want to permit string values that start with -
, so I think that most people will want to turn that particular option off, but in deference to CLP's current behavior I'll default that one to on so that the "no options that start with -
" validation is kept by default.
AFAICT, the changes I made to the parser don't change the validation of ints or other types: -n foo
will still produce a parser error when it tries to convert "foo" to an integer. So I only really need to care about this for string values, because integer values in particular need to be able to allow -1
and the like.
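The validation policy described above (check the declared type, plus an optional guard on string values that start with -) could look roughly like this. It is a Python sketch, not CLP code; the function name convert and the flag allow_dash_strings are hypothetical, with the flag defaulting to the stricter behavior as proposed.

```python
from typing import Union


def convert(value: str, kind: type, allow_dash_strings: bool = False) -> Union[int, str]:
    """Validate one option value against the option's declared type."""
    if kind is int:
        # "-1" converts to -1; "foo" raises ValueError.
        return int(value)
    if kind is str:
        # A bare "-" (stdin/stdout) is always allowed as a string value.
        if value.startswith("-") and value != "-" and not allow_dash_strings:
            raise ValueError(f"string value may not start with '-': {value!r}")
        return value
    raise TypeError(f"unsupported option type: {kind!r}")
```

Under this sketch -n -1 is accepted for an int option, -n foo is rejected, and -S -gz is accepted only when allow_dash_strings is turned on.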
The goal of this library is to mimic the behavior of getopt, but there are a few corner cases where this library behaves differently than getopt would: in the handling of -- or --help when they are the value of a string parameter.
How getopt behaves
First, an illustration of how getopt works with the particular corner case I'm demonstrating. Let's look at the standard gzip and gunzip tools found with any Linux distribution. They take many options, but one of them is --suffix (or -S for short); this lets you specify a different suffix than the standard .gz for the compressed file. E.g. if you have a README.md file in the current directory, then gzip -S .compressed README.md will create a README.md.compressed file instead of README.md.gz.
Now, what do you think will happen if I run this command?
The correct answer is that it will create a compressed file named README.md-- in the current directory. Because the string -- was specified immediately after an option that takes a string value, it was processed as the value for that option (the --suffix option), and so gzip created a file with a -- suffix instead of .gz.
Now look at these three examples:
What do you think these will do? Answer:
* Look for a file named --help in the current directory, and create a file named --help.gz.
* Look for a file named --help in the current directory, and create a file named --help--.
Why did
gzip -S -- --help print the help text? Because -- was the value for the -S option, and so it was not treated as the "stop processing options now" marker. Then after the -S option was fully processed, the only remaining argument was --help. Since --help was encountered, gzip displayed the help screen and did nothing else.
With the gzip -S -- -- --help line, OTOH, the first -- became the value for the -S option. Then the second -- was processed as an option, and had the "stop processing options now" meaning. So the --help text was treated as a value, and so it looked for a file named --help to compress. And since I specified that the compressed suffix should be --, the compressed file was named --help--.
What CommandLine does
The current way CommandLine works is to call a preprocessor function to look for any -- options and, if found, mark anything found after them as a value. But this would mean that in the gzip -S -- --help example, where the correct getopt-mimicking behavior would be to print the help text, CommandLine will instead return an error saying that -S needed a value and didn't get one.
This corner case actually shows a fundamental difference between the behavior of CommandLine and the behavior of getopt. CommandLine uses a tokenizer to parse the command-line arguments and decide, based on the presence of - or -- at the front, to treat them as Name tokens or Value tokens. But if you read the getopt source code and figure out what it's actually doing, it's parsing one argument at a time, deciding whether that argument needs a value, and then if a value is needed, it swallows the next argument without further processing. Which is why you can pass -- as the suffix in gzip, and it will happily accept that.
What CommandLine should do
The tokenizer, instead of processing all the arguments at once and deciding whether they're names or values, should process each argument one at a time. Then the decision tree should look like:
* Is it -- and EnableDashDash is true? Then stop processing; the rest of the arguments are all values.
* Is it -- and EnableDashDash is false? Then it is the value --; continue processing the next argument.
* Does it start with -- and contain an equals sign? Then split it into two tokens: the part before the = is the name, and the part after the equals is the value. (Split at the first equals sign; any equals signs after that point would become part of the value.)
* Does it start with -- and not contain an equals sign? Then we look at the list of option longnames that the tokenizer was given:
  * AllowMultiple=true: this is a name token. Resume tokenizing with the next argument (it is NOT swallowed). (This allows for things like -v or --verbose to be passed multiple times, like -vvv, which the parser will turn into Verbose=3 in the final options instance.)
* Does it start with - and contain only letters that match shortnames? Split it into multiple shortnames. (I.e., -lR would become Name("l"), Name("R") if there are both -l and -R options.)
* Does it start with - and its first letter matches a shortname, but the rest does not? Split it into first letter & rest, and that's two tokens: Name(first letter) and Value(rest).
* Does it start with - and have only one letter? Then it's a shortname, and we look at the type of the option with that shortname:
Conclusion
Unfortunately, if the goal of getopt compatibility is to be achieved, a big rewrite of the guts of CommandLine's tokenizer and parser will be needed, so this is a big job. But if we want to mimic the behavior of getopt, then that's what will be needed. And the behavior I described above is how getopt works.
Also unfortunately, this is probably going to be a breaking change, so it might end up requiring a 3.0 version number. Because some people might be very surprised when --stringoption --booloption ends up being parsed with --booloption as the string value of --stringoption; they would probably have come to expect that to produce a MissingValueOptionError for --stringoption. But surprise or not, the correct way to handle that is for --booloption to be the string value of --stringoption in that example.
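The splitting rules from the decision tree under "What CommandLine should do" (split a long option at the first equals sign, split a cluster such as -lR into separate names, split a short option with attached text into name and value) can be sketched in Python as follows. This is not CLP's tokenizer: the function name split_argument and the token tuples are hypothetical, the caller is assumed to have handled the bare - and -- cases already, and the final fallback for unrecognized short text is an assumption.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]  # ("name", text) or ("value", text)


def split_argument(arg: str, shortnames: Set[str]) -> List[Token]:
    """Split one option-looking argument into name/value tokens.

    Precondition: arg starts with "-" and is neither "-" nor "--".
    """
    if arg.startswith("--"):
        # Long option: split at the FIRST '=' only; later '=' stay in the value.
        name, sep, value = arg[2:].partition("=")
        return [("name", name)] + ([("value", value)] if sep else [])
    letters = arg[1:]
    if all(ch in shortnames for ch in letters):
        # Cluster such as -lR or -vvv: every letter is its own name token.
        return [("name", ch) for ch in letters]
    if letters[0] in shortnames:
        # First letter is an option; the rest is its attached value.
        return [("name", letters[0]), ("value", letters[1:])]
    # Assumption: pass unknown text on as a name and let the parser report it.
    return [("name", letters)]
```

For example, --suffix=a=b becomes the name suffix with the value a=b, and -lR becomes two name tokens when both -l and -R are known shortnames.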