janisozaur / include-what-you-use

Automatically exported from code.google.com/p/include-what-you-use
Other
0 stars 0 forks source link

getopt_long for Windows #52

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
In issue #20 I added support for getopt short options for Windows. In order to 
get the tests running under Windows I need support for the longopt options too.

I've found an implementation of getopt_long which seems suit our needs in 
PostgreSQL:

http://www.google.com/codesearch#NNmaNPDfY7U/src/port/getopt_long.c

I believe it has a compatible license:

http://wiki.postgresql.org/wiki/FAQ#What_is_the_license_of_PostgreSQL.3F

I've attached a patch which makes use of this implementation. Provided that 
there are no problems with the PostgreSQL license, would it be ok to move the 
Windows implementation over to this version?

Cheers,
Paul

Original issue reported on code.google.com by paul.hol...@gmail.com on 16 Jul 2011 at 2:31

Attachments:

GoogleCodeExporter commented 9 years ago
I can ask our lawyer-folk if the license is compatible, but wouldn't it be 
easier to just change the tests to always use the short form of the flags?  I 
have no problem with that.  Just throw in a comment saying why.

Original comment by csilv...@gmail.com on 18 Jul 2011 at 11:43

GoogleCodeExporter commented 9 years ago
Any more news on this?

Original comment by csilv...@gmail.com on 25 Oct 2011 at 1:40

GoogleCodeExporter commented 9 years ago
I've built an original implementation of getopt_long (that is, I haven't peeked 
at any source code, only specifications of getopt and getopt_long), that I'd 
like to donate to IWYU.

However, I'd like to retain rights to use this outside of LLVM; does anybody 
know how that works? If I submit a patch to LLVM under the LLVM license, is the 
code still mine to use in another project with conflicting license?

If not, can we figure out a way to license my getopt_long implementation 
differently?

Thanks,
- Kim

Original comment by kim.gras...@gmail.com on 1 Jul 2012 at 7:30

GoogleCodeExporter commented 9 years ago
I've looked at LLVM Developer Policy [1] and it says:

> As a contributor to the project, this means that you (or your company) retain 
ownership of the code you contribute, that it cannot be used in a way that 
contradicts the license (which is a liberal BSD-style license), and that the 
license for your contributions won’t change without your approval in the 
future.

So I think the same applies to IWYU. The only problem I see if conflicting 
license contradicts University of Illinois/NCSA Open Source License [2].

Craig already asked and I am curious too. Wouldn't it be easier to just change 
the tests to always use the short form of the flags?

[1] http://llvm.org/docs/DeveloperPolicy.html
[2] http://www.opensource.org/licenses/UoI-NCSA.php

Original comment by vsap...@gmail.com on 1 Jul 2012 at 11:25

GoogleCodeExporter commented 9 years ago
It looks like I can use these snippets in other places if need be; that's good 
enough for me.

Yes, it would be easier, but I personally prefer longopts to shortopts, and I 
think it would be useful for IWYU to support the same command-line style for 
all platforms.

I'll prepare a patch tonight based on my getopt code, so you can review it in 
place.

Original comment by kim.gras...@gmail.com on 2 Jul 2012 at 1:17

GoogleCodeExporter commented 9 years ago
Here's a patch to provide a drop-in replacement for getopt_long as used by IWYU.

This allows run_iwyu_tests.py to run all tests on Windows (they fail, of 
course, but that's another story).

I maintain unit tests of the getopt(_long) implementation separately; I don't 
see any plain unit tests for IWYU, so I didn't think it made sense to submit 
them. I'll post this as a separate library on github at some point.

Original comment by kim.gras...@gmail.com on 2 Jul 2012 at 7:49

Attachments:

GoogleCodeExporter commented 9 years ago
Have you taken an existing header or created the new one according to 
documentation? It might be better to stick closer to existing headers, but I 
don't insist. I've attached getopt.h provided with Mac OS X 10.6.8 as an 
example.

Functions' signatures differ from signatures in documentation. Specifically, 
argv should be "char * const argv[]".

Code style should be consistent. Specifically, should be "char* c;", not "char 
*c;". Also curly braces should be like in the rest of source code, i.e.
> void foo() {
> }
A lot of lines aren't limited to 80 characters length.

Probably we can put #include <stddef.h>, #include <string.h> inside #if 
defined(_MSC_VER). Though I am not sure about #include "iwyu_getopt.h". I don't 
know what would be better here, just ask you to consider these alternatives.

Sentences in comments should end with full stop. I don't understand what 
brackets mean in
> /* [I]f a character is followed by a colon, the option takes an argument */

Overall implementation looks correct, but optind value in various error cases 
looks suspicious. Also opterr looks unused. I need more time to compare patch's 
behavior with system implementation's behavior.

Original comment by vsap...@gmail.com on 29 Jul 2012 at 11:18

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for comments, here's an updated patch;

* Conforming declarations
* 80-character line length
* All comments end with a full stop
* K&R brace style

Original comment by kim.gras...@gmail.com on 30 Jul 2012 at 9:48

Attachments:

GoogleCodeExporter commented 9 years ago
I've tested the patch on Mac OS X, which is based on BSD, so here are BSD man 
pages for getopt[1] and getopt_long[2]. I'll refer to them in some cases.

If option was recognized, in your implementation optopt is equal to 0. 
According to BSD
> The variable optopt saves the last known option character returned by 
getopt().
I haven't found anything about this case at [3] and don't insist on 
implementing such behavior. Though if it doesn't complicate implementation, it 
might be worth implementing. I hope I'll test this case on Linux.

When required short option argument is missing:
- optarg is empty string (expected NULL);
- optind isn't changed (expected to be incremented by 2).
> If the option was the last character in the string pointed to by an element 
of argv, then optarg shall contain the next element of argv, and optind shall 
be incremented by 2. If the resulting value of optind is greater than argc, 
this indicates a missing option-argument, and getopt() shall return an error 
indication.

Resetting variables at the beginning of getopt() should be performed in 
getopt_long() too. For example, when call getopt_long with argv ["this"], there 
is a stale optarg from previous getopt() invocation.

If unknown long option is encountered, optind should behave like for short 
options, i.e. optind is incremented.

If short option with optional argument hasn't argument, optcursor points to the 
wrong memory (beyond argv[1], for example).

For long option with extraneous argument expect '?' to be returned.
> [...]'?' for an ambiguous match or an extraneous parameter

Also when required long option argument is missing, you always return ':'. I 
prefer such behavior because according to BSD
> These functions [getopt_long(), getopt_long_only()] return `:' if there was a 
missing option argument, `?' if
> the user specified an unknown or ambiguous option, and -1 when the argu-
> ment list has been exhausted.

But [3] says that
> Error and -1 returns are the same as for getopt()
though I am not sure if optstring should be considered for long options. I 
really need to check this case on Linux. By the way, Mac OS X doesn't treat 
missing required long option argument as an error.

[1] 
http://www.freebsd.org/cgi/man.cgi?query=getopt&sektion=3&apropos=0&manpath=Free
BSD+9.0-RELEASE
[2] 
http://www.freebsd.org/cgi/man.cgi?query=getopt_long&sektion=3&apropos=0&manpath
=FreeBSD+9.0-RELEASE
[3] http://www.kernel.org/doc/man-pages/online/pages/man3/getopt.3.html

Original comment by vsap...@gmail.com on 5 Aug 2012 at 3:03

GoogleCodeExporter commented 9 years ago
Thank you so much for your comprehensive review!

> If option was recognized, in your implementation optopt is equal to 0. 
> According to BSD
> > The variable optopt saves the last known option character returned 
> by getopt().

Done.

> When required short option argument is missing:
> - optarg is empty string (expected NULL);

As far as I've been able to tell, the contents of optarg aren't specified when 
getopt() or getopt_long() returns an error. But it seems nicer to be 
consistent, so: done.

> - optind isn't changed (expected to be incremented by 2).

Oops, done.

> Resetting variables at the beginning of getopt() should be performed 
> in getopt_long() too. For example, when call getopt_long with argv 
> ["this"], there is a stale optarg from previous getopt() invocation.

Again, optarg is probably only valid if getopt_long() returns a character 
matching an option with arguments. But it feels funky, so: done.

> If unknown long option is encountered, optind should behave like for 
> short options, i.e. optind is incremented.

Done.

> If short option with optional argument hasn't argument, optcursor 
> points to the wrong memory (beyond argv[1], for example).

Ouch. Done.

> For long option with extraneous argument expect '?' to be returned.
> > [...]'?' for an ambiguous match or an extraneous parameter
> 
> Also when required long option argument is missing, you always return 
> ':'. I prefer such behavior because according to BSD
> > These functions [getopt_long(), getopt_long_only()] return `:' if 
> there was a missing option argument, `?' if
> > the user specified an unknown or ambiguous option, and -1 when the 
> argu-
> > ment list has been exhausted.
> 
> But [3] says that
> > Error and -1 returns are the same as for getopt()
> though I am not sure if optstring should be considered for long 
> options. I really need to check this case on Linux. By the way, Mac OS 
> X doesn't treat missing required long option argument as an error.

I'm not sure what you're saying here... I think this;

> [...]'?' for an ambiguous match or an extraneous parameter

is badly phrased and means "ambiguous match or unknown option", it makes more 
sense that way.

Can you clarify what you're seeing and what you'd like instead? Thanks!

I'll integrate the changes into IWYU and post a new patch.

Original comment by kim.gras...@gmail.com on 5 Aug 2012 at 4:55

GoogleCodeExporter commented 9 years ago
Here's an updated patch with the following addressed;

* optopt always contains the latest option character
* optarg set to NULL when required argument for short/long options missing
* optind incremented by two when required argument for short/long options 
missing
* getopt_long() always resets optarg to NULL
* optind incremented if long option is unknown
* optcursor overrun fixed
* Added link from iwyu_getopt.cc back to GitHub project for getopt_port, where 
there's tests

This is still lacking any action on your last points, I'm not exactly sure what 
you mean. 

Original comment by kim.gras...@gmail.com on 5 Aug 2012 at 6:24

Attachments:

GoogleCodeExporter commented 9 years ago
Also, quick question -- my implementation allows long options to pick their 
arguments from the next argv, e.g.

  foo.exe --bar baz

If --bar takes arguments, baz will be considered one. It doesn't look like GNU 
getopt_long does this, could you check BSD?

Original comment by kim.gras...@gmail.com on 5 Aug 2012 at 6:44

GoogleCodeExporter commented 9 years ago
Mac OS X picks required argument from the next argv.

By
> For long option with extraneous argument expect '?' to be returned.
I mean
  foo.exe --bar=baz
when {"bar", no_argument, NULL, 'b'}.

Original comment by vsap...@gmail.com on 5 Aug 2012 at 7:56

GoogleCodeExporter commented 9 years ago
> I mean
>  foo.exe --bar=baz
> when {"bar", no_argument, NULL, 'b'}.

Ah, I see. This has the same behavior on my Ubuntu, as far as I can see, so 
added.

New patch attached with this fixed and comments edited. I let the link to 
FreeBSD spec overflow the 80-character limit, I thought it was more important 
that it was a valid URL, than to stick to the col width. Feel free to break it 
up if you disagree.

Thanks!

Original comment by kim.gras...@gmail.com on 6 Aug 2012 at 5:15

Attachments:

GoogleCodeExporter commented 9 years ago
When unknown long option is encountered, expect optopt to be 0. Because 
non-zero optopt means that unknown short option is encountered.

When we have `foo.exe --bar` and bar has an optional argument actual optind 
value is 3, Mac OS X implementation returns 2. I don't insist on value 2, 
because 2 and 3 are both beyond argv. But it might be a sign of other problems. 
Will look at this more thoroughly later.

For long option with extraneous argument your implementation sets longindex to 
corresponding option. Mac OS X implementation doesn't touch longindex. Again, 
don't insist on changes because man page doesn't specify longindex value in 
case of error. It is your decision, for IWYU both variants are OK.

These problems were uncovered by my existing test suite. Will check later if 
need to test more cases.

Original comment by vsap...@gmail.com on 7 Aug 2012 at 7:49

GoogleCodeExporter commented 9 years ago
Thank you for all the work on reviewing!

Updated;
* getopt_long resets optopt to zero
* getopt_long optional arguments are now only recognized on the form 
"--opt=arg" not "--opt arg", which fixes your optind comment

Hopefully this is now good enough for inclusion.

Thanks,
- Kim

Original comment by kim.gras...@gmail.com on 9 Aug 2012 at 8:14

Attachments:

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r369.

Original comment by vsap...@gmail.com on 11 Aug 2012 at 6:15

GoogleCodeExporter commented 9 years ago
Thanks for patch. Committed it in r369 with tiny changes. I've converted 
iwyu_getopt.cc from Latin 1 to UTF-8. Is your name displayed correctly on 
Windows now?

Original comment by vsap...@gmail.com on 11 Aug 2012 at 6:19

GoogleCodeExporter commented 9 years ago
Thank you! Yes, "Gräsman" has been correctly UTF8-encoded, thanks for that.

- Kim

Original comment by kim.gras...@gmail.com on 12 Aug 2012 at 3:02