c-blake / cligen

Nim library to infer/generate command-line-interfaces / option / argument parsing; Docs at
https://c-blake.github.io/cligen/
ISC License
509 stars 24 forks source link

Fix GcUnsafe warnings #106

Closed pb-cdunn closed 5 years ago

pb-cdunn commented 5 years ago

Fir GcUnsafe warnings by avoiding the clCfg global. Because it has a seq, it uses GC.

The tests use the global clCfg (for now), but nothing else should.

resolves #4

pb-cdunn commented 5 years ago

I was just looking at #105. Hmmm. I see the problem. clCfg is meant to be a global variable modified by the user.

I'm not sure that's a good model. What if the default ClCfg value is nil, and defaultClCfg() is called whenever needed?

proc dispatch(..., cfg: ClCfg = nil) =
  if cfg == nil:
    cfg = defaultClCfg()
  dispatchGen(cfg, ...)
c-blake commented 5 years ago

First, I am happy to take whatever pragma measures to silence the warning.

Personally, I think this is the way to go, as I believe this to be a spurious/false positive warning. It's issued when nothing is modifying or reading the variable is happening from multiple threads and even if the global var is changed to let. Such a warning issued even with no modifications to a global variable with the same lifetime as the program and even with --threads:off..come on. Araq's answer would probably be "Wait for owned and then everything will be better". Another answer is that a "warning is not an error, doubly so for a verbosity:2 warning".

That said, I was able to silence the warning doing this:

import cligen
proc demo() = discard
proc foo()=
  var cf = clCfg
  cf.version = "1.5"
  dispatch(demo, cmdName="Version", cf=cf)
{.push warning[GCUnsafe]: off.}
foo()
{.pop.}

I do realize that's a high complexity burden to impose on verbosity:2 CLI authors like you to get a clean compile, but on the other hand such authors might be used to jumping through such hoops for clean compiles.

Any alternative seems to be impose a somewhat lesser burden on verbosity:<2 CLI authors to get any working compile at all while adjusting program global settings. Rather than a defaultClCfg as this PR does, it does seem to work to have a const clCflg = ClCfg(..) and make the CLI author say:

proc foo() =
  var cf = clCfg; cf.version = "1.5"
  dispatch(..., cf=cf)
foo()

or in a dispatchMulti context repeat that cf=cf keyword argument on all subcommand dispatch bracket lists. It seems like the current proposal of this PR would be to have dispatchMulti people wanting to set a program global setting like version define their own proc defaultClCfg between import cligen and dispatchMulti?

So, I'm not sure what move is best here. I like to keep the "for dummies" usage mode of this stuff syntactically simple. So, to be honest I think keeping it as is and explaining how to silence the warning may be the most balanced answer across my admittedly small known&noisy user base. I think we're going to get Nim-0.20 by the end of Summer and the whole story may change with owned. I could see that making a particularly large difference for whole program lifetime global variables.

c-blake commented 5 years ago

Another idea might be to put cligen/oldAPI.nim:dispatch2 inside a function and have it call said function with the hint[GcUnsafe]:off as above and suggest verbosity:2 folks use dispatch2 until Nim-0.20? I think that could be made to work.

c-blake commented 5 years ago

This also compiles cleanly at verbosity:2 with no pragmas:

import cligen
proc demo() = discard
proc defCall(cf: ClCfg) =
  var c = cf
  c.version = "1.5"
  dispatch(demo, cmdName="Version", cf=c)
defCall(clCfg)

This is also what cligen/oldAPI.nim:dispatch2() now does after my most recent push. So, "just use dispatch2" is now another answer if you find the above construction onerous. I can release a new version soon if you don't like depending on cligen#HEAD.

EDIT: fixed spelling; Also if you're ok with this solution feel free to close the PR

c-blake commented 5 years ago

Oh, and one might be able to make a wrapper template called like this:

dispatch3(c, demo, help={...}):
  c.version = "1.5"

that would expand to the above. I'm not sure I'd want to make people not setting any global ClCfg settings have to do the :discard part of dispatch3(procnm): discard, but there may be some way around that, too.

pb-cdunn commented 5 years ago

Using your latest code (7c36425):

pb.nim(12, 11) template/generic instantiation of `dispatch` from here
../repos/cligen/cligen.nim(598, 14) template/generic instantiation of `dispatchGen` from here
../repos/cligen/cligen.nim(538, 13) Warning: not GC-safe: 'parser(cmdline)' [GcUnsafe]
        parser()
              ^
../repos/cligen/cligen.nim(589, 31) Warning: not GC-safe: 'dispatchdataset(mergeParams([""], commandLineParams()), clUsage, "", false)' [GcUnsafe]
    quote do: cligenQuit(`disNm`(mergeParams(`mergeNms`, `cmdLine`)),
                                ^

This happens from:

proc dataset(extras: seq[string]) =
  echo "pb dataset"

when isMainModule:
  import cligen
  dispatch(
        dataset, help = {},
  )

First, I am happy to take whatever pragma measures to silence the warning.

I tried that in several ways, with no success. Of course the following works:

import cligen
proc demo() = discard
proc foo()=
  var cf = clCfg
  cf.version = "1.5"
  dispatch(demo, cmdName="Version", cf=cf)
{.push warning[GCUnsafe]: off.}
foo()
{.pop.}

But then I've suppressed a useful warning for my code!

I have to disagree with your reasoning. This isn't simply the existence of a global variable and the compiler not being sure whether it's used. The problem is that your global variable has garbage-collected fields, so the compiler cannot be sure when garbage-collection occurs. In my opinion, this is your bug. In general, you should not use global variables, and in particular you should not use garbage-collected global variables. "Let's wait for the compiler to be re-written" is not a good solution.

I like to keep the "for dummies" usage mode of this stuff syntactically simple.

I completely agree with that, but I am not doing anything complicated! I do not want to override the default config, and in my opinion only people who do override the default should need to do anything special. So the answer to me is obvious:

I don't know how to explain this better. Unfortunately, when I make ClCfg a ref object and do this:

type
  ClCfg* = ref object
    ...

template dispatch*(pro: typed{nkSym}, cmdName="", doc="", help: typed={},
 short:typed={},usage=clUsage, cf:ClCfg = nil,

I get this weird error:

pb.nim(12, 11) template/generic instantiation of `dispatch` from here
../repos/cligen/cligen.nim(623, 43) Error: 'nil' cannot be assigned to
   short:typed={},usage=clUsage, cf:ClCfg = nil, echoResult=false,noAutoEcho=false,
                                            ^

Any idea what that means? I cannot reproduce it with a simple template. Maybe I have a typo?

type
  Bar = ref object
    z: int

template foo(x: int=8, y:Bar=nil) =
  echo "HI"

foo()

That works fine, so I don't understand "nil cannot be assigned to".

pb-cdunn commented 5 years ago

Oh, I see:

type
  Bar = ref object
    z: int
var gg = Bar(z:5)
template foo(x: int=8, y:Bar=nil) =
  y = gg
  echo "HI"

foo()
foo.nim(9, 4) template/generic instantiation of `foo` from here
foo.nim(5, 30) Error: 'nil' cannot be assigned to

So we cannot use "nil" as a default to the template. Hmmm....

pb-cdunn commented 5 years ago

I guess the standard dispatch() should not accept a ClCfg argument at all. Another template can be used by folks who need to provide their own ClCfg. That can avoid the need for a global clCfg.

c-blake commented 5 years ago

I never use ref. So, I'm not sure what was going on there, but not being able to have a nil ref does make some sense.

so the compiler cannot be sure when garbage-collection occurs. In my opinion, this is your bug. In general, you should not use global variables, and in particular you should not use garbage-collected global variables.

You may want to focus on where we agree further down, as I am not saying you can't have what you want. That said, I have two rebuttals.

I don't see how the indeterminacy garbage collection timing creates any real bug here. Maybe you could be more specific? You're calling it my bug with italics - so, I'm trying to understand what that bug might be. I fundamentally disagree with the idea that "any warning is a bug". I consider that idea far too naive about the limitations of warning systems. Do you think my code accesses stale values after being released? If it were multi-threaded then maybe that could happen, but my code is not MT and this warning manifests even without --threads:on. Maybe this is about the "order of destruction of global objects" at program termination time? That shouldn't matter since none of my code accesses that stuff after dispatching to CLI author code. Even if dispatch didn't invoke cligenQuit and quit/echo result (never to look at cf again) and some client code just did dispatchGen in a loop constantly changing clCfg, I still don't see any use after free problem, or even uncollectable garbage generation problem. Below I exhibit example code that does not warn that uses parameter passing rather than let to declare "const from this point onward".

I also disagree that one "should" not use global variables to specify global, life-of-program configuration for the inherently single-threaded "entry" to a program. To me, that's what globals are for, if they're in the programming language at all. This clCfg global and all it contains is, in effect, read-only static data just as soon as the user is done initializing it, before any entry into non-CLI author code and the CLI-part is single-threaded. Read-only global data after setup before multi-threading is AOk to me. That's all this is. If we could mark it "const from here on out" with a let (or even mprotect() it read-only) that would be fine with me. The analyzer fails there, too, though. Maybe current Nim cannot figure it out, but that doesn't mean the condition of the world isn't sanitary.

On to stuff we disagree about less..It is true that to silence this warning (what I am calling "a false positive due to reasonable use of a global" until you persuade me otherwise) the CLI author needs to do something special even if they don't adjust any settings. I agree that is undesirable (though I also think a better Nim analyzer is even more desirable than this is undesirable).

So, maybe until we have a better memory analysis we need a dispatchCf and the main dispatch should do the proc wrapping automagically. I get a no warning compile even with --threads:on for this:

import cligen         #No warning but we do update `clCfg`.
clCfg.version = "hi"  #So, either GC analyzer has false neg
proc demo() = discard #..here OR `let` version has false pos.
proc defCall(cf: ClCfg) = dispatch(demo, cf=cf)
defCall(clCfg)

and --version even spits out hi to boot!

Also, I'm pretty sure (you could double check the Nim manual if you want) that the above version is at least supposed to be semantically identical to let cpy=clCfg; dispatch(cf=cpy). However, if you try that let syntax you still get the GC warning. Maybe dispatch being is a template, not a proc, makes for weaker promises, though said template never ever alters cf -- cf:ClCfg=clCfg does not need to be cf: var ClCfg=clCfg, even if somehow being a template makes write access possible in principle.

Anyway, with this code pair example (ok, I left chaning to let up to the reader), maybe I've convinced you it really is a false positive warning and we have only a semantic disagreement about whether a falsely unclean compile constitutes "a 'bug' of cligen's" as opposed to a buggy warning. The proc form just makes it easier for the compiler to prove to itself there's no illicit access (however it is trying to do that) than the let form. Even putting the whole thing inside a proc with the let fails. I think there's just something lacking in that GC analysis (something lacking that I completely agree you have good reason to want, but I'm not the person to ask for it).

Anyway, in light of the above code snippet compiling cleanly for me (on nim-devel, nim-0.19.2, 0.19.4, 0.19.6 in both C and C++ backends), we could rename dispatch -> dispatchCf and create a new dispatch that takes all the same params except cf and expands to like the above (but doesn't import, tweak clCfg, or define demo, obv.) and passes all the params through with the cf=cf.

I think people just setting clCfg wouldn't even notice any changes, only cf=myCfg passers would . Such passers A) have to be very new since that only just appeared and B) would now have to switch to dispatchCf. As far as I know, besides me @kaushalmodi and @ SolitudeSF are the only users of version. They just set them through the global var clCfg. From the above, that looks like it would continue to work and they wouldn't have to do anything. If the GC analysis ever works better someday, then we can ditch the wrapper and alias template dispatchCf = dispatch forever.

Does that sound acceptable? It seems pretty close to just an expanded/almost tested version of your proposal.

pb-cdunn commented 5 years ago

I'm only saying it's your bug because you cannot control the garbage-collector. Even though you are not actually using the global, a problem could occur at program exit, for example, in theory. I'm just trusting the compiler warning. But you're right; you're not actually using the global dangerously, just to be clear.

Reading your solution now ...

pb-cdunn commented 5 years ago

Yes, I think all that makes sense.

c-blake commented 5 years ago

Ok. Apologies if I was too defensive about the bug claim.

Just to clear up some things because you seem to have some confusion, clCfg is not a ref - it is an allocated object. It cannot be nil itself. nil has also become an invalid value for any of its fields...string, seq, set[char]. I don't know that the struct holding clCfg ever gets collected or even really managed in a GC way. That struct could be allocated by the C compiler into a global then allocated by the linker in a segment of the executable, where ultimately the OS reclaims the pages (hyper efficiently whole-address-space-at-once-even-if-program-was-swapped-out).

Some fields of clCfg might be. E.g. the empty string "" for version should get eventually released after someone sets a new version. Whenever the new value of version gets collected, it's certainly after cligen code is done with the field. It should be in the equivalent of "post main() clean-up"...If you've ever used atexit handlers in C or destructors on global objects in C++. So the new version should be uncollected until not only after cligen references it, but even until after any code cligen dispatches to references it.

Part of why I did a global was so that cligen clients could do:

cfCfg.version = "whatever"
def version(): echo cfCfg.version
dispatchMulti(...., [version])

and then have every subcommand get --version and get a version subcommand all just referencing back to the same string easily.

I'm even considering a new clCfg.versionAuto variable (by default set to true, even!) that will do that automatically whenever cf.version != ""..essentially giving a generated version subcommand like the current generated help subcommand.

Could I have done it without globals? Sure, but this really seems to me a textbook case of what globals are for. Set/changed once, up-front (time-wise), then read-only afterwards they make some things very easy and even cleaner to me. Now dynamic global data in multi-threaded programs - that can create hard to reason about situations which should not be done unnecessarily. That really is different, though.

I understand that there's this "compiler knows better than I do" mentality out there in the world, but I think, at least for the present nim compiler Re: GC warnings about globals, that is just false hope. I'm not trying to squash your hope more generally or even that hope permanently. :-) I think we both have high hopes for owned maybe making things like the let analysis succeed for this example the way the proc parameter passing analysis does already.

Anyway, I understand that using

proc f(cf: ClCfg) = dispatch(TheProc, cf=cf, otherParams..)
f(clCfg)

instead of dispatch(TheProc, otherParams...) just to silence warnings is a hassle and I will try to fix that. But if you're in a rush you can do that right now without me touching anything (and without silencing warnings about your own code) or you could call dispatch2 (which works for me in devel/c backend, but hasn't been tested as much as the above f(clCfg) snippet in terms of warning silence).

pb-cdunn commented 5 years ago

It's a good library. You should be proud of it!

And no, I'm not in a rush. This was actually the last of the warnings I've been squashing. The Nim compiler continues to evolve.

There is one other large set of warnings that popped up recently -- possibly related to dispatchMulti() -- relating specifically to thread-safety and occurring within the Nim thread library, but I won't bother you with those unless I discover the actual cause, which could be anything.

c-blake commented 5 years ago

For me, 4 versions of nim (0.19.[246] & devel) all compile test/FullyAutoMulti.nim at verbosity:2 with no warnings except innocuous ones about the stdlib critbits.nim having a ch inside a while shadow a ch outside the while. (critbits.nim does have a bug I haven't reported my fix for yet, but it's not that one!) That warning went away in nim-devel due to the loop being put in a block blockX: to break early.

Anyway, I just pushed a fix for this thread's issue. Let me know if you have any trouble with it. It seems to work in all my tests.

c-blake commented 5 years ago

Actually, it seems the critbits.nim code did not change in any important way and the warning went away due to someone deciding to not issue it. Sometimes Araq says such shadowing is idiomatic Nim, but compilers for other languages often do often warn. And just for the record you can see the actual bug by compiling test/FullyAutoMulti in debug/non-release mode and running test/FullyAutoMulti helpp. The extra p causes an out of bounds array access. I had a patch for that, but lost it. It's a pretty obvious bug, though.