[TODO] cligen calls initializer twice

timotheecour commented 5 years ago

#[
KEY cligen
cligen calls initializer twice
]#

import cligen

proc getFoo(): auto =
  echo "called getFoo"
  result = 1

proc main(foo = getFoo())=
  discard

dispatch(main)

c-blake commented 5 years ago

Yeah. If you nim c -d:printDispatch you can see why. We need a var fooParamDispatch and var fooParamDefault. We need one for the default value and the other to pass to argParse. I may be able to initialize one var from the other, but I'll have to look into how to code is generated to see how hard that will be.

timotheecour commented 5 years ago

actually i had looked at the generated code and figured this bug would occur, then i made this test case;

btw: when ./main --foo:3 is provided explicitly, should getFoo even get called at all? IMO it shouldn't

c-blake commented 5 years ago

Probably whatever would happen with a regular Nim proc when the caller provides an argument is what should happen for consistency with any API designs. I don't know what that is offhand.

c-blake commented 5 years ago

It looks like Nim makes no call if the caller provides an argument. That is actually kind of weird/unexpected, IMO, but I can see why it might be true and/or convenient either way. I wrote/initially tested all this stuff with literal initializers in mind. So, there may be other issues like this.

c-blake commented 5 years ago

It may be hard/a lot of work remove the requirement of calling it at least once, though. We need the type to call the right argHelp for a given param even when no dispatch ever happens and no argument is ever parsed. The easy way to get that was to declare a variable just like the initializer in the call expression. Maybe there is some convenience API to get it?

c-blake commented 5 years ago

We might be able to just generate a var dummyVar: typeof(initializer RHS) or would it be type()?. If that could work we can migrate the initialization downward and do the bookkeeping to not evaluate unless we need it, but that still won't really achieve "only ever calling it once when its output will really be used" in "some strict factory with persistent side-effects" kind of way. We still want a value to display in the default column of the help table.

There might be a way to address this with some new dispatch alist/table parameter specifying the whole default string for any given parameter, but we would still need the CLI author to provide us that string, or at least a flag to tell us not to induce evaluation for the purposes of even just help and just put a ? in the help column, but a dflVal[param]='?' is as easy/hard as any dflVal[param]=false or something. So, I'd probably just go with the provide a string to block the eval inducement. The added bonus is that the same escape hatch lets CLI authors put in a better spelling of the default value on a per-param basis with no argHelp overload needed.

One might also just mandate for this rare case that (after we arranged for bookkeeping as per above) a user define their own argHelp that checks the ArgcvtParams.parNm and maybe we consult a string they return before getting a default value.

I think the vast majority of cases in practice don't have persistent side effects to such evaluation, even when they're not statically analyzable to be determined at compile-time. So, even if we could do that static analysis (directly or indirectly with static tricks) which probably has a lot of gotchas, I don't think it's great ergonomics to force all the simple use cases to provide strings. I think it's better in this case to document the limitation and provide an escape hatch workaround. I mean, it's not like we can support every proc (no generics), iterator, etc., etc. either. This particular escape hatch also has the added benefit of more easily tailorable "default value formatting".

Re-opening to remind myself of this idea. Let me know if you disagree with my ergonomics/practicality vs. API correspondence purity ideas.

c-blake commented 5 years ago

So, I sat down to do this and realized one unpleasant implication: doing this requires changing the signature for argParse which is a breaking change for anyone with their own types.

There are a few ways we could go. Currently, argParse takes an empty var, the default value, and an input/output ArgvcvtParams. The initializer for the default has to be evaluated to get that value even though we are just about to parse a string into a hopefully valid value and not use the default. We could either switch to a default value string or drop that default value parameter entirely, or both - drop it as a parameter and include it in ArgcvtParams as a string.

Right now, I only use dfl in the argParse(var bool) to make bool with no argument reverse the default which is the only thing that really makes sense (to me) for a true-defaulted bool which relates back to one of your first issues https://github.com/c-blake/cligen/issues/16. It would be pretty easy there to have that check ap.dfl == "true" (& etc.) instead of just dfl, but I wonder if anyone has used this feature in their own types. In general, it makes sense to have behaviors "relative to the default" for argument parsing be at least possible. { Indeed, it might even make sense to squirrel away a copy of the setByParse pointer in ArgcvtParams. Then an argParse for a given type would have maximum context - the static context of default values and the dynamic context of what has been parsed so far. Well, I suppose "maximum" might include "all that is yet to come", but that would require two passes over the input seq[string], but I digress.}

A pre-parsed, typed Nim value is obviously easier to deal with than parsing the default string (which in this idea, recall, would be CLI author-overridable via some new helpDefault[paramNm]) as well as the CLI user-provided input. So, what we have now is a slight conflict between two rare cases - initializer calls that are expensive or have side-effects and the other rare case of convenience of context-sensitive argParse behavior.

If this were all there were to it, I'd lean toward fidelity to what API/Nim-callability would do which argues against relative-to-default argParse convenience. The CLI/argParse author is already parsing less-reliable CLI user input already. So, some CLI author-provided string in the very same proc seems a light extra burden. But what about when the CLI author does not provide a helpDefault[paramNm] string? Then we have no value to put in ap.dflString. So, I guess we evaluate the RHS and just rely on $ being defined for the type in that case. But that's more extra burden for custom type CLI authors, but probably one they're willing to pay or paying already. They need an argHelp anyhow, so $ is just like that.

Anyway, various competing concerns. Before executing on this, I thought I would give you a chance to offer feedback about all this. It kind of has an added bonus of the easy case argParse being slightly simpler to just take var T, ArgcvtParams.

c-blake commented 5 years ago

Another wrinkle in this is that if string becomes the Nim type for "default values" then argHelp would no longer have a typed argument upon which to select the override. We might still be able to call argHelp[theType](), but that could confuse custom type users (or break).

Or we could break-up the functionality of argHelp since we'd be mandating users define (or re-define) $ for custom types. Right now argHelp lets users control the rendering in generated help for A) option keys (e.g. bool have no = even though bool can accept them, but redundant cues can be easier on frustrated CLI users), B) the "CLI type" which is mostly just the Nim type but might be spelled more shell-user-friendly, and C) the default value string which is a recapitulation of $. I imagine the C) part is the main thing CLI authors would want to change. So, we could provide separate hooks/override possibilities for rarely changed A & B and the release notes could say "You don't need to define argHelp anymore, just define $ if Nim doesn't for you or you don't like its version".

Anyway, even providing a workaround to not always evaluate the RHS once per invocation is proving to be a very invasive/breaking change on the whole type extension setup. I have to ask, did you actually need this property for something real or did it just surprise you when debugging?

Like I said above, I feel its need seems rare enough that by default we should not demand helpDefault[paramNm] for every param just on the grounds that CLI author ergonomics beats API fidelity. The higher API fidelity default would make the minimal invocation dispatch(foo, helpDefault = { "every": "1", "darn": "2", "param": "3" }) which definitely puts too big a dent in the cligen sales pitch, IMO.

So, given that the default behavior will be to evaluate RHS once instead of zero just to get a help string (which is needed for both default-relative parsing like bool as well as for --help display for anything), another workaround for this whole issue is to do nothing at all but rather mandate CLI authors in this rare situation write a wrapper proc whose default value is something without side-effects in need of control that can then just call through to the API proc as in

proc api(x = TheNuclearOption()) = discard
import cligen
proc apiWrap(x=0) = api(x)
dispatch(apiWrap, help={"x": "File bankruptcy Chapter X"})

(supposing TheNuclearOption returns int). I mean, presumably people wrapping APIs with persistent side-effects from API default values have to be in a "be very careful" mindset already. It's already an "unsafe-ish API".

c-blake commented 5 years ago

It's hard to justify a breaking re-arrangement of argParse/argHelp to achieve an only partial solution based on a "btw, IMO" with no other feedback. The "twice" aspect of this issue has been solved since the day it was opened. So, I'm closing it. We may need to document this limitation/the workaround somewhere.

c-blake / cligen

[TODO] cligen calls initializer twice #81