Closed mcarans closed 1 year ago
I understand the problem (I think), but this seems rather tricky to fix (even if you override the formatting functions at https://docs.python.org/3/library/argparse.html#printing-help, you'd need to basically reimplement all the logic yourself, which seems quite a mouthful to do?). Or can you think of anything?
Could bind_known
return information on the arguments it has managed to bind (say parsed_args
or even help_string
) which could be passed to run
to add to the help output?
Alternatively, could run
accept an additional parameter, something like bind_known=projectmainfn
and then produce the combined help output?
In your suggested APIs, how do you propose to implement the help string generation?
I created a new function based on the 2 functions programmatically using makefun. The new function has the signature and docstring of the 2 functions and is used purely to check the binding. The only thing is I had to use the internal function defopt._parse_docstring
:
create_config_sig = signature(_create_configuration)
create_config_params = list(create_config_sig.parameters.values())
main_sig = signature(projectmainfn)
main_params = list(main_sig.parameters.values())
main_params.extend(create_config_params)
main_sig = main_sig.replace(parameters=main_params)
create_config_doc = getdoc(_create_configuration)
parsed_main_doc = defopt._parse_docstring(getdoc(projectmainfn))
main_doc = [f"{parsed_main_doc.first_line}\n\nArgs:"]
for param_name, param_info in parsed_main_doc.params.items():
main_doc.append(
f"\n {param_name} ({param_info.type}): {param_info.text}"
)
args_index = create_config_doc.index("Args:")
args_doc = create_config_doc[args_index + 5 :]
main_doc.append(args_doc)
main_doc = "".join(main_doc)
@with_signature(main_sig, func_name=projectmainfn.__name__)
def gen_func(*args, **kwargs):
"""docstring"""
return args, kwargs
gen_func.__doc__ = main_doc
argv = sys.argv[1:]
for key in kwargs:
name = f"--{key.replace('_', '-')}"
if name not in argv:
argv.append(name)
argv.append(kwargs[key])
defopt.bind(gen_func, argv=argv, cli_options="all")
func, argv = defopt.bind_known(projectmainfn, argv=argv, cli_options="all")
site_url = defopt.run(_create_configuration, argv=argv, cli_options="all")
func()
The above also allows addition of arguments passed to the function as **kwargs
to those from the command line sys.argv
.
I'm not really sure I want to implement something that complicated directly into defopt (e.g. I don't want to have to decide how to combine the first_lines of the two functions' docstrings, or all kinds of similar design questions), but probably a good idea would be to also expose the parsed function docstring in the enhanced callable returned by defopt.signature (e.g. as a .doc property on a Signature subclass, as I did for Parameter); this should allow you to do what you did but without going through private API. Thoughts?
Do you mean that these lines
main_sig = signature(projectmainfn)
parsed_main_doc = defopt._parse_docstring(getdoc(projectmainfn))
would become something like the lines below?
main_sig = defopt.signature(projectmainfn)
parsed_main_doc = main_sig.doc
If so, I think that would be helpful.
Yes. I probably won't be able to work on this in the coming few days, but will keep that on my todo.
It would be useful to also be able to call _parse_docstring
directly (eg. as parse_docstring
) although I understand if you'd prefer to only make the dosctring available through the signature
call.
I implemented and pushed defopt.signature(func).doc
as discussed above; let me know if that's good enough for now.
Thanks for this. I had a look at the code and it is the _parse_docstring
line that I need to be able to call directly as I use the .first_line
and .params
it returns. Would it be possible to make _parse_docstring
public?
Here is what the current code looks like - it has been through some changes to fix various problems with the solution I had originally posted above:
parsed_main_doc = defopt._parse_docstring(getdoc(projectmainfn))
main_doc = [f"{parsed_main_doc.first_line}\n\nArgs:"]
no_main_params = len(parsed_main_doc.params)
for param_name, param_info in parsed_main_doc.params.items():
main_doc.append(
f"\n {param_name} ({param_info.type}): {param_info.text}"
)
create_config_doc = getdoc(Configuration.create)
kwargs_index = create_config_doc.index("**kwargs")
kwargs_index = create_config_doc.index("\n", kwargs_index)
args_doc = create_config_doc[kwargs_index:]
main_doc.append(args_doc)
main_doc = "".join(main_doc)
main_sig = defopt.signature(projectmainfn)
param_names = list()
for param in main_sig.parameters.values():
param_names.append(str(param))
parsed_main_doc = defopt._parse_docstring(main_doc)
main_doc = [f"{parsed_main_doc.first_line}\n\nArgs:"]
count = 0
for param_name, param_info in parsed_main_doc.params.items():
param_type = param_info.type
if param_type == "dict":
continue
if count < no_main_params:
count += 1
else:
if param_type == "str":
param_type = "Optional[str]"
default = None
elif param_type == "bool":
default = False
else:
raise ValueError(
"Configuration.create has new parameter with unknown type!"
)
param_names.append(f"{param_name}: {param_type} = {default}")
main_doc.append(
f"\n {param_name} ({param_type}): {param_info.text}"
)
main_doc = "".join(main_doc)
projectmainname = projectmainfn.__name__
main_sig = f"{projectmainname}({','.join(param_names)})"
argv = sys.argv[1:]
for key in kwargs:
name = f"--{key.replace('_', '-')}"
if name not in argv:
argv.append(name)
argv.append(kwargs[key])
@with_signature(main_sig)
def gen_func(*args, **kwargs):
"""docstring"""
logger.info(f"> Using HDX Python API Library {__version__}")
gen_func.__doc__ = main_doc
configuration_create = defopt.bind(gen_func, argv=argv, cli_options="all")
main_func, _ = defopt.bind_known(
projectmainfn, argv=argv, cli_options="all"
)
...
main_func()
I am a bit reluctant to turn first_line
into public API because I think I would actually rather get rid of it even internally and simply replace it by text.split("\n\n")[0]
, which should be equivalent.
As for accessing params
, doesn't signature(...).params
provide what you need? You can get e.g. signature(...).params[...].doc
and signature(...).params[...].annotation
, the only difference being that the latter is a realy python type object rather than a str, but in your use cases this should work just as well? (I didn't actually check by adjusting your entire code to what I propose, to be fair.)
The issue is that I need to parse each possible kwarg from a docstring that looks like this:
class Configuration(UserDict):
"""Configuration for HDX
Args:
**kwargs: See below
user_agent (str): User agent string. HDXPythonLibrary/X.X.X- is prefixed. Must be supplied if remoteckan is not.
user_agent_config_yaml (str): Path to YAML user agent configuration. Ignored if user_agent supplied. Defaults to ~/.useragent.yml.
...
def __init__(self, **kwargs: Any) -> None:
super().__init__()
From the above I need to parse user_agent
and user_agent_config_yaml
(and the long list of possible arguments that are allowed as kwargs that follows). What I do is add everything below **kwargs
as docstring parameters after my other functions docstring parameters, then use _parse_docstring
to parse the resulting combined docstring. This wouldn't work with signature because the **kwargs
are treated as a single parameter and signature.doc ignores all the stuff below the **kwargs
line.
I'm fine with losing first_line
but can't think of a way to avoid using _parse_docstring
(which is a really useful function) to parse my constructed docstring to get the parameters.
OK, I see, this seems reasonable. (Sorry for being difficult, but I try to be conservative with adding new APIs.)
One thing I am not so happy with in _parse_docstring is the custom (_Doc) return type, which I don't really want to document. However I think I could actually get rid of it and just return another defopt.Signature instance (corresponding to whatever is documented). In fact this suggests expanding defopt.signature() to support also taking a docstring (instead of a callable) as argument: when a callable without docstring is passed, it computes the signature from the callable; when a standalone docstring is passed, it computes the signature from the docstring; when a callable with a docstring is passed, it computes both, which have to be compatible.
Thoughts?
That sounds like a good idea. For each parameter, I need to be able to get the parameter name, text description and type. For the parameter type, I need to be able to to determine if it is a "dict", "str" or "bool" for the time being.
OK, can you check how HEAD looks to you? Note the API changes noted in the changelog.
That's great! It's a nice addition to defopt. It looks like it is working. The code now looks like this:
parsed_main_doc = defopt.signature(getdoc(projectmainfn))
main_doc = [f"{parsed_main_doc.doc}Args:"]
params = parsed_main_doc.parameters
no_main_params = len(params)
for param_name, param_info in params.items():
main_doc.append(
f"\n {param_name} ({param_info.annotation}): {param_info.doc}"
)
create_config_doc = getdoc(Configuration.create)
kwargs_index = create_config_doc.index("**kwargs")
kwargs_index = create_config_doc.index("\n", kwargs_index)
args_doc = create_config_doc[kwargs_index:]
main_doc.append(args_doc)
main_doc = "".join(main_doc)
main_sig = defopt.signature(projectmainfn)
param_names = list()
for param in main_sig.parameters.values():
param_names.append(str(param))
parsed_main_doc = defopt.signature(main_doc)
main_doc = [f"{parsed_main_doc.doc}Args:"]
count = 0
for param_name, param_info in parsed_main_doc.parameters.items():
param_type = param_info.annotation
if param_type == "dict":
continue
if count < no_main_params:
count += 1
else:
if param_type == "str":
param_type = "Optional[str]"
default = None
elif param_type == "bool":
default = False
else:
raise ValueError(
"Configuration.create has new parameter with unknown type!"
)
param_names.append(f"{param_name}: {param_type} = {default}")
main_doc.append(
f"\n {param_name} ({param_type}): {param_info.doc}"
)
main_doc = "".join(main_doc)
projectmainname = projectmainfn.__name__
main_sig = f"{projectmainname}({','.join(param_names)})"
argv = sys.argv[1:]
for key in kwargs:
name = f"--{key.replace('_', '-')}"
if name not in argv:
argv.append(name)
argv.append(kwargs[key])
@with_signature(main_sig)
def gen_func(*args, **kwargs):
"""docstring"""
site_url = Configuration._create(*args, **kwargs)
logger.info("--------------------------------------------------")
logger.info(f"> Using HDX Python API Library {__version__}")
logger.info(f"> HDX Site: {site_url}")
gen_func.__doc__ = main_doc
configuration_create = defopt.bind(gen_func, argv=argv, cli_options="all")
main_func, _ = defopt.bind_known(
projectmainfn, argv=argv, cli_options="all"
)
...
main_func()
OK, let's close this then. Thanks for the API suggestions and discussion.
I previously wrote about consuming command line arguments in 2 function calls (https://github.com/anntzer/defopt/issues/107) and you kindly added functionality for that.
I have noticed one issue. When a wrong command line parameter is added and the help text is displayed, it only shows the parameters from the function given in defopt.run. It needs to show the combination of the parameters from
defopt.bind_known
anddefopt.run
eg. fromprojectmainfn
and_create_configuration
in the example below.How could this be achieved?