Open luator opened 5 months ago
To make sure I understand correctly: your idea is to replace

```python
params = cluster_utils.read_params_from_cmdline()
```

with something like

```python
params = cluster_utils.parse_args()
cluster_utils.register_with_server(params.server_info)
```

so that the user can potentially replace `cluster_utils.parse_args()` with their own function?
Basically yes. I think cluster_utils can offer a default commandline interface and parsing, but it should not be mandatory. Instead, we should define how cluster_utils passes the communication and job parameters to the user script, and the user can then build an interface around that if so desired.
A separate issue is how config/parameter files are parsed (which is where smart_settings/omegaconf comes in). Right now, this is entangled with commandline parsing, because the parameter dict that is passed to the user script can also contain smart_settings-specific structures. Instead, I think the parameter dict should just be plain JSON without any special features. The choice of config library then lies with the user (it is unclear whether cluster_utils should offer a default in this direction).
I have not fully thought that through, because I am also not familiar with all the use cases of how people use smart settings.
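For illustration, that decoupled flow in a user script could look roughly like this. This is only a sketch: `parse_args` and `register_with_server` are the hypothetical names floated above, not existing cluster_utils API, so both are stubbed out here.

```python
import argparse

def parse_args(argv=None):
    # Hypothetical default parser that cluster_utils could offer; the
    # user could swap this out for their own commandline handling.
    parser = argparse.ArgumentParser()
    parser.add_argument("--server-info", help="how to reach the cluster_utils server")
    parser.add_argument("--working-dir")
    return parser.parse_args(argv)

def register_with_server(server_info):
    # Hypothetical stand-in: report this job to the cluster_utils server.
    return f"registered with {server_info}"

params = parse_args(["--server-info", "127.0.0.1:5555", "--working-dir", "/tmp/run"])
status = register_with_server(params.server_info)
```

The point is only the ordering: parsing happens first, entirely under user control, and registration is an explicit, separate call.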
I would also like to suggest a change in the way the user script accepts parameters.
Right now, the script is called with

```
script.py <comm_params> <parameters>
```

where both `<comm_params>` and `<parameters>` are nested JSON dictionary structures, with `<comm_params>` having the entries `_id`, `ip` and `port`, and `<parameters>` having at least `working_dir` and `id`, plus further user parameters.
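For illustration, a job script under this current interface essentially has to deserialize both argv entries itself; a minimal sketch (not the actual cluster_utils code):

```python
import json

def parse_job_argv(argv):
    # Sketch of the current call convention:
    # script.py <comm_params> <parameters>, both nested JSON dicts.
    comm_params = json.loads(argv[1])   # entries: _id, ip, port
    parameters = json.loads(argv[2])    # at least working_dir and id
    return comm_params, parameters

comm, params = parse_job_argv([
    "script.py",
    '{"_id": 3, "ip": "127.0.0.1", "port": 5555}',
    '{"working_dir": "/tmp/run", "id": 3, "lr": 0.01}',
])
```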
Instead, I would suggest adding explicit named parameters for all of those, i.e.

```
--working-dir
--job-id
--cluster-utils-server-ip
--cluster-utils-server-port
```

and passing the user parameters as optional positional arguments afterwards.
The call to the script would thus look like

```
script.py \
    --working-dir <working_dir> \
    --job-id <id> \
    --cluster-utils-server-ip <ip> \
    --cluster-utils-server-port <port> \
    <user-arg1-path>=<user-arg1-value> \
    <user-arg2-path>=<user-arg2-value> \
    ...
```
This makes the job interface more explicit and changes the split from "server comm parameters" vs. "other parameters" into "cluster utils parameters" vs. "user parameters", which I think is more intuitive. It also removes the need for JSON parsing on the user side (but requires resolving the `<user-arg-path>`s, which are dot-separated paths into nested dictionary trees -- though this is already the case for our optional parameters now).
On the downside, it might make future API changes harder, as we would no longer have opaque "server_comm" and "parameters" structures that we can change on our side as we please.
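A sketch of how the proposed call could be parsed with `argparse`, including resolving the dot-separated user-arg paths into a nested dictionary (the argument names follow the suggestion above; none of this is implemented in cluster_utils yet):

```python
import argparse

def parse_proposed_cli(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--working-dir", required=True)
    parser.add_argument("--job-id", required=True, type=int)
    parser.add_argument("--cluster-utils-server-ip", required=True)
    parser.add_argument("--cluster-utils-server-port", required=True, type=int)
    # Remaining positionals: "<dot.separated.path>=<value>" user parameters.
    parser.add_argument("user_args", nargs="*")
    args = parser.parse_args(argv)

    user_params = {}
    for arg in args.user_args:
        path, _, value = arg.partition("=")
        node = user_params
        *parents, leaf = path.split(".")
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return args, user_params

args, user_params = parse_proposed_cli([
    "--working-dir", "/tmp/run", "--job-id", "3",
    "--cluster-utils-server-ip", "127.0.0.1",
    "--cluster-utils-server-port", "5555",
    "optimizer.lr=0.01", "model.depth=4",
])
# user_params == {"optimizer": {"lr": "0.01"}, "model": {"depth": "4"}}
```

Values arrive as strings here; whether and how they get typed (JSON-style literals, config-file types, etc.) would be a separate design decision.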
+1 for splitting the server info into separate arguments. I would suggest to combine ip and port to the common ":"-notation, though, so we save one argument.
I'm not sure I like having every setting as a separate argument, though. It would make the script call more readable, but the actual parsing of the arguments would get much more complicated, especially if users want to do the parsing on their own. It also wouldn't work well for nested parameters, e.g. `cluster_requirements`.
Ah, sorry, I think I misunderstood. You mean everything from the config file should go to the dot-separated user-args, i.e. only the arguments you explicitly listed above should be added?
> I would suggest to combine ip and port to the common ":"-notation, though, so we save one argument
+1
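Splitting such a combined value back into its parts is a one-liner on the receiving side; a sketch (hypothetical helper name):

```python
def split_server_address(address):
    # Split "host:port" on the last colon so the host part may itself
    # contain colons (e.g. an IPv6 address).
    host, port = address.rsplit(":", 1)
    return host, int(port)
```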
Should I directly add this to the ongoing argparse-PR?
If it's not too much trouble, why not
> You mean everything from the config file should go to the dot-separated user-args, i.e. only the arguments you explicitly listed above should be added?
Yes (although it is not everything from the config file, just the stuff from the "parameters" section). The ones I explicitly listed are required by cluster utils, everything else is user-specific.
I was wrongly thinking about the whole content of the config file that is passed to `grid_search`/`hp_optimization`, which I think would be too much to have as separate arguments. But in the context of the job script, I think this makes a lot of sense. It would also resolve the issue of potential name collisions, which are currently explicitly checked for.
The `--job-id` and `--cluster-utils-server` arguments will be added in #104. The rest I would tackle in a separate PR.
Ultimately, I would like to refactor the user-facing API such that the user has control over how the commandline is parsed. Parsing settings and communicating with the server is entangled in cluster utils (in `read_params_from_commandline`), because historically people liked having a package that does both things at the same time, but I don't think that this is necessary or good design. I would like to separate the two by 1) having something like a function `register_with_server(comm_info)` that the user can call after manually parsing the commandline, 2) offering an API method that provides pre-setup commandline parsing for convenience, 3) keeping `read_params_from_commandline` for the time being, but deprecating it.

In the redesign, it should also become clearer how to get rid of smart settings (or make it optional). Ultimately, I'm questioning whether we need to prescribe a settings library on the user side at all.
_Originally posted by @mseitzer in https://github.com/martius-lab/cluster_utils/pull/104#pullrequestreview-2085416577_
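Point 3 of that plan (keeping `read_params_from_commandline` but deprecating it) could look roughly like the following shim. Everything here is a hypothetical sketch: `parse_args` and `register_with_server` are stubbed, since neither exists in cluster_utils in this form.

```python
import warnings

def parse_args(argv=None):
    # Stub for the hypothetical new parser.
    return {"server_info": "127.0.0.1:5555", "working_dir": "/tmp/run"}

def register_with_server(server_info):
    # Stub for the hypothetical new registration call.
    pass

def read_params_from_commandline(argv=None):
    # Deprecated entry point kept for backwards compatibility: it still
    # parses and registers in one go, but steers users to the new split.
    warnings.warn(
        "read_params_from_commandline is deprecated; parse the commandline "
        "yourself and call register_with_server() instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    params = parse_args(argv)
    register_with_server(params["server_info"])
    return params
```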