Closed albertz closed 8 months ago
Another aspect:
Sometimes I want to edit those env vars locally in a job. E.g. a typical example is an env var like PYTORCH_CUDA_ALLOC_CONF
where I sometimes need to play around with after it crashed with OOM, and then resubmit the job with a slightly changed setting.
This should be simple. Currently, as I understand, this is stored inside the pickled Job
object, in job.save
, so this is not simple. (Can I actually delete the job.save
and it would simply recreate it?)
So, maybe this means, using job._sis_environment
for this is not a good idea, and we should have a separate pure text file which is easily editable?
@michelwi @curufinwe @critias comments, suggestions?
Sorry, that I missed to respond to this.
job.save
is updated if the job is submitted and the manager was restarted. So using Job._sis_environment
should work.
Adding something like putenv
is a good idea, but given the other methods are named like set_rqmt
, set_keep_value
, or set_defaults
I think set_env
would be a better name.
I forgot to respond about the os.environ[k] = string.Template(v).substitute(orig_env)
issue.
Side note for everyone that doesn't know what this line does: It basically emulates the bash $
variable replacement syntax. Meaning something like {"PATH": "foobar:$PATH"}
works like export PATH=foobar:$PATH"
in bash.
I get your point that this might be unexpected, but breaking this now could cause setups to fail. I doubt it's worth changing it, but documenting it better would be a good idea.
One reason I went with the startup hook is that it is more flexible and allows maybe to have other logic in there, e.g. easily to do sth like string.Template.substitute
, but maybe other other things, and the user can easily edit it, without the need to restart the Sisyphus manager.
But maybe that flexibility and the easy possibility to edit it are not needed here, so better use the more simple and specialized solution? I updated the PR accordingly.
breaking [
string.Template(v).substitute
usage] now could cause setups to fail
I'm not breaking it. My current solution is completely orthogonal to the existing (optional) _sis_environment
, where you still have that string.Template.substitute
.
I did not fully went with the solution that I outlined here (as there was also no feedback) because I thought it would make the _sis_environment
code more complex to cover what I actually want. The code in the PR is much simpler than what I initially suggested here.
Is there anything you can't do due to string.Template.substitute
? You should be able to escape $
with $$
.
I think that introducing a second way to update the environment which behave differently would causes even more confusion.
I just don't want the potential source for errors because I maybe forgot to escape a $
. I also will definitely have forgotten that we even have such logic when I edit such file again in 6 months. I sometimes also programatically get some env vars from other external sources, so then I would need extra complexity to escape $
automatically, making it way too complicated.
How about adding a flag to set_env
and automatically escapes the given string if it's set?
Yea that would be another option, but I think all this is still more complex than my current solution (added 11 lines of code), and maybe still more prone to errors (you have extra logic for escaping there, can you be 100% sure that it's always correct?), and also maybe confusing (once I forget again about this substitute
logic, or other people who don't know about this, will wonder why you need to escape anything here).
But I can live with that, if you prefer it this way.
I really don't like having two structures which do basically the same thing. I can't remembering using the string.Template.substitute
feature (@curufinwe implemented it if I recall correctly), but I think changing it now would cause more harm than good.
In the end I don't mind if you use set_vars_verbatim
or value.replace('$', '$$')
. I just think the later requires less code changes. Overall this logic is pretty much like you proposed in your first post.
And yes, the doc string should mention why it can be necessary to escape the input (or set it to verbatim) to avoid that people wonder why they might want to do it.
I went with the set_vars_verbatim
because I want to avoid any potential further source for errors (the escaping logic here).
I added a new separate function set_verbatim
to set it. This is the method used by Job.set_env
, because I really think when this is used programmatically by the recipes, any user would expect that the env var is set verbatim to the string which is passed to set_env
.
I really don't like having two structures which do basically the same thing.
I mildly agree. How about
def set(self, var, value=None, verbatim=False):
and then if verbatim is set, the necessary substitutions are done automatically?
Or the other way round: we get rid of the template, but if verbatim=False
there is some other code to recover the behavior.
How about...
What about the current implementation in the PR? Or you say you prefer a single method? The thing is, I really don't want to have verbatim=False
as default, as I think this is unexpected. However, I also don't want to break anything here.
What about the current implementation in the PR? Or you say you prefer a single method?
I have a slight preference for single method, but I can also sleep well with the current implementation.
I need a possibility to set specific env vars for one specific job.
If you use
CLEANUP_ENVIRONMENT = True
, then you can do sth like:However:
_sis_environment
,CLEANUP_ENVIRONMENT = True
,os.environ[k] = string.Template(v).substitute(orig_env)
, which is maybe unexpected to the user, who would more likely just expectos.environ[k] = v
, and this can lead to weird behavior in rare cases,so this is not a good solution.
(But I still have seen exactly such code in the wild, e.g. in i6_experiments in
compile_sctk
. Also see https://github.com/rwth-i6/i6_experiments/issues/192.)I want a generic solution with public API, which also work when
CLEANUP_ENVIRONMENT = False
, and also which does not usestring.Template.substitute
(or only optionally, via an explicit flag).Any suggestions how to implement this?
One suggestion:
job._sis_environment
in any case, no matter ifCLEANUP_ENVIRONMENT
is enabled or not.EnvironmentModifier
is extended, and has an optioncleanup_environment: bool
, where we passgs.CLEANUP_ENVIRONMENT
.DEFAULT_ENVIRONMENT_SET
andDEFAULT_ENVIRONMENT_KEEP
are only set whenCLEANUP_ENVIRONMENT
. Otherwise, this would lead to different behavior than before.EnvironmentModifier.set
is further extended by the optionverbatim: bool
, which when enabled assigns toset_vars_verbatim
, where we doos.environ.update(self.set_vars_verbatim)
then (so nostring.Template.substitute
).Job
gets a new methodputenv
(or so; this name corresponds toos.putenv
), which just does:Ok? Or other suggestions?
Note, those env vars would not influence the hash. This is also the behavior I want for my current use case.