rwth-i6 / sisyphus

A Workflow Manager in Python
Mozilla Public License 2.0

store worker wrapper in Job #184

Closed — michelwi closed this 6 months ago

michelwi commented 8 months ago

In one of our setups we wanted to use different worker wrapper functions for different parts of the setup.

Therefore I propose to store the currently defined gs.worker_wrapper in the Job at the time the Job is created. Different parts of the setup can then change the worker_wrapper as they go.
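A minimal sketch of the proposal (all names besides gs.worker_wrapper are illustrative; the real sisyphus internals differ):

```python
class GlobalSettings:
    worker_wrapper = None

gs = GlobalSettings()  # stand-in for sisyphus.global_settings

class Job:
    def __init__(self):
        # snapshot the wrapper that is active at creation time
        self._sis_worker_wrapper = gs.worker_wrapper

    def get_worker_call(self, call):
        # use the stored wrapper instead of consulting gs again at run time
        if self._sis_worker_wrapper is not None:
            return self._sis_worker_wrapper(call)
        return call

# different parts of the setup can now change gs.worker_wrapper as they go
gs.worker_wrapper = lambda call: ["singularity", "exec", "tf2.sif"] + call
training_tf2 = Job()

gs.worker_wrapper = lambda call: ["singularity", "exec", "tf1.sif"] + call
training_tf1 = Job()
```

Each Job keeps whichever wrapper was active when it was created, independent of later changes to the global setting.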

christophmluscher commented 8 months ago

It just crossed my mind that the worker_wrapper does not go into the hash. So that could potentially mean that if you want to run the same setup with two different images, this would break, right?

critias commented 8 months ago

You could just put the image in the constructor, store it somewhere, e.g. job._image, and make _sis_worker_wrapper a job method which reads the image back from job._image.

Thinking about this, maybe it would be better to create a class method, e.g. worker_wrapper, which by default calls gs.worker_wrapper but could simply be overwritten in the job definition. This would make a cleaner interface, but it works of course only if self._job is always set when get_worker_call is called. I think this should be the case, but I'm very unsure about that...
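The class-method idea could look like this sketch (GlobalSettings/gs here are stand-ins, and the subclass name is made up for illustration):

```python
class GlobalSettings:
    pass

gs = GlobalSettings()
gs.worker_wrapper = lambda call: ["default-wrapper"] + call

class Job:
    def worker_wrapper(self, call):
        # default behavior: delegate to the global setting
        return gs.worker_wrapper(call)

class MyTrainingJob(Job):
    # a job definition can simply override the method
    def worker_wrapper(self, call):
        return ["singularity", "exec", "tf1.sif"] + call
```

The override lives in the job class, so every instance of that class gets the same wrapper regardless of what gs.worker_wrapper says at creation time.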

michelwi commented 8 months ago

It just crossed my mind that the worker_wrapper does not go in the hash creation.

Correct, nothing that is defined in the global_settings does.

So that could potentially mean that if you want to run the same setup with two different images, this would break, right?

I would say "not work" rather than "break". You cannot do this in a single sisyphus experiment folder: the first time you create a Job, the worker_wrapper is stored, and any further time the Job is "created", the old instance with the old worker wrapper is returned. The same is true for e.g. the environment variables of the Job.
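A toy sketch of why re-"creating" a Job cannot change its wrapper: sisyphus deduplicates Jobs by their arguments, so a second construction with the same arguments returns the first instance. The cache and names below are illustrative, not the real sisyphus internals:

```python
_job_cache = {}

class GlobalSettings:
    worker_wrapper = None

gs = GlobalSettings()

class Job:
    def __new__(cls, *args):
        key = (cls, args)
        if key not in _job_cache:
            inst = super().__new__(cls)
            # the wrapper is stored exactly once, at first creation
            inst._sis_worker_wrapper = gs.worker_wrapper
            _job_cache[key] = inst
        return _job_cache[key]

gs.worker_wrapper = "wrapper_a"  # strings stand in for wrapper functions
job1 = Job("train")

gs.worker_wrapper = "wrapper_b"
job2 = Job("train")  # same arguments: the cached instance is returned
```

Here job2 is the same object as job1 and still carries "wrapper_a", which is why two runs of the same setup with different wrappers do not work in one experiment folder.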

Running the same setup multiple times with different worker_wrappers was also never the point of my implementation. I wanted a subset of Jobs in my graph to use different worker wrappers.

You could just put the image in the constructor create a class method [that] could simply be overwritten in the job definition.

That would necessitate touching all existing Job implementations to add such a constructor argument or to overwrite the default method. Also, "the job overriding the definition" does not directly satisfy my needs, as I would sometimes like two different instances of the same Job to have different wrappers (imagine an NN training that uses worker_wrapper to execute the Job in a singularity image with TF2, but for one model I need a different image with TF1).

Now let me embark on a little rant here: in principle, I dislike that global_settings stop being settings that are applied globally. But it is already used this way when different configs set different gs.ALIAS_AND_OUTPUT_SUBDIR, or in the nifty environment-modifier context manager that we sometimes employ:

    # needs: from contextlib import contextmanager; import copy
    @contextmanager
    def environment(self):
        prev_environment_set = gs.DEFAULT_ENVIRONMENT_SET
        try:
            # work on a copy so the original set is untouched
            gs.DEFAULT_ENVIRONMENT_SET = copy.copy(gs.DEFAULT_ENVIRONMENT_SET)
            gs.DEFAULT_ENVIRONMENT_SET.update(self.extra_env_set)
            yield
        finally:
            # restored even if the subgraph raises
            gs.DEFAULT_ENVIRONMENT_SET = prev_environment_set

And I cannot deny the huge convenience boost of writing

    with my_environment.environment():
        some_huge_subgraph()

instead of finding and iterating over all Jobs of the subgraph and manually modifying their _sis_environment.

Therefore the idea would be to enable a similar behavior also for the worker_wrapper.

This would make a cleaner interface

I would love to have a cleaner interface. The nicest interface I can imagine would involve a function all_following_jobs_will_be_created_with(extra_environment_set, extra_environment_keep, worker_wrapper, **etc), but my knowledge of the black magic that creates sisyphus Jobs is too limited... maybe it could be attached to the metaclass instance that creates the Jobs? In the meantime I figured that this would be the simplest implementation, and it would be consistent with how the other gs items that affect Jobs behave: whatever is set at (first) creation counts.
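One way such a function could be sketched, without any metaclass magic, is as a context manager that temporarily sets several gs attributes at once, in the same spirit as the environment context manager above. Everything here besides the gs attribute names from this thread is an assumption, and extra_environment_keep and the **etc parameters are omitted for brevity:

```python
import copy
from contextlib import contextmanager

class GlobalSettings:
    DEFAULT_ENVIRONMENT_SET = {}
    worker_wrapper = None

gs = GlobalSettings()  # stand-in for sisyphus.global_settings

@contextmanager
def all_following_jobs_will_be_created_with(extra_environment_set=None,
                                            worker_wrapper=None):
    # remember the current settings so they can be restored afterwards
    prev_env = gs.DEFAULT_ENVIRONMENT_SET
    prev_wrapper = gs.worker_wrapper
    try:
        if extra_environment_set is not None:
            gs.DEFAULT_ENVIRONMENT_SET = copy.copy(gs.DEFAULT_ENVIRONMENT_SET)
            gs.DEFAULT_ENVIRONMENT_SET.update(extra_environment_set)
        if worker_wrapper is not None:
            gs.worker_wrapper = worker_wrapper
        yield
    finally:
        # restored even if the subgraph raises
        gs.DEFAULT_ENVIRONMENT_SET = prev_env
        gs.worker_wrapper = prev_wrapper
```

Jobs created inside the `with` block would snapshot these values at (first) creation, consistent with how the other gs items behave; as critias notes below, such gs-mutating hacks can break under the async workflow, so this is a convenience sketch, not a recommendation.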

critias commented 8 months ago

I misunderstood how you planned to use it. I thought you planned to overwrite job._sis_worker_wrapper afterwards when needed, not to change gs.worker_wrapper before creating the job.

I totally agree with your rant and sometimes really regret ever creating global_settings in the first place. Maybe a local settings object that could have been passed to every job would have been a better idea. Hacks like with my_environment.environment(): can easily break as soon as you use the async workflow. In short: I understand the convenience of this approach, but I cannot recommend using it.

Anyway, this ship has probably sailed a while ago, and gs is already frequently used differently than planned. Therefore I will not fight against it.

michelwi commented 6 months ago

I totally agree to your rant [...]

Anyway, this ship has probably sailed [...] Therefore I will not fight against it.

So, do I get an approval now? :D