snakemake / snakemake-executor-plugin-googlebatch

Snakemake executor plugin for Google Batch (under development)
MIT License

feat: preemption #25

Closed vsoch closed 6 months ago

vsoch commented 6 months ago

This is a WIP because of bug #24, which appears to involve the installed Google storage plugin. Ping @johanneskoester: I need to get #24 fixed before I can proceed with more features here. Happy New Year!

vsoch commented 6 months ago

I used the exact logic from the previous snakemake life sciences module, so if that is the case, it wasn't updated to support this design. Can you point me to a plugin that is using preemptible correctly?

Also note that I'm blocked from working on this not because of any issue with that, but because the snakemake command is telling me I'm missing a family of storage "gs" args.

johanneskoester commented 6 months ago

There isn't any yet. But in principle it boils down to checking whether the job is supposed to be preemptible with self.workflow.remote_execution_settings.preemptible_rules.is_preemptible(job.rule.name) and then passing this info to google batch. Further, the setting self.workflow.remote_execution_settings.preemptible_retries defines the number of retries.
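A minimal sketch of the check described above, assuming stand-in objects rather than the real Snakemake classes. Only `preemptible_rules.is_preemptible(...)`, `preemptible_retries`, and `job.rule.name` come from the comment; the `PreemptibleRules` stand-in, the `batch_options_for` helper, and the option names in the returned dict are hypothetical, and how they map onto actual Google Batch fields is left open.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace


@dataclass
class PreemptibleRules:
    """Hypothetical stand-in for the object behind `preemptible_rules`."""

    rules: set = field(default_factory=set)
    all: bool = False

    def is_preemptible(self, rulename: str) -> bool:
        # A rule is preemptible if every rule is, or if it was listed by name.
        return self.all or rulename in self.rules


def batch_options_for(job, settings) -> dict:
    """Hypothetical helper: derive per-job options to hand to the batch backend."""
    preemptible = settings.preemptible_rules.is_preemptible(job.rule.name)
    # Retries only apply to preemptible jobs; treat an unset value as 0.
    retries = (settings.preemptible_retries or 0) if preemptible else 0
    return {
        # Assumed option names, not confirmed Google Batch fields.
        "provisioning_model": "SPOT" if preemptible else "STANDARD",
        "max_retry_count": retries,
    }


# Stand-in for self.workflow.remote_execution_settings
settings = SimpleNamespace(
    preemptible_rules=PreemptibleRules(rules={"hello"}),
    preemptible_retries=3,
)
hello_job = SimpleNamespace(rule=SimpleNamespace(name="hello"))
other_job = SimpleNamespace(rule=SimpleNamespace(name="other"))

print(batch_options_for(hello_job, settings))
print(batch_options_for(other_job, settings))
```

In an executor, `settings` would be `self.workflow.remote_execution_settings`, and the resulting options would be applied when the job is submitted.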

vsoch commented 6 months ago

@johanneskoester when I remove all my custom logic (and allow snakemake to install plugins it needs) I reproduce the error - it is looking for both gcs and gs. This is installing snakemake from the main branch. I can add the install of the gcs plugin, but then I'll be where I was before, looking for the gs plugin.

vsoch commented 6 months ago

ah! Just found it in an environment locally - will test removing here and seeing if that somehow (magically) removes it from the remote. That doesn't make sense, but if snakemake is getting the plugins from my local call, it actually would! Will report back.

vsoch commented 6 months ago

Looks like there is a bug with preemptible_rules in cli.py:

 snakemake --jobs 1 --executor googlebatch --googlebatch-region us-central1 --googlebatch-project llnl-flux --default-storage-provider s3 --default-storage-prefix s3://snakemake-testing-llnl --preemptible-rules hello
Traceback (most recent call last):
  File "/home/vanessa/Desktop/Code/snek/snakemake-executor-plugin-googlebatch/env/lib/python3.11/site-packages/snakemake/cli.py", line 1906, in args_to_api
    if not preemptible_rules:
           ^^^^^^^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'preemptible_rules' where it is not associated with a value

Fix here: https://github.com/snakemake/snakemake/pull/2616
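For readers unfamiliar with this error class, here is a generic illustration of how an UnboundLocalError like the one above can arise. This is not the actual code from cli.py or the linked fix; the function names and argument handling are hypothetical and only show the pattern of a local variable assigned on one branch and read on all of them.

```python
def parse_buggy(rule_names):
    """Hypothetical: assigns the local only when no rules were given."""
    if rule_names is None:
        preemptible_rules = set()
    # When rule_names is provided, the name below was never bound.
    if not preemptible_rules:
        return []
    return sorted(preemptible_rules)


def parse_fixed(rule_names):
    """Hypothetical: binds the local on every path before reading it."""
    preemptible_rules = set(rule_names) if rule_names is not None else set()
    if not preemptible_rules:
        return []
    return sorted(preemptible_rules)


try:
    parse_buggy(["hello"])
except UnboundLocalError as err:
    print(f"buggy version failed: {err}")

print(parse_fixed(["hello"]))
```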

vsoch commented 6 months ago

Also, with the fix for gs, the jobs are (finally) green! I won't show you how many red / failed ones there are, let alone that it takes 7 minutes per single-step run for a hello world... :grimacing: