johanneskoester closed this issue 10 months ago
Fixed in the main branch.
That makes sense. So would an example running command look like this?
$ snakemake --jobs 1 --executor googlebatch --googlebatch-region us-central1 --googlebatch-project llnl-flux --no-shared-fs
Error: If no shared filesystem is assumed, a default storage provider has to be set.
I know you've mentioned this before: isn't there supposed to be some default with s3/minio?
$ snakemake --jobs 1 --executor googlebatch --googlebatch-region us-central1 --googlebatch-project llnl-flux --no-shared-fs --default-storage-provider s3
WorkflowError:
StorageQueryValidationResult: query hello/world.txt is invalid: must start with s3 (s3://...)
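If I'm reading the storage plugin setup correctly, the problem is that with a default storage provider set, plain paths like `hello/world.txt` are interpreted as storage queries, so they also need a `--default-storage-prefix` that maps them into a valid `s3://` location. A sketch of what I think the invocation should look like (the bucket name here is a placeholder, not something from this repo):

```
# snakemake-bucket is a hypothetical bucket; the prefix is prepended to
# all plain paths so they become valid s3:// storage queries.
$ snakemake --jobs 1 --executor googlebatch \
    --googlebatch-region us-central1 --googlebatch-project llnl-flux \
    --no-shared-fs \
    --default-storage-provider s3 \
    --default-storage-prefix 's3://snakemake-bucket/'
```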
File "<string>", line 6, in __init__
File "/home/vanessa/Desktop/Code/snek/snakemake-executor-plugin-googlebatch/example/hello-world/Snakefile", line 3, in <module>
So I tried:
# By convention, the first pseudorule should be called "all"
# We're using the expand() function to create multiple targets
rule all:
    input:
        expand(
            "s3://{greeting}/world.txt",
            greeting = ['hello', 'hola'],
        ),

# First real rule, this is using a wildcard called "greeting"
rule multilingual_hello_world:
    output:
        "s3://{greeting}/world.txt",
    shell:
        """
        mkdir -p "{wildcards.greeting}"
        sleep 5
        echo "{wildcards.greeting}, World!" > {output}
        """
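For comparison, my understanding is that the Snakefile itself should keep plain paths and let Snakemake rewrite them via the default storage provider and prefix, rather than hardcoding `s3://` into the rules. A sketch, assuming `--default-storage-provider s3` and a `--default-storage-prefix` are passed on the command line:

```
# Sketch: plain paths only; Snakemake maps them onto the configured
# default storage prefix, so no s3:// literals appear in the workflow.
rule all:
    input:
        expand(
            "{greeting}/world.txt",
            greeting = ['hello', 'hola'],
        ),

rule multilingual_hello_world:
    output:
        "{greeting}/world.txt",
    shell:
        """
        sleep 5
        echo "{wildcards.greeting}, World!" > {output}
        """
```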
$ snakemake --jobs 1 --executor googlebatch --googlebatch-region us-central1 --googlebatch-project llnl-flux --no-shared-fs --default-storage-provider s3
Building DAG of jobs...
Uploading source archive to storage provider...
WorkflowError:
Failed to store output in storage snakemake-workflow-sources.3d24779cdba7cb1d00d0d15beeaffeb44fea8228e1872249b4707b6541148321.tar.xz
AttributeError: 'StorageObject' object has no attribute 'bucket'
File "/home/vanessa/anaconda3/lib/python3.11/asyncio/runners.py", line 190, in run
File "/home/vanessa/anaconda3/lib/python3.11/asyncio/runners.py", line 118, in run
File "/home/vanessa/anaconda3/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
I've updated my local version to the branch here (and the Python tests pass), but I'm having trouble getting a basic Snakefile-derived "hello world" working, given that I no longer have direct control over storage.
@vsoch just so you know: Snakemake now automatically deploys the workflow sources before a job executes if the executor implies that there is no shared FS (https://github.com/snakemake/snakemake-interface-executor-plugins/blob/fc37f38f5723c522e7b3e8854d03645e16f53b91/snakemake_interface_executor_plugins/settings.py#L48) or the user sets --no-shared-fs.
I hope this means that you no longer need a helper script, nor any executor-specific code for source collection and deployment. This is already battle-tested in the kubernetes executor plugin.
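The decision described above (shared FS is assumed unless the executor implies otherwise or the user opts out) can be sketched in plain Python. Note this is a minimal illustration of the logic, not the actual interface code; `CommonSettingsSketch` and `assume_shared_fs` are hypothetical names standing in for the real `CommonSettings` in `snakemake_interface_executor_plugins.settings`:

```python
from dataclasses import dataclass


@dataclass
class CommonSettingsSketch:
    # Loosely mirrors CommonSettings from the executor plugin interface:
    # an executor can declare that it runs jobs remotely and that no
    # shared filesystem is available.
    non_local_exec: bool = False
    implies_no_shared_fs: bool = False


def assume_shared_fs(executor: CommonSettingsSketch, no_shared_fs_flag: bool) -> bool:
    # A shared FS is assumed only if neither the executor nor the
    # user's --no-shared-fs flag says otherwise; when this returns
    # False, workflow sources must be deployed via storage instead.
    return not (executor.implies_no_shared_fs or no_shared_fs_flag)


# A googlebatch-like executor implies no shared FS, so source
# deployment kicks in even without --no-shared-fs.
googlebatch = CommonSettingsSketch(non_local_exec=True, implies_no_shared_fs=True)
print(assume_shared_fs(googlebatch, no_shared_fs_flag=False))  # prints False
```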