snakemake / snakemake-executor-plugin-kubernetes

A snakemake executor plugin for submission of jobs to Kubernetes
MIT License

unrecognized arguments: --storage-http-allow-redirects when using HTTP storage provider #19

Open jjjermiah opened 5 months ago

jjjermiah commented 5 months ago

It seems that when I use the http storage plugin with the kubernetes executor, it submits the job but with an added argument --storage-http-allow-redirects, which the remote node does not recognize.

Snakefile:

storage HTTP:
    provider="http"

rule all:
    input:
        "results/summary.txt"

rule get_file:
    input:
        storage.HTTP("https://example-files.online-convert.com/document/txt/example.txt")
    output:
        "results/summary.txt"
    shell:
        "cat {input} > {output}"

profile:

storage-gcs-project: orcestra-388613
default-storage-provider: gcs 
default-storage-prefix: gs://orcestradata/snakemake8_test

executor: kubernetes
jobs: 5

log:

pixi run snake --verbose --workflow-profile default -F
✨ Pixi task (snake in default): snakemake -c1 --verbose --workflow-profile default -F
Using workflow specific profile default for setting default command line arguments.
Path results/summary.txt, converted to .snakemake/storage/gcs/orcestradata/snakemake8_test/results/summary.txt
Path results/summary.txt, converted to .snakemake/storage/gcs/orcestradata/snakemake8_test/results/summary.txt
Building DAG of jobs...
shared_storage_local_copies: False
remote_exec: False
Uploading source archive to storage provider...
Checking status of 0 jobs
Using snakemake/snakemake:v8.10.8 for Kubernetes jobs.
Using shell: /bin/bash
Provided remote nodes: 5
Job stats:
job         count
--------  -------
all             1
get_file        1
total           2

Resources before job selection: {'_cores': 9223372036854775807, '_nodes': 5}
Ready jobs (1)
Select jobs to execute...
Using greedy selector because only single job has to be scheduled.
Selected jobs (1)
Resources after job selection: {'_cores': 9223372036854775806, '_nodes': 4}
Execute 1 jobs...

[Thu Apr 25 13:22:45 2024]
rule get_file:
    output: gs://orcestradata/snakemake8_test/results/summary.txt (send to storage)
    jobid: 1
    reason: Forced execution
    resources: tmpdir=<TBD>

General args: ['--force', '--target-files-omit-workdir-adjustment', '--keep-storage-local-copies', '--max-inventory-time 0', '--nocolor', '--notemp', '--no-hooks', '--nolock', '--ignore-incomplete', '', '--verbose ', '--rerun-triggers params software-env mtime code input', '', '', '', '--conda-frontend mamba', '', '', '', '', '', '--shared-fs-usage none', '', '--wrapper-prefix https://github.com/snakemake/snakemake-wrappers/raw/', '', '', '', '', '', '--latency-wait 5', '--scheduler ilp', '--local-storage-prefix .snakemake/storage', '', '', '', '', '', '--storage-gcs-retries 5', '--storage-http-allow-redirects True', '', '', '--default-storage-prefix gs://orcestradata/snakemake8_test --default-storage-provider gcs', '--default-resources base64//dG1wZGlyPXN5c3RlbV90bXBkaXI=']
Executing job: pip install --target '.snakemake/pip-deployments' snakemake-storage-plugin-gcs && python -m snakemake --deploy-sources gs://orcestradata/snakemake8_test/snakemake-workflow-sources.aa2971cef4d04f37d9a1a2a2ba973d589bf0fb2e4c198081e94539acfe32f4fc.tar.xz aa2971cef4d04f37d9a1a2a2ba973d589bf0fb2e4c198081e94539acfe32f4fc --default-storage-prefix gs://orcestradata/snakemake8_test --default-storage-provider gcs   --storage-gcs-retries 5 --storage-http-allow-redirects True && python -m snakemake --snakefile Snakefile --target-jobs 'get_file:' --allowed-rules 'get_file' --cores 1 --attempt 1 --force-use-threads   --force --target-files-omit-workdir-adjustment --keep-storage-local-copies --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --verbose  --rerun-triggers params software-env mtime code input --conda-frontend mamba --shared-fs-usage none --wrapper-prefix https://github.com/snakemake/snakemake-wrappers/raw/ --latency-wait 5 --scheduler ilp --local-storage-prefix .snakemake/storage --storage-gcs-retries 5 --storage-http-allow-redirects True --default-storage-prefix gs://orcestradata/snakemake8_test --default-storage-provider gcs --default-resources base64//dG1wZGlyPXN5c3RlbV90bXBkaXI= --mode remote
job resources:  {'_cores': 1, '_nodes': 1, 'tmpdir': '<TBD>'}
k8s pod resources: {'cpu': '950m'}
Get status with:
kubectl describe pod snakejob-baa55f88-7989-50a6-9ecf-3c49ed01bedc
kubectl logs snakejob-baa55f88-7989-50a6-9ecf-3c49ed01bedc
Checking status of 1 jobs
Checking status of 1 jobs
[Thu Apr 25 13:23:05 2024]
Error in rule get_file:
    message: For details, please issue:
kubectl describe pod snakejob-baa55f88-7989-50a6-9ecf-3c49ed01bedc
kubectl logs snakejob-baa55f88-7989-50a6-9ecf-3c49ed01bedc
For further error details see the cluster/cloud log and the log files of the involved rule(s).
    jobid: 1
    output: gs://orcestradata/snakemake8_test/results/summary.txt (send to storage)
    log: /var/folders/8t/rwh6rzg93jxfqkb63gt2n4940000gn/T/snakemakex_xytdlb/persistence/auxiliary/kubernetes-logs/snakejob-baa55f88-7989-50a6-9ecf-3c49ed01bedc.log (check log file(s) for error details)
    shell:
        wget https://example-files.online-convert.com/document/txt/example.txt -O results/summary.txt
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    external_jobid: snakejob-baa55f88-7989-50a6-9ecf-3c49ed01bedc

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-04-25T132244.819888.snakemake.log

The bold text in the logged command shows where it's inserted:

Executing job: pip install --target '.snakemake/pip-deployments' snakemake-storage-plugin-gcs && python -m snakemake --deploy-sources gs://orcestradata/snakemake8_test/snakemake-workflow-sources.aa2971cef4d04f37d9a1a2a2ba973d589bf0fb2e4c198081e94539acfe32f4fc.tar.xz aa2971cef4d04f37d9a1a2a2ba973d589bf0fb2e4c198081e94539acfe32f4fc --default-storage-prefix gs://orcestradata/snakemake8_test --default-storage-provider gcs --storage-gcs-retries 5 --storage-http-allow-redirects True && python -m snakemake --snakefile Snakefile --target-jobs 'get_file:' --allowed-rules 'get_file' --cores 1 --attempt 1 --force-use-threads --force --target-files-omit-workdir-adjustment --keep-storage-local-copies --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --verbose --rerun-triggers params software-env mtime code input --conda-frontend mamba --shared-fs-usage none --wrapper-prefix https://github.com/snakemake/snakemake-wrappers/raw/ --latency-wait 5 --scheduler ilp --local-storage-prefix .snakemake/storage --storage-gcs-retries 5 **--storage-http-allow-redirects** True --default-storage-prefix gs://orcestradata/snakemake8_test --default-storage-provider gcs --default-resources base64//dG1wZGlyPXN5c3RlbV90bXBkaXI= --mode remote

jjjermiah commented 5 months ago

Looks like it was also triggered here: https://github.com/snakemake/snakemake/issues/2501

jjjermiah commented 5 months ago

I believe this might be because the snakemake-storage-plugin-http plugin is not installed on the node. That makes sense, since the only default argument for the http plugin is allow-redirects = True, and so it's passed on automatically.
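To illustrate in isolation why that combination fails on the node (this is plain argparse, not Snakemake's actual CLI code — just a simulation of the situation): only the installed plugin (gcs) gets to register its arguments, so the http plugin's flag is unknown to the parser.

```python
import argparse

# Simulate the remote node's CLI: only the installed storage plugin
# (gcs) has registered its arguments. snakemake-storage-plugin-http
# is absent, so its flag was never added to the parser.
parser = argparse.ArgumentParser(prog="snakemake")
parser.add_argument("--storage-gcs-retries", type=int)

argv = [
    "--storage-gcs-retries", "5",
    # injected automatically because allow-redirects defaults to True:
    "--storage-http-allow-redirects", "True",
]

# parse_known_args collects the flags the parser cannot account for;
# Snakemake's real CLI instead errors out with "unrecognized arguments".
args, unknown = parser.parse_known_args(argv)
print(unknown)  # → ['--storage-http-allow-redirects', 'True']
```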

I think I may have narrowed it down to this piece of code:

https://github.com/snakemake/snakemake/blob/a988cef3d84234ace0b9f40272a887a7b3b3ca2c/snakemake/spawn_jobs.py#L192-L224

  package_name = StoragePluginRegistry().get_plugin_package_name(
      self.workflow.storage_settings.default_storage_provider
  )
  precommand.append(
      f"pip install --target '{common.PIP_DEPLOYMENTS_PATH}' {package_name}"
  )

It looks like this only tells the node to install the default storage provider.

I think the fix would be to add a command that installs all of the registered storage providers, not just the default one.
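A rough sketch of what I mean (the helper name is hypothetical, and the package names would have to come from the registry, e.g. via get_plugin_package_name for each registered provider — the exact registry API may differ):

```python
from typing import Iterable

# Mirrors common.PIP_DEPLOYMENTS_PATH from the snippet above (assumption).
PIP_DEPLOYMENTS_PATH = ".snakemake/pip-deployments"


def build_storage_install_precommand(package_names: Iterable[str]) -> str:
    """Build one pip install precommand covering *all* registered storage
    plugins, instead of only the default storage provider's package.

    Hypothetical helper: in spawn_jobs.py this would replace the single
    get_plugin_package_name(...) lookup for the default provider.
    """
    # Deduplicate and sort so the generated command is stable.
    packages = " ".join(sorted(set(package_names)))
    return f"pip install --target '{PIP_DEPLOYMENTS_PATH}' {packages}"


cmd = build_storage_install_precommand(
    ["snakemake-storage-plugin-gcs", "snakemake-storage-plugin-http"]
)
print(cmd)
```

With that, the node would have snakemake-storage-plugin-http available and could parse --storage-http-allow-redirects.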

Side note: I'm not actually sure whether f"{storage_provider_args}" is declared or instantiated anywhere, but I don't think that's the issue here.