pepkit / looper

A job submitter for Portable Encapsulated Projects
http://looper.databio.org
BSD 2-Clause "Simplified" License

CLI args for compute settings and unknown settings #245

Closed: nsheff closed this issue 4 years ago

nsheff commented 4 years ago

In the past I would use --compute bulker_slurm to set the compute package.

Now there is a --compute setting that takes key=value pairs directly, so I can no longer rely on Python's partial argument completion (argparse prefix matching) and have to pass --compute-package bulker_slurm. I like being able to set compute arguments on the command line, but I miss the short 'compute' argument I could use for selecting the package.
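The change in behavior described above can be reproduced with plain argparse. This is a minimal sketch, not looper's actual parser: the two parsers and their program names are hypothetical, and only the option names --compute and --compute-package come from the thread. It shows that a unique prefix resolves to the longer option until an exact match exists.

```python
import argparse

# Hypothetical "before" parser: only --compute-package is defined, so the
# unique prefix --compute is expanded to --compute-package by argparse.
before = argparse.ArgumentParser(prog="before")
before.add_argument("--compute-package")
old = before.parse_args(["--compute", "bulker_slurm"])
print(old.compute_package)

# Hypothetical "after" parser: --compute now exists as its own option, and an
# exact match always takes precedence over prefix matching.
after = argparse.ArgumentParser(prog="after")
after.add_argument("--compute", nargs="+", metavar="KEY=VALUE")
after.add_argument("--compute-package")
new = after.parse_args(["--compute", "bulker_slurm"])
print(new.compute, new.compute_package)
```

In the second parser the package name ends up in the key=value list and the package itself stays unset, which is why the full --compute-package spelling becomes necessary.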

I had a few thoughts:

This is starting to sound like the looper CLI should just provide a passthrough to the divvy CLI, somehow.

One problem is that looper currently passes any unrecognized arguments to the pipeline directly. So we don't want short args like -s, because if your pipeline also takes -s, you can't use that looper passthrough feature. Here's a solution:

Instead of passing all uncaught args to the pipeline, we introduce a new "pipeline-args" argument. You'd use it like:

looper run x.yaml --dry-run --lump 15 --pipeline-args "-R -s" -s settings.yaml

Then it would be no problem to introduce single-character args to looper run. Here, the unquoted -s is a looper-divvy settings.yaml passthrough, while the -s inside the quoted string still goes to the pipeline. If we did this, then I'd say for the compute settings, the looper CLI should offer the same interface as divvy:

plus,

Then the looper CLI would also offer:

Then I could just use: -p bulker_slurm instead of --compute-package ..., without sacrificing the ability to pass a -p to a pipeline directly.
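The "pipeline-args" proposal above can be sketched with argparse as follows. This is only an illustration under stated assumptions: `build_parser`, `pipeline_command`, the `--settings` long name, the `pipeline.py` base command, and the float type of --lump are hypothetical and not taken from looper's code. The point is that looper's own -s and the pipeline's -s no longer collide, because the pipeline flags travel inside one quoted string.

```python
import argparse
import shlex


def build_parser():
    """Hypothetical stand-in for the `looper run` parser; not looper's code."""
    parser = argparse.ArgumentParser(prog="looper-run-sketch")
    parser.add_argument("config")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--lump", type=float)  # type is an assumption
    # Short flags are safe here: pipeline flags arrive as one quoted string.
    parser.add_argument("-p", "--package", help="compute package to activate")
    parser.add_argument("-s", "--settings", help="compute settings YAML file")
    parser.add_argument("--pipeline-args", default="",
                        help="string appended to every pipeline command")
    return parser


def pipeline_command(base_cmd, args):
    """Append the quoted pipeline arguments to a base command (token list)."""
    return base_cmd + shlex.split(args.pipeline_args)


# Same invocation as in the comment above, already split by the shell.
args = build_parser().parse_args(
    ["x.yaml", "--dry-run", "--lump", "15",
     "--pipeline-args", "-R -s", "-s", "settings.yaml"]
)
print(args.settings)
print(pipeline_command(["pipeline.py"], args))
```

Here the bare -s is consumed by the sketch parser as the settings file, while the -s inside the quoted string reaches only the pipeline command.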

stolarczyk commented 4 years ago

This all sounds reasonable.

Do you think we should include all these changes in the upcoming release, or would changing the current --compute to --divvy (keeping the partial arg completion functional) be enough?

nsheff commented 4 years ago

Or, we could change compute-package to package to align with divvy.

stolarczyk commented 4 years ago

That's even better; it better describes what it is actually going to set.

stolarczyk commented 4 years ago

Should be ready to go, see:

[mstolarczyk@MichalsMBP atac_ebna2](master): looper run -d atac_ebna2.yaml -a "--testing pipeline,args" --dontknow this,arg --limit 1 -p local
Looper version: 0.12.6-dev
Command: run
String appended to every pipeline command: --testing pipeline,args <=====
Unrecognized arguments: --dontknow this,arg <=====
Activating compute package 'local'
## [1 of 18] sample: GSM4467115; pipeline: PEPATAC
2 input files missing, job input size was not calculated accurately
> Not submitted: Missing files: /Users/mstolarczyk/SRR11519128_1.fastq.gz
Writing submission scripts for 1 skipped samples
Writing script to /Users/mstolarczyk/atac_ebna2/submission/PEPATAC_GSM4467115.sub

Looper finished
Samples valid for job generation: 1 of 18
Commands submitted: 0 of 18
Jobs submitted: 0
Dry run. No jobs were actually submitted.

1 unique reasons for submission failure: Missing files

Summary of failures:
Missing files: GSM4467115
[mstolarczyk@MichalsMBP atac_ebna2](master): c /Users/mstolarczyk/atac_ebna2/submission/PEPATAC_GSM4467115.sub
#!/bin/bash

echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`

{
/Users/mstolarczyk/Uczelnia/UVA/code//pepatac/pipelines/pepatac.py --sample-name GSM4467115 --genome hg38 --input /Users/mstolarczyk/SRR11519128_1.fastq.gz --single-or-paired PAIRED -O /Users/mstolarczyk/atac_ebna2/results_pipeline -P 4 -M 12000  --input2 /Users/mstolarczyk/SRR11519128_2.fastq.gz      --prealignments rCRSd               --testing pipeline,args
} | tee /Users/mstolarczyk/atac_ebna2/submission/PEPATAC_GSM4467115.log --ignore-interrupts

stolarczyk commented 4 years ago

Raised by @jpsmith5:

... --pipeline-args '--argument' does not work

This is a specific case that came up neither in my testing nor in our smoke test. The --argument string is consumed by the argument parser as an option of its own, whether it is wrapped in single quotes, double quotes, or not quoted at all, because the shell strips the quotes before argparse sees the token. This is a known quirk/feature of argparse. More detailed explanation: https://stackoverflow.com/a/21894384/11793076
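The quirk can be reproduced outside looper with a few lines of argparse. This is a minimal sketch: `RaisingParser` and `try_parse` are hypothetical helpers, and only the --pipeline-args option name comes from the thread. The `=` form in the last call is general argparse behavior shown for contrast, not necessarily one of the options considered here.

```python
import argparse


class RaisingParser(argparse.ArgumentParser):
    """ArgumentParser that raises instead of exiting, so failures can be shown."""

    def error(self, message):
        raise ValueError(message)


def try_parse(argv):
    """Return the parsed --pipeline-args value, or the parser's error message."""
    parser = RaisingParser(prog="demo")
    parser.add_argument("--pipeline-args")
    try:
        return parser.parse_args(argv).pipeline_args
    except ValueError as err:
        return "error: " + str(err)


# The shell has already removed any quotes, so argparse sees a bare --argument
# token, classifies it as an option, and --pipeline-args is left without a value.
print(try_parse(["--pipeline-args", "--argument"]))

# A token that contains a space is treated as an ordinary value.
print(try_parse(["--pipeline-args", "--testing pipeline,args"]))

# Attaching the value with '=' also binds it to --pipeline-args.
print(try_parse(["--pipeline-args=--argument"]))
```

The second call matches the invocation in the earlier test run, which is why that case worked while the single bare flag did not.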

Possible solutions:

Unfortunately, creating a custom argument type that appends a space behind the scenes is not possible, because arguments starting with -- are never associated with --pipeline-args in the first place, so the type conversion is never called on them.
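The reason a custom type cannot help can be checked directly. In this sketch, `padded`, `calls`, `RaisingParser`, and `parse_or_error` are hypothetical names introduced for illustration; the type callable records every value it receives, which shows that argparse rejects the command line before the callable is ever invoked.

```python
import argparse

calls = []  # every value handed to the custom type is recorded here


def padded(value):
    """Hypothetical custom type that appends a space to the value."""
    calls.append(value)
    return value + " "


class RaisingParser(argparse.ArgumentParser):
    """ArgumentParser that raises instead of exiting."""

    def error(self, message):
        raise ValueError(message)


def parse_or_error(argv):
    """Parse argv; return the value of --pipeline-args or the error message."""
    parser = RaisingParser(prog="demo")
    parser.add_argument("--pipeline-args", type=padded, default=None)
    try:
        return parser.parse_args(argv).pipeline_args
    except ValueError as err:
        return "error: " + str(err)


# argparse decides that --argument is an option before any type conversion,
# so padded() is never called and `calls` stays empty.
failed = parse_or_error(["--pipeline-args", "--argument"])
calls_after_failure = list(calls)

# With a value that argparse accepts, the type is applied as expected.
ok = parse_or_error(["--pipeline-args", "--testing pipeline,args"])
print(failed, calls_after_failure, repr(ok))
```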