Closed: nsheff closed this issue 4 years ago
In my testing of looper I'm missing how to use the new adapters on Rivanna. I need an example. divvy init needs to create some default adapters, I guess.

I put these adapters into my divvy config file:
adapters:
  code: looper.command
  jobname: looper.jobname
  cores: compute.cores
  logfile: compute.logfile
  time: compute.time
  mem: compute.memory
  docker_args: compute.docker_args
  docker_image: compute.docker_image
  singularity_image: compute.singularity_image
  singularity_args: compute.singularity_args
It correctly populated the {CODE} variable, but none of the others:
#!/bin/bash
#SBATCH --job-name='{JOBNAME}'
#SBATCH --output='{LOGFILE}'
#SBATCH --mem='{MEM}'
#SBATCH --cpus-per-task='{CORES}'
#SBATCH --time='{TIME}'
#SBATCH --partition='standard'
#SBATCH -m block
#SBATCH --ntasks=1
#SBATCH --open-mode=append
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
cmd="/home/ns5bc/code/sra_convert/sra_convert.py --srr /project/shefflab/data/sra/SRR8435075.sra /project/shefflab/data/sra/SRR8435076.sra /project/shefflab/data/sra/SRR8435077.sra /project/shefflab/data/sra/SRR8435078.sra -O /project/shefflab/processed/paqc/results_pipeline --verbosity 4 --logdev"
y=`echo "$cmd" | sed -e 's/^/srun /'`
eval "$y"
It's because it's looking for the exact keys used in the template, which are uppercase:
CODE: looper.command
LOGFILE: looper.log_file
JOBNAME: looper.job_name
CORES: compute.cores
TIME: compute.time
MEM: compute.mem
DOCKER_ARGS: compute.docker_args
DOCKER_IMAGE: compute.docker_image
SINGULARITY_IMAGE: compute.singularity_image
SINGULARITY_ARGS: compute.singularity_args
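To double-check that an adapter covers everything a given template needs, one option is to list the placeholders the template actually contains; the template path below is an assumption, so point it at whichever template your compute package uses:

```bash
# list the exact (case-sensitive) placeholder names used in a divvy submission template
grep -o '{[A-Z_]*}' templates/slurm_template.sub | sort -u
```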
Got it! code worked lowercase...

Great, those looper variables are working for me now. But the compute namespace is not working yet, is that expected?
I've added an adapter version here: https://github.com/pepkit/divcfg/blob/master/uva_rivanna_adapters.yaml
I will later integrate it into the main config (should be backwards compatible).
> the compute namespace is not working yet, is that expected?
It works for me in looper, hmm... maybe we're doing something differently? How are you testing it?
DIVCFG=/project/shefflab/rivanna_config/divcfg/uva_rivanna_adapters.yaml looper run paqc.yaml --amendments sra_convert -d
cat /project/shefflab/processed/paqc/submission/convert_ATAC-seq_Suspension_rep3.sub
#!/bin/bash
#SBATCH --job-name='convert_ATAC-seq_Suspension_rep3'
#SBATCH --output='/project/shefflab/processed/paqc/submission/convert_ATAC-seq_Suspension_rep3.log'
#SBATCH --mem='{MEM}'
#SBATCH --cpus-per-task='{CORES}'
#SBATCH --time='{TIME}'
#SBATCH --partition='standard'
#SBATCH -m block
#SBATCH --ntasks=1
#SBATCH --open-mode=append
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
cmd="/home/ns5bc/code/sra_convert/sra_convert.py --srr /project/shefflab/data/sra/SRR8435075.sra /project/shefflab/data/sra/SRR8435076.sra /project/shefflab/data/sra/SRR8435077.sra /project/shefflab/data/sra/SRR8435078.sra -O /project/shefflab/processed/paqc/results_pipeline --logdev"
y=`echo "$cmd" | sed -e 's/^/srun /'`
eval "$y"
Have you specified size_dependent_variables as a TSV in the compute section of the sra_convert piface?
I didn't make it backwards compatible. Only the TSV way is supported now
> Have you specified size_dependent_variables as a TSV in the compute section of the sra_convert piface?
Worked for me this way:
[mjs5kd@udc-ba36-36 paqc](master): echo $DIVCFG
/project/shefflab/rivanna_config/divcfg/uva_rivanna_adapters.yaml
[mjs5kd@udc-ba36-36 paqc](master): looper run paqc.yaml --amendments sra_convert -d --limit 1
Command: run (Looper version: 0.12.6-dev)
Using amendments: sra_convert
Finding pipelines for protocol(s): *
Known protocols: *
## [1 of 17] GSM4289908 (*)
Writing script to /project/shefflab/processed/paqc/submission/convert_GSM4289908.sub
Job script (n=1; 0.00 Gb): /project/shefflab/processed/paqc/submission/convert_GSM4289908.sub
Dry run, not submitted
Looper finished
Samples valid for job generation: 1 of 1
Successful samples: 1 of 1
Commands submitted: 1 of 1
Jobs submitted: 1
Dry run. No jobs were actually submitted.
[mjs5kd@udc-ba36-36 paqc](master): c /project/shefflab/processed/paqc/submission/convert_GSM4289908.sub
#!/bin/bash
#SBATCH --job-name='convert_GSM4289908'
#SBATCH --output='/project/shefflab/processed/paqc/submission/convert_GSM4289908.log'
#SBATCH --mem='8000'
#SBATCH --cpus-per-task='1'
#SBATCH --time='00-04:00:00'
#SBATCH --partition='standard'
#SBATCH -m block
#SBATCH --ntasks=1
#SBATCH --open-mode=append
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
cmd="sra_convert.py --srr /project/shefflab/data/sra/SRR10988638.sra "
y=`echo "$cmd" | sed -e 's/^/srun /'`
eval "$y"
[mjs5kd@udc-ba36-36 paqc](master): c ${CODE}/sra_convert/pipeline_interface_convert.yaml
protocol_mapping:
  "*": convert
pipelines:
  convert:
    name: convert
    path: sra_convert.py
    # required_input_files: SRR_files
    arguments:
      "--srr": SRR_files
    command_template: >
      {pipeline.path} --srr {sample.SRR_files}
    compute:
      bulker_crate: databio/sra_convert
      size_dependent_variables: resources.tsv
[mjs5kd@udc-ba36-36 paqc](master): c ${CODE}/sra_convert/resources.tsv
max_file_size  cores  mem    time
NaN            1      8000   00-04:00:00
0.05           2      12000  00-08:00:00
0.5            4      16000  00-12:00:00
1              8      16000  00-24:00:00
10             16     32000  02-00:00:00
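A quick way to confirm which row of the TSV a dry run resolved to is to pull the resource lines back out of the generated script; for the 0.00 Gb sample above, this returns the first row's values (1 core, 8000 mem, 00-04:00:00):

```bash
# check which resource values ended up in the generated submission script
grep -E "SBATCH --(mem|cpus-per-task|time)" \
  /project/shefflab/processed/paqc/submission/convert_GSM4289908.sub
```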
Perfect -- can you push those changes to sra_convert?
I got mixed up between the adapter changes and the compute changes :)
nm, I got it. Works! Thanks.
Adapters allow you to use divvy with any source of variables.

divvy originally was part of looper; therefore, the default divvy variables (like {CODE}, etc.) are from looper. Removing divvy from looper decoupled the software, but the variables are still tightly coupled. To make it more flexible, we need to remove this coupling. divvy adapters do that. Here's a config file with adapters:
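Something along these lines; this is a sketch reconstructed from the adapter mapping worked out earlier in the thread, and the compute_packages block is illustrative rather than a copy of the real divcfg file:

```yaml
adapters:
  CODE: looper.command
  JOBNAME: looper.job_name
  LOGFILE: looper.log_file
  CORES: compute.cores
  TIME: compute.time
  MEM: compute.mem
  DOCKER_ARGS: compute.docker_args
  DOCKER_IMAGE: compute.docker_image
  SINGULARITY_IMAGE: compute.singularity_image
  SINGULARITY_ARGS: compute.singularity_args
compute_packages:
  slurm:
    # illustrative values; use the template and command from your own divcfg setup
    submission_template: templates/slurm_template.sub
    submission_command: sbatch
```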
Adapters are simple variable mappings from one name to another. They can just be straight-up var:var mappings, but they can also include namespaces (on the supply side; divvy variables aren't namespaced).
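For instance, both forms side by side (the bare "memory" source name is hypothetical, just to show an un-namespaced mapping):

```yaml
adapters:
  MEM: memory          # straight var:var mapping, no namespace on the supply side
  CORES: compute.cores # namespaced supply: the "cores" attribute of the "compute" namespace
```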
This system would allow us to include a 'divvy-looper' adapter. This adapter could be modified either for a universal divvy config or for a particular compute package, which would enable divvy templates to be used with multiple variable sources.
Under this system, looper would simply provide to divvy all available namespaces, the same as it does for command templates. The adapter would convert these into the divvy variables. The advantage is that divvy templates are now useful beyond looper. It also simplifies what looper has to do: nothing.
divvy should ship with looper adapters, something like the above example.
What do you think, @stolarczyk?