jfarek / xatlas

BSD 3-Clause "New" or "Revised" License

Docker and DNAnexus applet #6

Closed ACEnglish closed 2 years ago

ACEnglish commented 2 years ago

I've built a Dockerfile and a DNAnexus applet for xAtlas. Below are notes on how to use these files, intended for the documentation. Note that the DNAnexus applet pulls from docker.io and currently points to acenglish/xatlas. I believe it should instead point to a jfarek/xatlas repo and use :tag versioning if this is to be properly maintained.

Documentation:

Docker

A Dockerfile exists to build an image of xAtlas. To make a Docker image, clone the repository and run

docker build -t xatlas .

You can then run xAtlas through Docker using

docker run -v `pwd`:/data -it xatlas

The `pwd` substitution can be replaced by any directory you'd like to mount into the container at /data, which is the working directory for the xAtlas run. You can also pass parameters directly to the entry point:

docker run -v `pwd`:/data -it xatlas -r reference.fa -P -t $(nproc) -i ${in_name} -s ${sample_name} -p ${prefix}
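
To mount a directory other than the current one, the same invocation can be assembled by a small helper. This is only a sketch: xatlas_docker_cmd is a hypothetical name that is not part of the repository, and it prints the command instead of running it so that it can be inspected first.

```shell
# Hypothetical helper (not part of xAtlas): print the docker command that
# mounts the given host directory at /data and forwards the remaining
# arguments to the image's entry point.
xatlas_docker_cmd() {
    # Resolve to an absolute path so docker treats it as a bind mount
    # rather than a named volume.
    host_dir=$(cd "$1" && pwd) || return 1
    shift
    printf 'docker run -v "%s":/data -it xatlas' "$host_dir"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}
```

For example, xatlas_docker_cmd /path/to/bams -r reference.fa -P prints a command that mounts /path/to/bams at /data, provided that directory exists.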

DNAnexus applet

We provide a DNAnexus applet which wraps all the parameters and will run the docker image. To add the applet to your DNAnexus project, clone the repository and run the command:

dx build xatlas_nexus/

You can then run the applet via the GUI or dx-toolkit command line:

dx run xatlas -iin=sequencing/HG00096.bam \
    -iin_index=sequencing/HG00096.bam.bai \
    -iref=GRCh38_1kg_mainchrs.fa \
    -iprefix=HG00096.xatlas \
    -isample_name=HG00096 \
    -icatch_fail=false -icapture_bed=test.bed

Note that the files (e.g. sequencing/HG00096.bam) are located inside the project.

Sometimes xAtlas may exit non-zero but still produce a (valid enough) VCF result. To salvage compute costs for semi-successful runs, there is a parameter named catch_fail which, when set to false, will catch and ignore non-zero exits by xAtlas and still attempt to upload the output VCFs. Note that we consider this functionality unsafe, since it may make failed runs harder to identify. We therefore don't recommend using this option in production.
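
The catch-and-ignore behaviour described above can be sketched as a shell wrapper. This is an illustration under assumptions, not the applet's actual code: run_and_maybe_ignore and its ignore_failure argument are hypothetical names, and how the applet maps catch_fail onto such a flag is not shown here.

```shell
# Sketch only: the applet's real implementation may differ.
# Usage: run_and_maybe_ignore <ignore_failure: true|false> <command> [args...]
run_and_maybe_ignore() {
    ignore_failure=$1
    shift
    status=0
    # Run the wrapped command and remember its exit status.
    "$@" || status=$?
    if [ "$status" -ne 0 ] && [ "$ignore_failure" = "true" ]; then
        # Report the failure on stderr, then carry on so that later steps
        # (such as uploading the output VCFs) can still run.
        echo "warning: command exited with status $status; continuing" >&2
        return 0
    fi
    return "$status"
}
```

Because the warning goes only to stderr, a failed run handled this way looks successful to whatever checks the exit status, which is the reason the option is considered unsafe.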

jfarek commented 2 years ago

Thanks Adam, this looks great! Meant to get to this sooner.