Oshlack / JAFFA

JAFFA is a multi-step pipeline that takes either raw RNA-Seq reads, or pre-assembled transcripts, then searches for gene fusions
https://github.com/Oshlack/JAFFA/wiki

Updated Dockerfile for easier deployment and consistent builds #105

Closed: olliecheng closed this 1 month ago

olliecheng commented 1 month ago

Hi,

This PR contains an updated Docker build process which I hope will become a supported and well-maintained deployment method for JAFFA.

It supports GitHub Actions for automatic deployment, includes a slimmed-down final run stage, and uses a fixed set of binaries and runtimes that mirror the settings recommended in the JAFFA wiki. This PR also includes a patch for the bpipe bug in https://github.com/ssadedin/bpipe/issues/290, an issue that occurs at random and can appear in both container and native runs.

Improvements compared to existing approaches:

The main goal is to produce an image that can be easily updated if JAFFA changes, with as little change to the Dockerfile as possible.

In my testing, the outputs of the native and the Docker JAFFAL pipelines are identical when run using the instructions and data provided in the wiki. For direct, assembly, and hybrid modes, I think the output is unlikely to be fully identical because of randomness in bowtie2. However, the first 9 columns of JAFFA_results.csv are identical.
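For anyone who wants to repeat the comparison, here is a minimal sketch of one way to check that two result files agree on their first 9 columns. It is not the script I used. The function names and the example paths are hypothetical, and ignoring row order is my own assumption (drop the `Counter` and compare lists if order matters to you).

```python
import csv
from collections import Counter


def leading_columns(path, n_cols=9):
    """Return a multiset of the first n_cols fields of every row (header included)."""
    with open(path, newline="") as handle:
        return Counter(tuple(row[:n_cols]) for row in csv.reader(handle))


def same_leading_columns(native_csv, docker_csv, n_cols=9):
    """True when both files agree on the first n_cols columns.

    Assumption: row order is ignored, so the same rows in a different
    order still count as a match.
    """
    return leading_columns(native_csv, n_cols) == leading_columns(docker_csv, n_cols)
```

Usage would look something like `same_leading_columns("native/JAFFA_results.csv", "docker/JAFFA_results.csv")`, where both paths are placeholders for wherever the two runs wrote their output.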

You can try out my build of version 2.4 at docker://ghcr.io/olliecheng/jaffa:latest.

olliecheng commented 1 month ago

I am also interested in adding documentation to the JAFFA wiki that explains how to use the Docker image alongside the regular installation method. That way, users won't need to read the Dockerfile to work out where JAFFA expects the reference files to be located or where to find the pipeline files.