DataBiosphere / toil

A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
http://toil.ucsc-cgl.org/.
Apache License 2.0

rna-seq pipeline #4932

Open chubukov opened 6 months ago

chubukov commented 6 months ago

I'm wondering if there is concise documentation somewhere for reproducing the exact analysis of the TCGA/GTEx data in this paper (but with new samples).

https://github.com/BD2KGenomics/toil-rnaseq/tree/master does not seem to be maintained any more, and I was unable to pip install toil-rnaseq with either Python 2.7 or Python 3.11.

Thanks.

Issue is synchronized with this Jira Story. Issue Number: TOIL-1569

adamnovak commented 6 months ago

It looks like toil-rnaseq did not set a maximum Toil version: https://github.com/BD2KGenomics/toil-rnaseq/blob/23045b896a2e08a61284d63c822ccfdcd3dab0a7/version.py#L17-L18. So installing it with pip install will pull in whatever version of Toil is current, which could be too new to be compatible with it.
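As a rough illustration of the missing upper bound, here is a minimal sketch of a check that compares a Toil version string against a maximum. The bound of 3.12.0 and the helper names are assumptions made for the example; toil-rnaseq itself does not declare a maximum, and the right value would have to be confirmed against the workflow.

```python
# Assumed upper bound, for illustration only; toil-rnaseq does not declare one.
ASSUMED_MAX_TOIL = "3.12.0"


def version_tuple(text):
    """Turn '3.12.0' or '5.4.0a1' into a tuple of the leading integer of each dotted part."""
    parts = []
    for piece in text.strip().split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first component that has no leading number.
            break
        parts.append(int(digits))
    return tuple(parts)


def is_too_new(installed, maximum=ASSUMED_MAX_TOIL):
    """Return True when the installed version is above the assumed maximum."""
    return version_tuple(installed) > version_tuple(maximum)
```

The installed version string could be read from the output of pip show toil and passed to is_too_new before attempting to run the workflow.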

It looks like the workflow has a Docker container available, and instructions for using it, so that is probably the easiest way to run the workflow on new samples. The container seems to bundle an appropriate version of Toil. You might need to make sure that the Docker daemon you are using is older than v26: in Docker v26 the ability to pull images in formats older than v2 schema 2 was turned off, and some images the pipeline requires, such as docker.io/jvivian/rsem_postprocess:latest, are in schema 1 format.
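A minimal sketch of how the daemon version could be checked before trying the pipeline. The threshold of 26 comes from the comment above; the function names are made up for this example, and the check looks only at the major version number, not at any daemon configuration.

```python
import subprocess

# First Docker major version in which schema 1 pulls are turned off,
# per the comment above.
FIRST_MAJOR_WITHOUT_SCHEMA1 = 26


def docker_major(version_text):
    """Extract the major version from strings like '25.0.3' or '20.10.24+dfsg1'."""
    return int(version_text.strip().split(".", 1)[0])


def can_pull_schema1(version_text):
    """Return True when the daemon version predates the schema 1 cutoff."""
    return docker_major(version_text) < FIRST_MAJOR_WITHOUT_SCHEMA1


def server_version():
    """Ask the local daemon for its version; needs the docker CLI and a running daemon."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Server.Version}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling can_pull_schema1(server_version()) on the machine that will run the container gives a quick yes/no before any images are pulled.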

How the original workflow was run is described in "Supplementary Note 6" in the paper, but the procedure given there may no longer work, because the published versions of packages it installs from the Internet may have changed. I think there may also have been changes to the AWS API, so Toil 3.12 and/or the unmaintained cgcloud might not be able to talk to it anymore.

chubukov commented 6 months ago

Thanks @adamnovak, I didn't appreciate that I could just run the docker container without the python wrapper. I'll give it a shot.