Open chubukov opened 6 months ago
It looks like `toil-rnaseq` did not set a maximum Toil version:
https://github.com/BD2KGenomics/toil-rnaseq/blob/23045b896a2e08a61284d63c822ccfdcd3dab0a7/version.py#L17-L18
So installing it with `pip install` will pull in versions of Toil that could be too new to be compatible with it.
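To illustrate why a missing upper bound matters, here is a minimal sketch of a version check with and without a maximum. The function names and the version numbers are hypothetical and are not taken from `version.py`; the helper only handles plain dotted numeric versions, not pre-release tags.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '3.12.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def is_allowed(version: str, minimum: str, maximum: Optional[str] = None) -> bool:
    """Return True if minimum <= version < maximum (maximum is optional)."""
    parsed = parse_version(version)
    if parsed < parse_version(minimum):
        return False
    # Without a maximum, every newer release passes the check.
    if maximum is not None and parsed >= parse_version(maximum):
        return False
    return True
```

With only a lower bound, `is_allowed("5.0.0", "3.0.0")` accepts a much newer release, while adding a hypothetical cap such as `is_allowed("5.0.0", "3.0.0", "4.0.0")` rejects it.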
It looks like the workflow has a Docker container available, along with instructions for using it, so that is probably the easiest way to run the workflow on new samples. The container seems to bundle an appropriate version of Toil. You might need to make sure that the Docker daemon you are using is sufficiently old: in Docker v26, the ability to pull images in formats older than v2 schema 2 was turned off, and some images the pipeline requires, such as `docker.io/jvivian/rsem_postprocess:latest`, are in schema 1 format.
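A quick way to check a daemon against that cutoff could look like the sketch below. It assumes the v26 threshold stated above and that `docker version --format '{{.Server.Version}}'` is available on the host; the function names are illustrative only.

```python
import subprocess

# Assumption from the comment above: schema 1 pulls stopped working in v26.
SCHEMA1_CUTOFF_MAJOR = 26


def docker_major(version: str) -> int:
    """Extract the major version from a string such as '25.0.3'."""
    return int(version.strip().split(".")[0])


def can_pull_schema1(version: str) -> bool:
    """Return True if this daemon version predates the assumed cutoff."""
    return docker_major(version) < SCHEMA1_CUTOFF_MAJOR


def installed_docker_version() -> str:
    """Ask the local Docker daemon for its server version."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Server.Version}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Calling `can_pull_schema1(installed_docker_version())` on the machine that will run the pipeline gives a rough indication of whether the older images can still be pulled.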
How the original workflow was run is described in "Supplementary Note 6" of the paper, but the procedure given may no longer work, because the published versions of packages installed from the Internet may have changed. I think there may also have been changes to the AWS API, so Toil 3.12 and/or the unmaintained `cgcloud` might not be able to talk to it anymore.
Thanks @adamnovak, I didn't appreciate that I could just run the docker container without the python wrapper. I'll give it a shot.
I'm wondering if there is concise documentation somewhere for reproducing the exact analysis of the TCGA/GTEx data in this paper (but with new samples).
https://github.com/BD2KGenomics/toil-rnaseq/tree/master does not seem to be maintained any more, and I was unable to `pip install toil-rnaseq` with either python2.7 or python3.11. Thanks.
┆Issue is synchronized with this Jira Story ┆Issue Number: TOIL-1569