TheJacksonLaboratory / splicing-pipelines-nf

Repository for the Anczukow-Lab splicing pipeline

Add task stat parameter to rMATS #232

Open angarb opened 3 years ago

angarb commented 3 years ago

Problem

A new parameter --task stat has been added to rMATS. This will allow us to do statistics separately from calculating the PSI values.

Solution

Sangram checked the --task option as part of upgrading the rMATS options. Docs: https://github.com/Xinglab/rmats-turbo/blob/v4.1.1/README.md#running-prep-and-post-separately

This seems a bit complex; it needs some thought on how to separate prep and stats into two different processes.

Implementation

We need to brainstorm. The best option might be to design a prep parameter and run the stats parameter with the second half of the pipeline (or rMATS alone).

Vlad-Dembrovskyi commented 3 years ago

Suggestion: add documentation to the repo docs on how to run rMATS stats standalone, outside the pipeline, for example by using the pipeline's container or an installed copy of the tool. Provide examples of how to do that on the pipeline's output plus additional external samples.
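A minimal sketch of what such a standalone stats run might look like. It assumes, based on the rMATS README linked earlier in this thread, that `--task stat` reads existing event definition and count files from the directory given to `--od` and uses `--tmp` for scratch files; verify this against the rMATS version in the pipeline's container. The helper name and all paths are hypothetical, and the command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the rMATS command line for a stats-only run.
# Arguments: path to rmats.py, directory with existing rMATS files, tmp directory.
build_rmats_stat_cmd() {
  local rmats_py="$1" out_dir="$2" tmp_dir="$3"
  printf 'python %s --od %s --tmp %s --task stat\n' "$rmats_py" "$out_dir" "$tmp_dir"
}

# Dry run: print the command so it can be reviewed before running it
# inside the container. All three paths are placeholders.
build_rmats_stat_cmd /rmats/rmats.py results/rmats_stat results/rmats_tmp
```

Printing the command first makes it easy to check the paths before launching the run with the pipeline's container or an installed rMATS.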

angarb commented 2 years ago

@mkostich has worked on this. We must integrate it into the pipeline:

Notes: "I forked your repo and edited the fork. That way I didn't have to worry about permissions issues. We can merge the fork back into your repo when you are happy with it.

Fork is here: https://github.com/mkostich/splicing-pipelines-nf

Branch with changes is 'mitch1' here: https://github.com/mkostich/splicing-pipelines-nf/tree/mitch1

I didn't edit the docs or the wiring diagram. Want to make sure it works for you first. Test data seemed to work on Sumner, but we are still fiddling with GCP. Also, needs testing with single-ended short-read data. Two-stage rMATS is only implemented for when you supply the rmats_pairs (.txt) file. We got pulled in other directions, but hope to get back to the GCP aspect next week.

install and get right branch:

git clone https://github.com/mkostich/splicing-pipelines-nf.git
cd splicing-pipelines-nf
git checkout mitch1

edit line 11 in main.pbs to match your install location: project_dir=/path/to/splicing-pipelines-nf

copy 'my.cfg' (equivalent of NF_splicing_pipeline.config, just a more convenient name) to your working directory (where you launch nextflow) and edit it there. Make sure to check assembly_name, readlength and minlen, and ensure minlen < readlength to allow for adapter removal. Then launch the pipeline:

sbatch /path/to/splicing-pipelines-nf/main.pbs my.cfg
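The minlen < readlength requirement mentioned above can be checked before submitting the job. A minimal sketch: the function name is hypothetical, the values 48 and 20 are illustrative, and the numbers are passed in by hand rather than parsed from my.cfg.

```shell
#!/usr/bin/env bash
# Hypothetical pre-submission check: minlen must be strictly less than
# readlength so that adapter-trimmed reads are still accepted.
check_lengths() {
  local readlength="$1" minlen="$2"
  if [ "$minlen" -lt "$readlength" ]; then
    echo "ok"
  else
    echo "error: minlen ($minlen) must be less than readlength ($readlength)" >&2
    return 1
  fi
}

# Illustrative values only; use the ones set in your copy of my.cfg.
check_lengths 48 20
```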

angarb commented 2 years ago

I created a pull request for the changes here: #327