aryarm / as_analysis

A complete Snakemake pipeline for detecting allele specific expression in RNA-seq
MIT License

create a small test dataset and a testing suite #73

Open aryarm opened 3 years ago

aryarm commented 3 years ago

This is so that we can be sure that future changes to the code don't change the results of the pipeline. We should upload both the dataset and the test suite to GitHub and run them with CI.
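As a rough sketch of what such a check could look like (not something that exists in the repo yet): after running the pipeline on the test dataset, compare every file in a directory of known-good outputs against the freshly generated ones by checksum. The function names and directory layout here are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_outputs(expected_dir, actual_dir):
    """List files under expected_dir that are missing from, or differ in, actual_dir.

    Paths are returned relative to expected_dir, with forward slashes.
    Both directory names are placeholders for wherever the known-good
    and newly generated pipeline outputs live.
    """
    expected_dir, actual_dir = Path(expected_dir), Path(actual_dir)
    changed = []
    for expected in sorted(p for p in expected_dir.rglob("*") if p.is_file()):
        rel = expected.relative_to(expected_dir)
        actual = actual_dir / rel
        if not actual.is_file() or sha256sum(actual) != sha256sum(expected):
            changed.append(rel.as_posix())
    return changed
```

A CI job could fail whenever `changed_outputs(...)` returns a non-empty list. One caveat: a byte-for-byte comparison only works for deterministic outputs, so files that embed timestamps or command lines (some compressed or binary formats do) would probably need a content-level comparison instead.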

aryarm commented 3 years ago

I tried to start doing this with a few chromosomes from the ATAC-seq rat dataset.

But after talking to Graham, we decided that it would be a better idea to use the GM12878 dataset for this rather than the rat dataset. But we'll probably just want to use a single chromosome from GM12878, so that the test dataset finishes running quickly.

Ideally, the test dataset would include both ATAC-seq and RNA-seq data, and it would run through the entire pipeline quickly (~10 minutes or so). This may even necessitate creating a reference genome from a small subset of a chromosome.
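For the small reference genome, here is a minimal sketch of cutting a region out of a FASTA file in plain Python. The function name, the 0-based half-open coordinates, and the `contig:start-end` header format are my own choices for illustration; in practice a tool like `samtools faidx` is the more usual way to do this.

```python
def subset_fasta(in_path, out_path, contig, start, end, width=60):
    """Write bases [start, end) (0-based, half-open) of `contig` to out_path.

    Returns the number of bases written. The new record is named
    "<contig>:<start+1>-<last base>" using 1-based inclusive coordinates.
    """
    pieces = []
    seen = 0           # bases of the target contig read so far
    in_target = False
    with open(in_path) as fasta:
        for line in fasta:
            line = line.rstrip("\r\n")
            if line.startswith(">"):
                if in_target:
                    break  # reached the record after the target contig
                name = line[1:].split(None, 1)
                in_target = bool(name) and name[0] == contig
                continue
            if not in_target or seen >= end:
                continue
            lo = max(start - seen, 0)
            hi = min(end - seen, len(line))
            if lo < hi:
                pieces.append(line[lo:hi])
            seen += len(line)
    seq = "".join(pieces)
    if not seq:
        raise ValueError(f"no bases found for {contig}:{start}-{end}")
    with open(out_path, "w") as out:
        out.write(f">{contig}:{start + 1}-{start + len(seq)}\n")
        for i in range(0, len(seq), width):
            out.write(seq[i:i + width] + "\n")
    return len(seq)
```

Note that reads and variants would have to be restricted to the same region, and their coordinates shifted to match the new reference, for the rest of the test dataset to stay consistent with it.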

aryarm commented 3 years ago

Just a note that there's a Snakemake feature that actually makes this easy to do! https://snakemake.readthedocs.io/en/stable/snakefiles/testing.html

Unfortunately, it is buggy for files marked as pipe() and also has some weird directory issues.