twlab / TEProf2Paper

TEProf2 Pipeline used to find promoters and predict protein sequences from RNA-sequencing data

Test data for verification purposes? #12

Closed WesleyRosales closed 9 months ago

WesleyRosales commented 10 months ago

Hi all, thank you so much for sharing such an extraordinary pipeline. I am trying to set it up in my own environment; however, I would like to verify that it works as expected on a small dataset with known outputs. Is there such a dataset, perhaps one used during development, that we could use as well? Thank you!

nakul2234 commented 10 months ago

Hello,

Thank you for your comments!

We used the cell line RNA-sequencing data from the following publication: https://www.nature.com/articles/nbt.3080. Unfortunately, I do not think we have permission to directly share that cell line data.

If you run the TEProf2 pipeline in reference-guided mode with the reference from the paper (provided in the README), you should get the same candidates as those in Supplementary Table S8 of the publication. Any of the cell line RNA-sequencing data would therefore be good to use. DMS 53 is the cell line we analyzed most and have data for, so it may be a good one to use.

-Nakul Shah
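
One way to carry out the comparison described above is to export the candidate identifiers from Supplementary Table S8 and from your own run into delimited text files, then diff the two sets. The sketch below is only an illustration, not part of TEProf2: the column name `candidate_id`, the toy identifiers, and the helper names `read_id_column` and `compare_candidates` are all hypothetical and need to be adapted to whatever your exported tables actually contain.

```python
import csv
import io


def read_id_column(handle, column, delimiter="\t"):
    """Collect the non-empty values of one named column from a delimited table."""
    ids = set()
    for row in csv.DictReader(handle, delimiter=delimiter):
        value = (row.get(column) or "").strip()
        if value:
            ids.add(value)
    return ids


def compare_candidates(expected, observed):
    """Split IDs into those in both sets, only in expected, and only in observed."""
    return {
        "shared": sorted(expected & observed),
        "missing": sorted(expected - observed),
        "extra": sorted(observed - expected),
    }


# Toy stand-ins for the two tables. The column name and ID format are
# assumptions for illustration; replace them with open(...) handles on
# your own exports of Table S8 and of the pipeline output.
expected_table = io.StringIO("candidate_id\nTE_A\nTE_B\nTE_C\n")
observed_table = io.StringIO("candidate_id\tscore\nTE_B\t1\nTE_C\t2\nTE_D\t3\n")

expected_ids = read_id_column(expected_table, "candidate_id")
observed_ids = read_id_column(observed_table, "candidate_id")
report = compare_candidates(expected_ids, observed_ids)
print(report)
```

With real files, an empty `missing` list would mean every expected candidate was also found in your run. How identifiers should be matched between the two tables depends on their actual layout, so treat the exact-string comparison here as a starting point.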