UCSC-Treehouse / pipelines

Makefiles to run dockerized pipelines used in Treehouse on a single sample
Apache License 2.0

canonical transcripts, gene model, genomic coordinates #1

Closed: cchng closed this issue 7 years ago

cchng commented 7 years ago

Hi,

I'm processing some expression data from your project. I have a list of gene symbols, and my goal is to get their corresponding genomic coordinates. How are GENCODE transcripts converted to gene symbols in your pipeline? For example, is there a list of canonical transcripts? Are values collapsed or averaged? Any input on your post-processing/consolidate output step seen here would be appreciated!

Thanks, Carolyn

rcurrie commented 7 years ago

Hi Carolyn,

The underlying pipeline is linked from the README:

https://github.com/BD2KGenomics/toil-rnaseq

Look through the primary Python pipeline file for details on what's being called and how the output is processed:

https://github.com/BD2KGenomics/toil-rnaseq/blob/master/src/toil_rnaseq/rnaseq_cgl_pipeline.py
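For the gene symbol to coordinates part of the question, here is a rough sketch that is not taken from the pipeline. It assumes you have a GENCODE-style GTF annotation on hand and reads the `gene` records, pulling the symbol from the `gene_name` attribute. The function name `gene_coordinates`, the placeholder symbol `GENE_A`, and the example records are all made up for illustration; which annotation release matches the expression data is something to confirm in the pipeline source.

```python
import re

# Hypothetical helper, not part of toil-rnaseq: map gene symbols to
# coordinates using the "gene" records of a GENCODE-style GTF annotation.
GENE_NAME_RE = re.compile(r'gene_name "([^"]+)"')


def gene_coordinates(gtf_lines):
    """Return {gene_symbol: (chrom, start, end, strand)} from GTF lines."""
    coords = {}
    for line in gtf_lines:
        if line.startswith("#"):  # skip header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; keep only "gene" features.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        match = GENE_NAME_RE.search(fields[8])
        if match is None:
            continue
        # GTF coordinates are 1-based and inclusive.
        # If a symbol appears more than once, the first record is kept.
        coords.setdefault(
            match.group(1),
            (fields[0], int(fields[3]), int(fields[4]), fields[6]),
        )
    return coords


# Made-up records for illustration only (not real annotation data).
EXAMPLE_GTF = [
    "##description: illustrative example",
    'chr1\tEXAMPLE\tgene\t100\t500\t.\t+\t.\tgene_id "G1"; gene_name "GENE_A";',
    'chr1\tEXAMPLE\ttranscript\t100\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENE_A";',
]

if __name__ == "__main__":
    for symbol, location in gene_coordinates(EXAMPLE_GTF).items():
        print(symbol, location)
```

With a real annotation you would pass an open file handle instead of `EXAMPLE_GTF`, since the function only needs an iterable of lines.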

Rob

cchng commented 7 years ago

Thanks Rob!