d3b-center / ticket-tracker-OPC

A repo to generate and track tickets for ped OT

Remap GTEx TPM matrix to GENCODE v39 gene symbols #522

Closed jharenza closed 1 year ago

jharenza commented 1 year ago

What data file(s) does this issue pertain to?

GTEx GENCODE v26 TPM file can be downloaded here

What release are you using?

v11

Put your question or report your issue here.

Currently, we have the GTEx RNA-Seq expression data contained within gene-expression-rsem-tpm-collapsed.rds. It was not yet included in the file in v12.

  1. First, we should convert the symbols over to those matching GENCODE v39 as per #521
  2. Should we include GTEx with this ped tumor expression matrix in v12, or should we make it a separate file? If we have 3 separate files for TCGA, GTEx, and ped tumors, then we would also want a merge/collapse of all 3 for the API/PedcBio work. cc @migbro @ewafula @afarrel @logstar for input
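
For discussion, here is a minimal sketch of the two steps above: relabel rows to new symbols, then merge with another matrix. Everything in it is an assumption for illustration. The IDs, symbols, and function names are hypothetical, duplicates are resolved by keeping the row with the highest mean TPM (one possible rule, not necessarily what we will use), and the merge keeps only genes present in every matrix.

```python
from statistics import mean


def remap_and_collapse(matrix, id_to_symbol):
    """Relabel rows with new symbols. When several IDs map to the same
    symbol, keep the row with the highest mean TPM (assumed rule)."""
    collapsed = {}
    for gene_id, values in matrix.items():
        symbol = id_to_symbol.get(gene_id)
        if symbol is None:
            continue  # ID absent from the new annotation: dropped in this sketch
        if symbol not in collapsed or mean(values) > mean(collapsed[symbol]):
            collapsed[symbol] = values
    return collapsed


def merge_matrices(*matrices):
    """Concatenate sample columns for the genes shared by every matrix."""
    shared = set(matrices[0])
    for m in matrices[1:]:
        shared &= set(m)
    return {g: [v for m in matrices for v in m[g]] for g in sorted(shared)}


# Toy data: rows are genes, list entries are per-sample TPM values.
gtex = {"ENSG_A": [1.0, 3.0], "ENSG_B": [5.0, 7.0], "ENSG_C": [2.0, 2.0]}
id_to_symbol = {"ENSG_A": "GENE1", "ENSG_B": "GENE1", "ENSG_C": "GENE2"}

gtex_remapped = remap_and_collapse(gtex, id_to_symbol)
ped = {"GENE1": [0.5], "GENE2": [9.0], "GENE3": [4.0]}
merged = merge_matrices(ped, gtex_remapped)
```

If we go with 3 separate files, the same merge step could produce the combined matrix for the API/PedcBio work, at the cost of dropping genes that are missing from any one source.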

Who will complete this work?

@migbro and @zhangb1 please coordinate best path for this

migbro commented 1 year ago

I guess a quick question - seems you pull the transcript file, which would require collapsing to gene level, but they also have a gene level file. Why not use that one?

jharenza commented 1 year ago

> I guess a quick question - seems you pull the transcript file, which would require collapsing to gene level, but they also have a gene level file. Why not use that one?

Ahh, I didn't see it when I was doing this last night - exhaustion getting the best of me - yes, we should use the gene file
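
For reference, a rough sketch of what collapsing the transcript file to gene level would involve, assuming gene TPM is taken as the sum of its transcripts' TPMs. The transcript IDs, gene names, and the function name are hypothetical; using the gene-level file avoids this step entirely.

```python
def sum_transcripts_to_genes(transcript_tpm, transcript_to_gene):
    """Sum per-sample transcript TPM values into one row per gene."""
    gene_tpm = {}
    for tx_id, values in transcript_tpm.items():
        gene = transcript_to_gene[tx_id]
        if gene in gene_tpm:
            # Add this transcript's values to the running per-sample totals.
            gene_tpm[gene] = [a + b for a, b in zip(gene_tpm[gene], values)]
        else:
            gene_tpm[gene] = list(values)
    return gene_tpm


# Toy data: two transcripts of GENE1 and one of GENE2, two samples each.
transcript_tpm = {"TX1": [1.0, 2.0], "TX2": [3.0, 4.0], "TX3": [0.5, 0.5]}
transcript_to_gene = {"TX1": "GENE1", "TX2": "GENE1", "TX3": "GENE2"}

gene_tpm = sum_transcripts_to_genes(transcript_tpm, transcript_to_gene)
```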

zhangb1 commented 1 year ago

@jharenza sorry, just asking here: do you have the gene count file for v26 somewhere as well? It seems we need to have the GTEx count results too.

migbro commented 1 year ago

@zhangb1 go here: https://gtexportal.org/home/datasets Navigate to the v8 mRNAseq data. That's where I got the gene TPM from; the others are there too.

migbro commented 1 year ago

GTEx count task here: https://cavatica.sbgenomics.com/u/d3b-bixu-ops/open-target-tcga-rnaseq-counts/tasks/d34e5fa8-5de1-4d82-bc6d-023a926cf256/

GTEx TPM task here: https://cavatica.sbgenomics.com/u/d3b-bixu-ops/open-target-tcga-rnaseq-counts/tasks/29d89396-0129-475a-a8c4-502721c3134b/

migbro commented 1 year ago

Closing, as the tool that was developed addresses this issue.