COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

Quant.sf index issue #640

Open lkw159159 opened 3 years ago

lkw159159 commented 3 years ago

Hello, I'm new to bioinformatics.

When I load my quant.sf files into R to use with DESeq2, the transcript names look wrong.

I expected each name to look like ENST00000456328.2 or ENSG00000223972.5, but instead each name is the full string "ENST00000456328.2|ENSG00000223972.5|OTTHUMG00000000961.2|OTTHUMT00000362751.1|DDX11L1-202|DDX11L1|1657|processed_transcript|"

I think that is why I wasn't able to import my quantification files into R using tximport().

I downloaded the annotation "gencode.v37.annotation.gtf.gz" from GENCODE and used the transcript reference "gencode.v37.transcripts.fa".


Thank you for your help.

ACastanza commented 3 years ago

Not affiliated with the Salmon team, but since you didn't get an answer here... When building an index from a GENCODE transcriptome, you should pass the `--gencode` flag to the indexer. This tells salmon to split each record name at the first | character and keep only the part before it, which gives you the expected "ENST00000456328.2"-style names.
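
To make the effect of that splitting concrete, here is a minimal Python sketch of the same idea: keep only the text before the first | in each name. It is not salmon's own code, and the helper names `trim_gencode_name` and `trim_quant_sf` are hypothetical, made up for this illustration. It assumes a tab-separated quant.sf whose first column is Name and whose first line is a header. Rebuilding the index with `--gencode` and re-running quantification is still the cleaner fix.

```python
def trim_gencode_name(name: str) -> str:
    """Return the part of a GENCODE-style record name before the first '|'."""
    return name.split("|", 1)[0]


def trim_quant_sf(lines):
    """Yield quant.sf lines with the Name column (first tab-separated field) trimmed.

    Assumes the first line is the header and is passed through unchanged.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i == 0:
            yield line  # header row: Name, Length, EffectiveLength, TPM, NumReads
            continue
        name, sep, rest = line.partition("\t")
        yield trim_gencode_name(name) + sep + rest


if __name__ == "__main__":
    long_name = (
        "ENST00000456328.2|ENSG00000223972.5|OTTHUMG00000000961.2|"
        "OTTHUMT00000362751.1|DDX11L1-202|DDX11L1|1657|processed_transcript|"
    )
    print(trim_gencode_name(long_name))
```

Applied to the name from the question, the helper returns only the leading transcript ID, which is the form a transcript-to-gene table built from the GENCODE annotation would normally use.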

lkw159159 commented 3 years ago

I appreciate your answer. Thanks a lot.
