Closed houruiyan closed 1 year ago
Hi @houruiyan! Notice how in the transcriptomes, the gene IDs include the isoform suffix (`ENSG.....1234.1`), whereas in your data they do not. So the strings don't match up.
What I would recommend is to read the mapping tables (e.g. `mo_to_ra.txt`) and delete the `.1`/`.2`/`.3` suffixes at the end of all the gene names, then save the tables. That should fix your problem.
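A minimal sketch of that suffix stripping, assuming the mapping table is a tab-separated file with no header row and gene IDs in the first two columns (check this against your own files). The helper names `strip_suffix` and `strip_table` are hypothetical and not part of SAMap, and the demo rows are made up.

```python
import csv
import io
import re

# Matches a trailing ".<digits>" suffix, e.g. the ".1" in "A_GENE1.1".
_SUFFIX = re.compile(r"\.\d+$")


def strip_suffix(gene_id):
    """Return gene_id without a trailing .<number> suffix, if it has one."""
    return _SUFFIX.sub("", gene_id)


def strip_table(src, dst, id_columns=(0, 1)):
    """Copy a tab-separated table from src to dst, stripping suffixes.

    Assumption: the gene IDs live in the columns listed in id_columns
    (first two columns by default) and the table has no header row.
    """
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        for i in id_columns:
            if i < len(row):
                row[i] = strip_suffix(row[i])
        writer.writerow(row)


# In-memory demonstration with made-up rows.
demo_in = io.StringIO("A_GENE1.1\tB_GENE1.2\t98.5\nA_GENE1.2\tB_GENE2.1\t75.0\n")
demo_out = io.StringIO()
strip_table(demo_in, demo_out)
```

To apply this to a real table, open the input with `open(path, newline="")` and write to a new file rather than overwriting the original, so the untouched table is still there if the column assumption turns out to be wrong.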
Best, Alec
Thank you very much, Alec!
Maybe it is not correct to just delete `.1`, `.2`, `.3`? Different isoforms have different values in that mapping table. So which isoform's value should I select for a given gene? Hope to hear from you. Thank you!
It's okay to delete `.1`, `.2`, `.3`, as SAMap will just collapse all isoforms into one node for that transcript and combine all the different mapping values.
So if `A_GENE1.1` maps to `B_GENE1` and `A_GENE1.2` maps to `B_GENE2`, then `A_GENE1` will map to `B_GENE1` and `B_GENE2`.
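The collapsing described above can be illustrated with a small sketch. This only shows the idea of merging isoform-level hits into one gene-level entry; it is not SAMap's internal code, it does not show how SAMap combines the mapping values, and the function name `collapse_isoforms` is hypothetical.

```python
import re
from collections import defaultdict


def collapse_isoforms(pairs):
    """Group (query, target) hits by the query gene with its isoform suffix removed."""
    collapsed = defaultdict(set)
    for query, target in pairs:
        gene = re.sub(r"\.\d+$", "", query)  # "A_GENE1.2" -> "A_GENE1"
        collapsed[gene].add(target)
    return dict(collapsed)


# The two isoform-level hits from the example above.
pairs = [("A_GENE1.1", "B_GENE1"), ("A_GENE1.2", "B_GENE2")]
result = collapse_isoforms(pairs)
# result maps "A_GENE1" to the set containing "B_GENE1" and "B_GENE2"
```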
Combining the isoforms is correct because that's what the read mapping is effectively doing if the transcriptome/gtf that you used for generating the expression matrix does not delineate between isoforms.
Closing for now! Please reopen if you're still having trouble.
Hello, thank you very much for your kind help.
I have run into another problem. I can now run SAMAP() smoothly. However, the result does not meet my expectation: I always get 0 gene symbols matching between the datasets and the BLAST graph.
The following is my code
Hope to get your answer. Thank you very much!