Hi Elena,
You need to modify the header in the hits file.
If you run "hitstools inspect hits_file", you'll see that the hits file header contains a block of lines beginning with "@GeneIsoforms", e.g.:
@GeneIsoforms ENSG00000000003 ENST00000494424 ENST00000373020 ENST00000496771
This tells mmseq that the aggregated expression estimate for "ENSG00000000003" should be obtained by summing over ENST00000494424, ENST00000373020 and ENST00000496771.
If you modify the @GeneIsoforms block in the header so that you have additional lines each with an arbitrary ID corresponding to a 5'UTR followed by the IDs of all the transcripts sharing that UTR, you should be in business.
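As a rough sketch of how the extra lines could be generated: the function below assumes a hypothetical two-column, tab-separated file (called utr_map.tsv here, not part of mmseq) listing each transcript ID next to an arbitrary 5' UTR ID of your choosing. The function name make_utr_block is also made up for illustration. It writes space-separated fields as in the example line above; check the output of "hitstools header hits_file" for the exact delimiter your hits file uses.

```shell
# Hypothetical input file: two tab-separated columns,
#   transcript_id <TAB> utr_id
# Prints one "@GeneIsoforms <utr_id> <transcript> ..." line per UTR ID,
# in order of first appearance of each UTR ID.
make_utr_block() {
    awk -F'\t' '
        NF >= 2 {
            # Remember the order in which UTR IDs are first seen.
            if (!($2 in members)) order[++n] = $2
            # Append this transcript to the list for its UTR.
            members[$2] = members[$2] " " $1
        }
        END {
            for (i = 1; i <= n; i++)
                print "@GeneIsoforms " order[i] members[order[i]]
        }
    ' "$1"
}

# Possible usage (assumption: the existing gene-level lines are kept
# and the UTR-level lines are appended to them):
#   { hitstools header hits_file | grep GeneIsoforms
#     make_utr_block utr_map.tsv; } > new_GeneIsoforms_block
```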
Let's assume you have the new @GeneIsoforms block in a file called "new_GeneIsoforms_block". You could run something like this to generate a new hits file:
hitstools header hits_file | grep TranscriptMetaData > new_hits_file_t
cat new_GeneIsoforms_block >> new_hits_file_t
hitstools header hits_file | grep IdenticalTranscripts >> new_hits_file_t
hitstools t hits_file | grep -v @ >> new_hits_file_t
hitstools b new_hits_file_t > new_hits_file
After this you can run mmseq on new_hits_file.
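Before rebuilding the hits file, it may be worth sanity-checking the block file. The sketch below is only a plain-text check, not an mmseq feature: the function name check_block is hypothetical, and it assumes that every line should start with @GeneIsoforms, carry an ID plus at least one transcript, and that aggregate IDs should not repeat.

```shell
# Returns non-zero and reports the offending lines if the block file
# contains malformed lines or repeated aggregate IDs.
# Assumption: fields are whitespace-separated, as in the example header line.
check_block() {
    awk '
        $1 != "@GeneIsoforms" || NF < 3 {
            printf "line %d: malformed\n", NR
            bad = 1
        }
        seen[$2]++ {
            printf "line %d: duplicate ID %s\n", NR, $2
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Possible usage:
#   check_block new_GeneIsoforms_block && echo "block looks consistent"
```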
best wishes, Ernest
On 6 Nov 2021, at 11:56, taigalokhid wrote:
Hi Ernest,
I'm trying to aggregate transcripts by some features with mmseq. How can I aggregate transcripts by a common 5' UTR, following the note "Other aggregations (e.g. of transcripts sharing the same UTRs) are also possible." at https://github.com/eturro/mmseq?
Best regards, Elena
Thanks a lot. I'll try it.
Best regards, Elena