Open ejduncan opened 8 years ago
Sorry for not replying, it seems as though your issue was missed over a year ago! Did you manage to solve your issue?
I didn't manage to solve this unfortunately and I am just about to do another (quite large) set of analyses. Any help or advice would be greatly appreciated! Thanks.
Are you able to provide a few lines of example input, the command you used, and the output, so I can recreate and understand your issue? Thanks
Hi @ejduncan and @Acribbs. I think there were two issues. The length calculation did not take into account only exons, but also any other annotations. This was a bug and is now fixed: 'length' is now counted only on the --exon feature.
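To illustrate why counting every annotation can change which transcript looks longest, here is a minimal Python sketch. It is not the gtf2gtf implementation; the records, transcript IDs, and the `transcript_lengths` helper are hypothetical and chosen only for illustration. Coordinates are assumed to be GTF-style (1-based, inclusive).

```python
from collections import defaultdict

# Hypothetical toy records: (feature, transcript_id, start, end).
# Coordinates are assumed to be 1-based and inclusive, as in GTF.
RECORDS = [
    ("exon", "tx1", 100, 199),
    ("CDS", "tx1", 120, 199),
    ("exon", "tx1", 300, 399),
    ("CDS", "tx1", 300, 350),
    ("exon", "tx2", 100, 349),
]


def transcript_lengths(records, features=None):
    """Sum interval lengths per transcript.

    If `features` is None, every record is counted (the buggy behaviour
    described above); otherwise only records whose feature type is in
    `features` are counted.
    """
    lengths = defaultdict(int)
    for feature, transcript_id, start, end in records:
        if features is None or feature in features:
            lengths[transcript_id] += end - start + 1
    return dict(lengths)


all_features = transcript_lengths(RECORDS)
exon_only = transcript_lengths(RECORDS, features={"exon"})

print("all features:", all_features)
print("exons only:  ", exon_only)
```

In this toy data, tx1 appears longer when CDS records are counted alongside exons, while tx2 is the longer transcript once only exons are summed.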
Also, "longest-transcript" is unfortunately a bit ambiguous. Longest transcript here is the one with the longest "transcript-length", which might not be the one with the longest genomic span. I have added more options to make this clearer: --filter-method can now be longest-transcript-genomic-span, longest-transcript-transcript-length, and longest-transcript-exon-count.
I now get:
transcript-length: ocm-RA, GlyS-RB
genomic-span: ocm-RB, GlyS-RC
Hope this is better.
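The difference between the three criteria can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used by gtf2gtf: the gene, the transcript names (`geneX-RA`, `geneX-RB`), the coordinates, and all helper functions are hypothetical.

```python
# Hypothetical toy gene with two transcripts.
# Each transcript is a list of exons as (start, end), 1-based inclusive.
TRANSCRIPTS = {
    "geneX-RA": [(100, 599), (700, 1199)],
    "geneX-RB": [(100, 199), (5000, 5099), (9000, 9099)],
}


def transcript_length(exons):
    """Total number of exonic bases."""
    return sum(end - start + 1 for start, end in exons)


def genomic_span(exons):
    """Distance from the first exonic base to the last, introns included."""
    return max(end for _, end in exons) - min(start for start, _ in exons) + 1


def exon_count(exons):
    """Number of exons in the transcript."""
    return len(exons)


def longest(transcripts, criterion):
    """Return the transcript ID that maximises the given criterion."""
    return max(transcripts, key=lambda tid: criterion(transcripts[tid]))


by_length = longest(TRANSCRIPTS, transcript_length)
by_span = longest(TRANSCRIPTS, genomic_span)
by_exons = longest(TRANSCRIPTS, exon_count)

print("transcript-length:", by_length)
print("genomic-span:     ", by_span)
print("exon-count:       ", by_exons)
```

Here the transcript with the most exonic bases is not the one covering the widest stretch of the genome, which is the same kind of disagreement as in the ocm and GlyS results above.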
@AndreasHeger Thanks for the explanation.
@AndreasHeger @ejduncan can this issue be closed?
Ok for me, but would be good to know if it now behaves as expected for @ejduncan
Hi, sorry I haven’t had a chance to try it yet – but will do ASAP.
I want to extract the longest transcript for each gene from a GTF file (or GFF3 file). I have installed cgat gtf2gtf and have tried various parameters to do this with Drosophila melanogaster r6.12.gtf. It pulls out a single transcript for each gene, but not necessarily the longest one (e.g. ocm-RB is selected although it is shorter than ocm-RA, and GlyS-RA is selected although it is shorter than GlyS-RB).
I was just wondering if anyone else has had problems like this and could give me some advice on how to solve it?
Thanks in advance! Liz