jpcartailler closed this issue 2 years ago
Hi,
Thank you for your interest in the software. It does appear that line 15318 in the GTF got concatenated with the next line, and was partially truncated. That's why the indexing failed. That does appear to be the only line where this occurred. I'm wondering if there was an issue in the original source file you used to create the GTF. Did you download this from UCSC, or was it custom generated?
Thanks.
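For anyone who wants to check their own GTF for this kind of damage before indexing, here is a minimal sketch in Python. It is not the check TEtranscripts itself performs; the function name `find_malformed_gtf_lines` is made up for illustration. It only assumes the usual GTF layout of 9 tab-separated columns, with integer start and end in columns 4 and 5. A line that was concatenated with its neighbour will normally show up with the wrong number of columns.

```python
import sys


def find_malformed_gtf_lines(lines):
    """Return (line_number, reason) pairs for lines that do not look like GTF records.

    Hypothetical helper for illustration; line numbers are 1-based and refer
    to the file as given (before any sorting).
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            problems.append(
                (number, f"expected 9 tab-separated fields, found {len(fields)}")
            )
            continue
        # Columns 4 and 5 (indices 3 and 4) hold the start and end coordinates.
        if not (fields[3].isdigit() and fields[4].isdigit()):
            problems.append((number, "start/end are not integers"))
            continue
        if not fields[8].strip():
            problems.append((number, "empty attribute column"))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_gtf.py path/to/your_TE.gtf
    with open(sys.argv[1]) as handle:
        for number, reason in find_malformed_gtf_lines(handle):
            print(f"line {number}: {reason}")
```

A clean file prints nothing. Truncation that still leaves 9 well-formed columns would slip past this check, so treat it as a first pass only.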
Thanks for the quick response!
I generated the GTF from the UCSC-generated "rmsk" table - http://genome.ucsc.edu/cgi-bin/hgTables?hgsid=1262077703_u7NggORS5ROmC9L2CmzDdKpl0WJG&clade=mammal&org=Mouse&db=0&hgta_group=varRep&hgta_track=rmsk&hgta_table=rmsk&hgta_regionType=genome&position=&hgta_outputType=primaryTable&hgta_outFileName=mmusculus_mm10_rmsk.gz
I'm not seeing line 14744 concatenated or truncated. Here is what I see on line 14744:
Sorry if I misunderstood. Thx!
Hi,
Sorry, line 14744 is the line number after the GTF is sorted by chromosome, start and end (part of the indexing process). The line in the original GTF is 15318. I have edited my previous response. If you are interested in mm10, we do have a GTF for it already.
Thanks.
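To translate a line number reported after sorting back to a line in the original file, something like the following Python sketch may help. It assumes the sort key is (chromosome, start, end) as described above, that comment and blank lines are not counted, and that ties keep their original order; the exact ordering TEtranscripts uses internally may differ, so treat the result as approximate. The function name `sorted_to_original` is hypothetical.

```python
def sorted_to_original(lines):
    """Map positions after sorting to original 1-based line numbers.

    Returns a list where element i is the original line number of the
    record that ends up at position i + 1 once records are sorted by
    (chromosome, start, end). Hypothetical helper for illustration.
    """
    records = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        # Assumption: blank and comment lines are not counted by the indexer.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        try:
            key = (fields[0], int(fields[3]), int(fields[4]))
        except (IndexError, ValueError):
            # Keep damaged lines in the mapping; place them first on their chromosome.
            key = (fields[0], -1, -1)
        records.append((key, number))
    # sorted() is stable, so records with equal keys keep their original order.
    return [number for _, number in sorted(records, key=lambda item: item[0])]


# Example with three made-up records:
example = [
    'chr2\trmsk\texon\t10\t20\t.\t+\t.\tgene_id "A";\n',
    'chr1\trmsk\texon\t50\t60\t.\t+\t.\tgene_id "B";\n',
    'chr1\trmsk\texon\t5\t9\t.\t-\t.\tgene_id "C";\n',
]
mapping = sorted_to_original(example)
print(mapping)  # [3, 2, 1]
```

With a real file, `mapping[n - 1]` gives the original line for sorted line `n`, under the assumptions listed above.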
Ah, I didn't realize it was re-sorting the GTF, which makes sense now. Thank you for your quick feedback and solutions. I'll definitely check out the pre-built GTF, but I just rebuilt ours to make sure we have a functioning one in case we want to fine-tune what we get out of UCSC's data.
Greetings and thank you for not only releasing this method, but providing help here! While running TEtranscripts, I ran into an error building the TE index.
I am running TEtranscripts 2.2.1 in a Singularity container, imported from a Docker image I found on Docker Hub. Here is the output tail:
The GTF files I used are as follows (I zipped them up and shared them publicly, and I also provide the head of each farther below):
For the error "TE GTF format error! There is no annotation at line 14744.", line 14744 in the TE GTF file looks like the rest of them as far as I can tell. Notes on how I built this file are below. I'm not sure why the error is preceded by an entry from somewhere else in the GTF (MIRb_dup80, on line 15318). Any advice on how to approach this problem would be appreciated. Thanks!
TE GTF generated by:
TE GTF head:
Gene GTF head: