Open santataRU opened 5 days ago
I just found the cause of the problem: I was using a sorted hg38 Ensembl GTF file (prepared for the IGV browser) instead of the original unsorted GTF file. In the sorted GTF file, the entries for a transcript are NOT in one contiguous block, which causes problems when building the index.
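For anyone who wants to check their own file before indexing, here is a minimal Python sketch of that adjacency check. It is not Whippet's code: the function names `find_split_transcripts` and `check_gtf` are made up for illustration, and it assumes that lines without a `transcript_id` attribute (such as gene lines) can simply be skipped, which may differ from what Whippet does internally.

```python
import gzip
import re

# Matches the transcript_id attribute in column 9 of a GTF line.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def find_split_transcripts(lines):
    """Return transcript_ids that reappear after their block has ended."""
    closed = set()    # transcript_ids whose block has already ended
    reported = set()  # transcript_ids already added to the result
    split = []        # result, in order of first violation
    current = None    # transcript_id of the block being read
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment line
        match = TRANSCRIPT_ID.search(line)
        if match is None:
            continue  # assumption: lines without a transcript_id are ignored
        tid = match.group(1)
        if tid == current:
            continue  # still inside the same block
        if current is not None:
            closed.add(current)
        if tid in closed and tid not in reported:
            reported.add(tid)
            split.append(tid)
        current = tid
    return split


def check_gtf(path):
    """Run the check on a plain or gzip-compressed GTF file (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return find_split_transcripts(handle)
```

Calling `check_gtf` on a GTF path returns a list of transcript IDs; an empty list means every transcript's lines form a single block under this check.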
After I ran whippet-index with the original unsorted GTF, I got a warning message: "Using low quality Transcript Support Levels (TSL 3+) in your GTF file is not recommended!" and "If you would like Whippet to ignore these when building its index, use the --suppress-low-tsl option!"
Is it better to use the --suppress-low-tsl option?
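One way to see how much of the annotation the warning concerns is to count transcripts per support level. The Python sketch below is only an illustration: it assumes the GTF carries a `transcript_support_level "..."` attribute on its `transcript` lines, the function name `tally_tsl` is hypothetical, and it says nothing about how Whippet itself treats `NA` or missing values.

```python
import gzip
import re
from collections import Counter

# Assumed attribute name; check that your GTF actually uses it.
TSL = re.compile(r'transcript_support_level "([^"]+)"')


def tally_tsl(lines):
    """Count transcript lines per transcript support level."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only count one line per transcript
        match = TSL.search(fields[8])
        # Keep only the first token, in case the value has trailing notes.
        parts = match.group(1).split() if match else []
        counts[parts[0] if parts else "missing"] += 1
    return counts


def tally_gtf(path):
    """Run the tally on a plain or gzip-compressed GTF file (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return tally_tsl(handle)
```

The counts for levels 3 and above give a rough idea of how many transcripts the warning is about.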
Thanks,
Xiao
PS: warning message with unsorted hg38 GTF file.
(base) xiaolei@Xiaos-Laptop bin % julia ./whippet-index.jl --fasta /Users/xiaolei/Whippet.jl/anno/hg38.fa.gz --gtf /Users/xiaolei/Whippet.jl/anno/gencode.v45.annotation.gtf.gz
Whippet v1.6.2 loading...
Activating environment at ~/Whippet.jl/Project.toml
5.724591 seconds.
Loading GTF file: /Users/xiaolei/Whippet.jl/anno/gencode.v45.annotation.gtf.gz
┌ Warning: Using low quality Transcript Support Levels (TSL 3+) in your GTF file is not recommended!
│ For more information on TSL, see: http://www.ensembl.org/Help/Glossary?id=492
│
│ If you would like Whippet to ignore these when building its index, use --suppress-low-tsl option!
│
└ @ Whippet ~/Whippet.jl/src/refset.jl:159
Loaded 643514 annotated splice-sites from GTF file..
122.786011 seconds (209.44 M allocations: 20.170 GiB, 2.98% gc time)
Indexing transcriptome...
Decompressing and Indexing /Users/xiaolei/Whippet.jl/anno/hg38.fa.gz...
Building Splice Graphs for chr1..
8.388163 seconds (44.79 M allocations: 22.590 GiB, 5.43% gc time)
Dear All,
I encountered an error while running whippet-index.jl. The error appears to be due to my GTF file not being in the 2.2 format. Could you please let me know where to download the correct GTF format (GTF2.2) for hg38? The link provided on this website does not work. Is there an updated link available?
Thank you,
Xiao
PS: error messages:
(base) xiaolei@Xiaos-Laptop bin % julia ./whippet-index.jl --fasta /Users/xiaolei/Whippet.jl/anno/hg38.fa.gz --gtf /Users/xiaolei/Whippet.jl/anno/gencode.v45.annotation_sorted.gtf.gz
Whippet v1.6.2 loading...
Activating environment at ~/Whippet.jl/Project.toml
5.640663 seconds.
Loading GTF file: /Users/xiaolei/Whippet.jl/anno/gencode.v45.annotation_sorted.gtf.gz
┌ Warning: Using low quality Transcript Support Levels (TSL 3+) in your GTF file is not recommended!
│ For more information on TSL, see: http://www.ensembl.org/Help/Glossary?id=492
│
│ If you would like Whippet to ignore these when building its index, use --suppress-low-tsl option!
│
└ @ Whippet ~/Whippet.jl/src/refset.jl:159
ERROR: LoadError: ERROR: GTF file is not in valid GTF2.2 format!
ERROR: Annotation entries for 'transcript_id' ENST00000430923.7 has already been fully processed and closed.
HINT: All GTF lines with the same 'transcript_id' must be adjacent in the GTF file and referring to the same transcript and gene!
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] load_gtf(fh::BufferedStreams.BufferedInputStream{Libz.Source{:inflate, BufferedStreams.BufferedInputStream{IOStream}}}; txbool::Bool, suppress::Bool, usebam::Bool, bamreader::Nullable{XAM.BAM.Reader}, bamreads::Int64, bamoneknown::Bool)
   @ Whippet ~/Whippet.jl/src/refset.jl:165
 [3] macro expansion
   @ ~/Whippet.jl/src/timer.jl:5 [inlined]
 [4] main()
   @ Main ~/Whippet.jl/bin/whippet-index.jl:91
 [5] top-level scope
   @ ~/Whippet.jl/src/timer.jl:5
in expression starting at /Users/xiaolei/Whippet.jl/bin/whippet-index.jl:108