bergmanlab / TELR

TELR is a fast non-reference transposable element detector from long read sequencing data.
https://github.com/bergmanlab/TELR
BSD 2-Clause "Simplified" License

KeyError type error #26

Open unavailable-2374 opened 1 year ago

unavailable-2374 commented 1 year ago

hello

I'm having some problems when running TELR.

The logs are as follows.

"""
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 2023563 reads
[bam_sort_core] merging from 0 files and 128 in-memory blocks...
[W::vcf_parse] FILTER 'STRANDBIAS' is not defined in the header
Warning...unknown stuff <

Warning...unknown stuff <

Traceback (most recent call last):
  File "/public/home/tools/miniconda3/envs/TELR/bin/telr", line 8, in <module>
    sys.exit(main())
  File "/public/home/tools/miniconda3/envs/TELR/lib/python3.6/site-packages/telr/telr.py", line 129, in main
    args.thread,
  File "/public/home/tools/miniconda3/envs/TELR/lib/python3.6/site-packages/telr/TELR_te.py", line 596, in get_af
    vcf_parsed, out, sample_name, bam, raw_reads, telr_reads_dir, read_type="all"
  File "/public/home/tools/miniconda3/envs/TELR/lib/python3.6/site-packages/telr/TELR_assembly.py", line 429, in prep_assembly_inputs
    extract_reads(subset_fa, read_ids, subset_fa_reorder)
  File "/public/home/tools/miniconda3/envs/TELR/lib/python3.6/site-packages/telr/TELR_assembly.py", line 469, in extract_reads
    output_handle.write(record_dict.get_raw(entry))
  File "/public/home/caoshuo/tools/miniconda3/envs/TELR/lib/python3.6/site-packages/Bio/File.py", line 450, in get_raw
    return self._proxy.get_raw(self._offsets[key])
KeyError: 'm64252e_220613_012723/92670297/ccs'
"""

Then I grepped for the read ID in the input FASTQ, and it is present:

"""
[login04 pan_TE]$ grep "92670297" fq/mgx.ccs.fq
@m64252e_220613_012723/92670297/ccs
"""

This is the TELR.log:

"
12/16/2022 11:37:23: INFO: CMD: /public/home/tools/miniconda3/envs/TELR/bin/telr -i bam/mgx.ccs_sort.bam -o MGX -r /public/home/raw_data/genome/PN.fa -l /public/home/project/annotation/TE/vitis.TElib.novel.fa -t 128
12/16/2022 11:37:23: INFO: Parsing input files...
12/16/2022 11:37:23: INFO: BAM file is provided, skip alignment step
12/16/2022 11:37:23: INFO: Converting input BAM file to fasta...
12/16/2022 11:55:58: INFO: Sort and index BAM...
12/16/2022 12:01:23: INFO: Detecting SVs from BAM file...
12/16/2022 12:52:58: INFO: SV detection finished in 51 minutes 34 seconds
12/16/2022 12:52:59: INFO: Parse structural variant VCF...
12/16/2022 13:07:07: INFO: Perform local assembly of non-reference TE loci...
12/16/2022 13:33:16: INFO: Local assembly finished in 23 minutes 28 seconds
12/16/2022 13:33:16: INFO: Annotate contigs...
12/16/2022 15:30:56: INFO: Estimating allele frequency...
"

Please let me know if there is anything else I can offer.

thanks.

shunhuahan commented 1 year ago

Hi @unavailable-2374,

Thanks for reporting this issue. The current information is not sufficient for me to figure out what's going on. Could you send the SV detection results (<sample>.telr.vcf), assembled contigs (<sample>.telr.contig.fasta), and subset_fa file (<sample>.subset.fa) to hanshunhua0829@gmail.com? Thanks!

Shunhua
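The traceback above ends in a `KeyError` raised while looking up a read ID in an indexed sequence file, which suggests the ID is absent from the subset FASTA even though it exists in the original FASTQ. A minimal sketch for checking that, assuming the subset file is plain FASTA and that record IDs are the first whitespace-delimited token after `>`. The function names and the way the ID list is supplied are hypothetical and not part of TELR:

```python
def fasta_ids(fasta_path):
    """Collect record IDs (first token after '>') from a FASTA file."""
    ids = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def missing_ids(requested_ids, fasta_path):
    """Return the requested IDs that have no record in the FASTA file."""
    present = fasta_ids(fasta_path)
    return [read_id for read_id in requested_ids if read_id not in present]
```

Calling `missing_ids(["m64252e_220613_012723/92670297/ccs"], "<sample>.subset.fa")` would return the ID if it is not in the subset file, which would be consistent with the `KeyError`. An empty list would point to a different cause.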

Anees-caas commented 3 months ago

Hi @shunhuahan @unavailable-2374, I have recently been using TELR and it is giving the same error. I installed TELR using mamba.

(TELR) ug1985@gs63:~/TEs/bam telr -i ZGSP-001.sorted.bam -r /home/ug1985/TEs/repeatmasker/genome.fasta -l /home/ug1985/TEs/repeatmasker/consensi.fa -o /home/ug1985/TEs/telrout/ZGSP-001 -t 16
Directory /home/ug1985/TEs/telrout/ZGSP-001 exists
Directory /home/ug1985/TEs/telrout/ZGSP-001/intermediate_files exists
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 723374 reads
[bam_sort_core] merging from 32 files and 16 in-memory blocks...
Estimating parameter...
No MD string detected! Check bam file! Otherwise generate using e.g. samtools.
MD: TESTb0026688-fdc7-4b2f-85f2-4820c7b2bef5
Directory /home/ug1985/TEs/telrout/ZGSP-001/intermediate_files/vcf_ins_repeatmask exists
RepeatMasker version open-4.0.7
Search Engine: NCBI/RMBLAST [ 2.6.0+ ]
Rebuilding RepeatMaskerLib.embl library

Warning...unknown stuff <

Building general libraries in: /home/ug1985/.conda/envs/TELR/share/RepeatMasker/Libraries/dc20170127/general
File /home/ug1985/TEs/telrout/ZGSP-001/intermediate_files/ZGSP-001.sorted.vcf_ins.fasta appears to be empty.
[Errno 2] No such file or directory: '/home/ug1985/TEs/telrout/ZGSP-001/intermediate_files/vcf_ins_repeatmask/ZGSP-001.sorted.vcf_ins.fasta.out.gff'
Repeatmasking VCF insertion sequences failed, exiting...
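The log above reports "No MD string detected!" before the insertion FASTA turns out to be empty, so one thing worth checking is whether the input BAM carries `MD` tags at all. A minimal sketch, assuming the alignments are supplied as SAM text lines (for example, the output of `samtools view`). The function name is hypothetical, and whether missing `MD` tags are the actual cause of the empty file is an assumption, not something confirmed in this thread:

```python
def count_missing_md(sam_lines):
    """Return (mapped_records, records_without_MD) for an iterable of SAM lines."""
    total = 0
    missing = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        if int(fields[1]) & 0x4:
            continue  # unmapped reads are not expected to carry MD
        total += 1
        # Optional tags start at column 12; MD is a string tag (MD:Z:...).
        if not any(tag.startswith("MD:Z:") for tag in fields[11:]):
            missing += 1
    return total, missing
```

Piping the first few thousand records of `samtools view ZGSP-001.sorted.bam` into a script that calls `count_missing_md(sys.stdin)` would show how many mapped records lack the tag. If most do, `samtools calmd` against the same reference can add `MD` tags before rerunning TELR.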