bergmanlab / TELR

TELR is a fast non-reference transposable element detector from long read sequencing data.
https://github.com/bergmanlab/TELR
BSD 2-Clause "Simplified" License

run in ERROR "Repeatmasking VCF insertion sequences failed, exiting..." #5

Closed wangnan9394 closed 2 years ago

wangnan9394 commented 3 years ago

Hi, the test folder runs fine for me. It's great software. But when I ran it on 2.2 Gb of reads against my genome, the process failed; the details are below. Unfortunately, I could not find the cause. Could you give me a hand?

Master RepeatMasker Database: /root/miniconda/envs/TELR_env/share/RepeatMasker/Libraries/RepeatMaskerLib.embl ( Complete Database: dc20170127 )
Custom Repeat Library: Csiv4.chromosome.fa.mod.EDTA.TElib.fa

Warning...unknown stuff <
>

analyzing file /work/test6-24_output/intermediate_files/unishu.merge.vcf_ins.fasta
identifying matches to Csiv4.chromosome.fa.mod.EDTA.TElib.fa sequences in batch 1 of 1

No repetitive sequences were detected in /work/test6-24_output/intermediate_files/unishu.merge.vcf_ins.fasta
[Errno 2] No such file or directory: '/work/test6-24_output/intermediate_files/vcf_ins_repeatmask/unishu.merge.vcf_ins.fasta.out.gff'
Repeatmasking VCF insertion sequences failed, exiting...
(TELR_env) root@6b12b58b46ff:/work#

Best, Nan

wangnan9394 commented 3 years ago

Does the VCF from pbsv (pbmm2 aligner) also work in this pipeline? :)

shunhuahan commented 3 years ago

Hi @wangnan9394,

Shunhua

SergeiF1987 commented 2 years ago

Hi Shunhua,

Thanks a lot for this software. I would be happy to use it. Unfortunately, I get the issue previously reported by wangnan9394, but with the test data.


Master RepeatMasker Database: /mnt/raid/sergey/miniconda/envs/TELR/share/RepeatMasker/Libraries/RepeatMaskerLib.embl ( Complete Database: dc20170127 )
Custom Repeat Library: /mnt/raid/sergey/bio-first/insertion_analysis/test_telr_default/output/intermediate_files/library.fasta

Warning...unknown stuff <

File /mnt/raid/sergey/bio-first/insertion_analysis/test_telr_default/output/intermediate_files/reads.vcf_ins.fasta appears to be empty.
[Errno 2] No such file or directory: '/mnt/raid/sergey/bio-first/insertion_analysis/test_telr_default/output/intermediate_files/vcf_ins_repeatmask/reads.vcf_ins.fasta.out.gff'
Repeatmasking VCF insertion sequences failed, exiting...


Do you know what could be the reason for that?
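For anyone hitting the same message, the two logs in this thread point at different situations: in one, RepeatMasker reports that no repetitive sequences were detected, and in the other the insertion FASTA itself appears to be empty. In both cases the `.out.gff` file is then missing. Below is a minimal Python sketch for telling the two apart by hand. The helper names `count_fasta_records` and `check_repeatmask_step` are hypothetical and are not part of TELR; the paths you pass in are the ones printed in your own log.

```python
import os


def count_fasta_records(path):
    """Count header lines ('>') in a FASTA file; 0 if the file is missing or empty."""
    if not os.path.isfile(path):
        return 0
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def check_repeatmask_step(ins_fasta, out_gff):
    """Return a short diagnosis for the RepeatMasker step (hypothetical helper)."""
    n_records = count_fasta_records(ins_fasta)
    if n_records == 0:
        # Matches the "appears to be empty" log: nothing was extracted from the VCF.
        return "empty: no insertion sequences were extracted from the VCF"
    if not os.path.isfile(out_gff):
        # Matches the "No repetitive sequences were detected" log.
        return "no-gff: {} insertion sequences, but RepeatMasker wrote no GFF".format(n_records)
    return "ok: {} insertion sequences and a GFF file".format(n_records)
```

An "empty" result suggests looking at the alignment or SV calling step, while "no-gff" suggests checking whether the TE library actually matches the inserted sequences. Both readings are guesses from the logs above, not a statement about how TELR handles these cases internally.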

By the way, when I use my own data, this step seems to pass, but I get another error.


analyzing file /mnt/raid/sergey/bio-first/insertion_analysis/test/output/intermediate_files/101N_passed.part-01.te.fa
identifying matches to dvir_full-size_TEs.fasta sequences in batch 1 of 1
processing output: cycle 1 cycle 2 cycle 3 cycle 4 cycle 5 cycle 6 cycle 7 cycle 8 cycle 9 cycle 10
Generating output... masking done
Done

Successfully created the directory /mnt/raid/sergey/bio-first/insertion_analysis/test/output/intermediate_files/telr_reads

Usage: samtools depth [options] in1.bam [in2.bam [...]]
Options:
   -a                  output all positions (including zero depth)
   -a -a (or -aa)      output absolutely all positions, including unused ref. sequences
   -b                  list of positions or regions
   -f                  list of input BAM filenames, one per line [null]
   -l                  read length threshold (ignore reads shorter than ) [0]
   -d/-m               maximum coverage depth [8000]. If 0, depth is set to the maximum integer value, effectively removing any depth limit.
   -q                  base quality threshold [0]
   -Q                  mapping quality threshold [0]
   -r                  region
   --input-fmt-option OPT[=VAL]  Specify a single input file format option in the form of OPTION or OPTION=VALUE
   --reference FILE    Reference sequence FASTA FILE [null]

The output is a simple tab-separated table with three columns: reference name, position, and coverage depth. Note that positions with zero coverage may be omitted by default; see the -a option.

/bin/sh: 1: _137386_137390:5972-6022: not found
/bin/sh: 1: _137386_137390.realign.sort.bam: not found
Traceback (most recent call last):
  File "/mnt/raid/sergey/miniconda/envs/TELR/bin/telr", line 10, in
    sys.exit(main())
  File "/mnt/raid/sergey/miniconda/envs/TELR/lib/python3.6/site-packages/telr/telr.py", line 129, in main
    args.thread,
  File "/mnt/raid/sergey/miniconda/envs/TELR/lib/python3.6/site-packages/telr/TELR_te.py", line 677, in get_af
    bam, contig_name, start, end, te_interval_size, te_offset
  File "/mnt/raid/sergey/miniconda/envs/TELR/lib/python3.6/site-packages/telr/TELR_te.py", line 839, in get_te_cov
    start + te_offset + te_interval_size,
  File "/mnt/raid/sergey/miniconda/envs/TELR/lib/python3.6/site-packages/telr/TELR_te.py", line 867, in get_median_cov
    median_cov = statistics.median(covs)
  File "/mnt/raid/sergey/miniconda/envs/TELR/lib/python3.6/statistics.py", line 380, in median
    raise StatisticsError("no median for empty data")
statistics.StatisticsError: no median for empty data


But I probably need to open another issue for this.
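A possible reading of the log above, offered as a guess rather than a diagnosis: the two `/bin/sh: ... not found` lines suggest that part of the contig name was interpreted by the shell (for example, if the name contains a character such as `|`), so `samtools depth` received no valid arguments, printed its usage text, and returned no coverage values; `statistics.median` then failed on the empty list. The Python sketch below illustrates the general idea of avoiding both problems by passing arguments as a list (no shell) and guarding the empty case. All function names here are hypothetical and this is not TELR's actual code; only `samtools depth` with the `-a` and `-r` options shown in the usage text above is assumed.

```python
import statistics
import subprocess


def build_depth_cmd(bam, contig, start, end):
    """Build a samtools depth command as an argument list, so no shell parses it."""
    region = "{}:{}-{}".format(contig, start, end)
    return ["samtools", "depth", "-a", "-r", region, bam]


def parse_depths(depth_output):
    """Extract the third (coverage) column from samtools depth's tab-separated output."""
    depths = []
    for line in depth_output.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3:
            depths.append(int(fields[2]))
    return depths


def median_cov(depths):
    """Return the median depth, or None when there is no coverage data."""
    if not depths:
        return None
    return statistics.median(depths)


def region_median_cov(bam, contig, start, end):
    """Run samtools depth on one region and return its median coverage (or None)."""
    result = subprocess.run(
        build_depth_cmd(bam, contig, start, end),
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.6 as in the traceback
        check=True,
    )
    return median_cov(parse_depths(result.stdout))
```

With an argument list, a contig name such as `a|b_1_2` reaches `samtools` unchanged, and a `None` result can be reported per insertion instead of aborting the whole run.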

Thanks in advance for your reply. Best, Sergei

shunhuahan commented 2 years ago
SergeiF1987 commented 2 years ago

Thanks for your reply! It seems that installing TELR via conda does not give the latest version of the program. I reinstalled it with git clone and then switched to a specific commit (git checkout 47a0e23f8718df918e6f073c25130c2bdd1bd15f). The test run completed successfully, but the second issue with my own data unfortunately was not solved. I will open a new issue for that. Thanks!

shunhuahan commented 2 years ago