bergmanlab / TELR

TELR is a fast non-reference transposable element detector from long read sequencing data.
https://github.com/bergmanlab/TELR
BSD 2-Clause "Simplified" License
31 stars, 11 forks

TELR only finds TE insertions on first chromosome #38

Open lilypeck opened 1 month ago

Hello

Thanks very much for this great tool.

I am running TELR on a tree genome that is ~830 Mb with 12 chromosomes.

It finishes successfully; however, the output files contain TE insertions only on the first chromosome. This is the case for multiple sets of reads. I have separately assembled the reads, and the resulting genome assemblies are complete, with all chromosomes present.

Do you know what could be causing this bias toward the first chromosome?

Thank you in advance!

Lily

The .log is below

more barcode03/TELR.log 
05/07/2024 04:47:43: INFO: CMD: /u/home/l/ldpeck/.conda/envs/TELR/bin/telr -i /u/project/vlsork/ldpeck/longreads/fastq/barcode03_ALLpass.fastq -r /u/home/l/ldpeck/genome_resources/GCF_001633185.2_ValleyOak3.2_genomic.fna -l /u/home/l/ldpeck/genome_resources/Qlobata.v3.0.RepeatModeler-open-1.0.8.consensi.fa.classified -x ont -t 12 -o barcode03
05/07/2024 04:47:43: INFO: Parsing input files...
05/07/2024 04:47:43: INFO: Raw reads are provided
05/07/2024 04:47:43: INFO: Start alignment...
05/07/2024 21:18:13: INFO: Sort and index BAM...
05/07/2024 22:00:28: INFO: First alignment finished in 17 hours 12 minutes 44 seconds
05/07/2024 22:00:28: INFO: Detecting SVs from BAM file...
05/07/2024 23:14:22: INFO: SV detection finished in 1 hours 13 minutes 54 seconds
05/07/2024 23:14:22: INFO: Parse structural variant VCF...
05/07/2024 23:27:14: INFO: Perform local assembly of non-reference TE loci...
05/07/2024 23:29:43: INFO: Local assembly finished in 2 minutes 27 seconds
05/07/2024 23:29:43: INFO: Annotate contigs...
05/07/2024 23:30:10: INFO: Estimating allele frequency...
05/07/2024 23:41:35: INFO: Perform local realignment...
05/07/2024 23:41:57: INFO: Local realignment finished in 22 seconds
05/07/2024 23:42:23: INFO: Allele frequency estimation finished in 48 seconds
05/08/2024 02:06:20: INFO: Map contigs to reference...
05/08/2024 02:11:03: INFO: Write output...
05/08/2024 02:11:21: INFO: TELR finished in 21 hours 23 minutes 38 seconds

There are 67 TE insertions in the .bed file, all on the first chromosome:

barcode03/barcode03_ALLpass.telr.bed 
NC_044904.1 1032836 1032841 rnd-5_family-4302   .   -
NC_044904.1 1053999 1054003 rnd-1_family-226    .   -
NC_044904.1 1099869 1099869 rnd-1_family-135    .   -
NC_044904.1 1115738 1115742 rnd-1_family-28|rnd-1_family-414|rnd-1_family-44|rnd-1_family-936|rnd-5_family-3646 .   -
NC_044904.1 1237003 1237006 rnd-1_family-27|rnd-1_family-275|rnd-1_family-29|rnd-3_family-1127|rnd-3_family-172 .   -
NC_044904.1 1317284 1317284 rnd-1_family-12|rnd-1_family-4  .   -
NC_044904.1 1344949 1344950 rnd-1_family-268    .   +
NC_044904.1 1453483 1453485 rnd-1_family-239|rnd-1_family-452|rnd-5_family-394  .   -
NC_044904.1 1477501 1477504 rnd-1_family-429|rnd-5_family-2250  .   -
NC_044904.1 1560954 1560969 rnd-1_family-358    .   +
NC_044904.1 1564438 1564447 rnd-5_family-1540   .   -
NC_044904.1 1770139 1770141 rnd-1_family-43 .   -
NC_044904.1 1784603 1784608 rnd-4_family-269    .   -
NC_044904.1 1843359 1843359 rnd-1_family-22|rnd-1_family-24 .   +
NC_044904.1 1856666 1856666 rnd-1_family-66|rnd-5_family-3370   .   -
NC_044904.1 1858066 1858066 rnd-1_family-1  .   -
NC_044904.1 1878004 1878007 rnd-4_family-366|rnd-6_family-4077  .   -
NC_044904.1 1900780 1900782 rnd-1_family-11|rnd-1_family-254|rnd-1_family-3|rnd-1_family-7  .   -
NC_044904.1 1917028 1917038 rnd-1_family-90 .   -
NC_044904.1 1961400 1961400 rnd-1_family-38 .   -
NC_044904.1 2058100 2058100 rnd-5_family-2523   .   -
NC_044904.1 2096028 2096031 rnd-1_family-19|rnd-1_family-20|rnd-1_family-58|rnd-3_family-369    .   -
NC_044904.1 2125712 2125750 rnd-3_family-626|rnd-5_family-3531  .   -
NC_044904.1 2152823 2152823 rnd-1_family-364    .   -
NC_044904.1 2332827 2332835 rnd-1_family-365|rnd-1_family-588   .   -
NC_044904.1 2340392 2340392 rnd-1_family-96 .   +
NC_044904.1 2409243 2409243 rnd-1_family-239    .   -
NC_044904.1 2445982 2445984 rnd-1_family-45 .   -
NC_044904.1 2451032 2451039 rnd-1_family-1  .   -
NC_044904.1 2463943 2463949 rnd-1_family-219    .   -
NC_044904.1 2566660 2566665 rnd-1_family-259|rnd-1_family-715|rnd-1_family-84|rnd-6_family-1302 .   -
NC_044904.1 2622522 2622531 rnd-1_family-43 .   -
NC_044904.1 2639432 2639440 rnd-1_family-119    .   -
NC_044904.1 2776399 2776498 rnd-5_family-2229   .   -
NC_044904.1 2778531 2778542 rnd-1_family-117    .   -
NC_044904.1 2831119 2831169 rnd-1_family-34 .   -
NC_044904.1 2899997 2900109 rnd-1_family-268|rnd-1_family-43|rnd-4_family-366   .   -
NC_044904.1 2950908 2950966 rnd-1_family-139    .   -
NC_044904.1 2982105 2982124 rnd-1_family-288    .   -
NC_044904.1 3030509 3030509 rnd-1_family-60 .   -
NC_044904.1 3162939 3162942 rnd-1_family-35 .   -
NC_044904.1 3183111 3183126 rnd-1_family-147    .   -
NC_044904.1 3221638 3221638 rnd-1_family-870|rnd-1_family-941|rnd-5_family-2523 .   +
NC_044904.1 3308754 3308754 rnd-1_family-43 .   -
NC_044904.1 3310630 3310630 rnd-1_family-515|rnd-5_family-1170  .   -
NC_044904.1 3447794 3447794 rnd-1_family-58 .   -
NC_044904.1 3448551 3448559 rnd-1_family-365|rnd-1_family-588   .   +
NC_044904.1 346059  346064  rnd-4_family-1862   .   +
NC_044904.1 3547476 3547480 rnd-1_family-239|rnd-1_family-452|rnd-5_family-394  .   -
NC_044904.1 3611549 3611559 rnd-1_family-358    .   +
NC_044904.1 3635277 3635282 rnd-1_family-226    .   -
NC_044904.1 3645400 3645400 rnd-1_family-37 .   -
NC_044904.1 3723532 3723541 rnd-1_family-35 .   -
NC_044904.1 3766267 3766267 rnd-1_family-246    .   -
NC_044904.1 3770943 3770950 rnd-5_family-6424   .   +
NC_044904.1 3878453 3878466 rnd-1_family-147    .   -
NC_044904.1 3886041 3886041 rnd-1_family-1|rnd-1_family-11|rnd-1_family-3   .   -
NC_044904.1 3963150 3963153 rnd-4_family-840    .   -
NC_044904.1 3973117 3973117 rnd-5_family-1212   .   -
NC_044904.1 427826  427831  rnd-1_family-718|rnd-1_family-883|rnd-5_family-1227|rnd-5_family-2674   .   -
NC_044904.1 430013  430013  rnd-1_family-116    .   -
NC_044904.1 431298  431298  rnd-1_family-365|rnd-1_family-588   .   -
NC_044904.1 445112  445121  rnd-1_family-214|rnd-3_family-730   .   -
NC_044904.1 450130  450137  rnd-1_family-45 .   -
NC_044904.1 689931  689934  rnd-1_family-61 .   -
NC_044904.1 782394  782395  rnd-4_family-840    .   -
NC_044904.1 885492  885497  rnd-1_family-1|rnd-1_family-11  .   -
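
To make the distribution easier to see at a glance, here is a minimal Python sketch that tallies insertions per sequence and reports the coordinate range covered by each. It is not part of TELR; the function name `summarize_bed` is hypothetical, and it assumes only that the first three whitespace-separated fields of each line are the sequence name, start, and end, as in the listing above.

```python
def summarize_bed(lines):
    """Return {chrom: (count, min_start, max_end)} for BED-like lines.

    Assumes whitespace-separated fields with chrom, start, end first.
    """
    stats = {}
    for line in lines:
        fields = line.split()
        # Skip blank, truncated, comment, and header lines.
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        count, lo, hi = stats.get(chrom, (0, start, end))
        stats[chrom] = (count + 1, min(lo, start), max(hi, end))
    return stats


# Three records copied from the listing above, used as a small example.
example = [
    "NC_044904.1 1032836 1032841 rnd-5_family-4302   .   -",
    "NC_044904.1 346059  346064  rnd-4_family-1862   .   +",
    "NC_044904.1 3973117 3973117 rnd-5_family-1212   .   -",
]

for chrom, (count, lo, hi) in summarize_bed(example).items():
    print(f"{chrom}\t{count}\t{lo}\t{hi}")
```

To run it on the real output, pass an open file handle instead of the example list, for instance `summarize_bed(open("barcode03/barcode03_ALLpass.telr.bed"))`. In the listing above, all 67 records fall between positions 346059 and 3973117 of NC_044904.1, so the calls cover only the start of that sequence rather than its full length.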