0xTCG / biser

A fast tool for detecting and decomposing segmental duplications in genome assemblies
MIT License

biser long run time #20

Closed: ASBioinfo closed this issue 1 year ago

ASBioinfo commented 2 years ago

Hi, thanks for your tool. I experienced a long runtime (10 days) on a system with 2 TB RAM and 96 cores when running biser on the cattle genome (~3 GB). I performed the following steps to hard-mask the genome:

  1. RepeatMasker was run on the raw FASTA file with soft masking.
  2. TRF (Tandem Repeats Finder) was also run on the raw FASTA file, and the output coordinates were saved in BED format.
  3. bedtools maskfasta was run with the RepeatMasker output file and the TRF BED file as input for the final hard-masking of the genome.
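
For readers unfamiliar with the masking steps above, here is a minimal Python sketch of their combined effect: soft-masked (lowercase) bases and the intervals listed in a BED file are replaced with N. It only illustrates the idea and is not what RepeatMasker, TRF, or bedtools do internally. The function name `hard_mask` and the toy sequence and interval are hypothetical, made up for this example.

```python
from collections import defaultdict


def hard_mask(records, bed_intervals):
    """Return {name: sequence} with lowercase bases and BED intervals set to N.

    records: dict of sequence name -> sequence string (lowercase = soft-masked)
    bed_intervals: iterable of (name, start, end), 0-based half-open as in BED
    """
    # Group the intervals by sequence name for quick lookup.
    by_name = defaultdict(list)
    for name, start, end in bed_intervals:
        by_name[name].append((start, end))

    masked = {}
    for name, seq in records.items():
        # Turn every soft-masked (lowercase) base into N.
        bases = ["N" if b.islower() else b for b in seq]
        # Overwrite each BED interval with N, clipped to the sequence length.
        for start, end in by_name.get(name, []):
            for i in range(max(start, 0), min(end, len(bases))):
                bases[i] = "N"
        masked[name] = "".join(bases)
    return masked


# Toy input (made-up data, not taken from the cattle genome).
toy = {"chr1": "ACGTacgtACGTACGT"}
trf_bed = [("chr1", 10, 13)]
result = hard_mask(toy, trf_bed)
print(result["chr1"])  # ACGTNNNNACNNNCGT
```

On a real assembly, the tools named in the steps above would do this work; the sketch is only meant to make clear what the hard-masked input to biser looks like.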

The hard-masked FASTA file was then analyzed with biser using the command below:

biser -T TEMP -t 80 -o output.txt --keep-contigs --no-decomposition --max-error=20 --max-edit-error=10 ./ref.masked.fasta

Please suggest whether any changes are required.

Thank you.

inumanag commented 1 year ago

I am aware of this issue; #19 will have a fix soon.

inumanag commented 1 year ago

Please try v1.4; the issue should be fixed now.