Jianwei-Zhang / DEGAP

Dynamic Elongation of a Genome Assembly Path

Inquiry about DEGAP GapFiller Usage and Parameters #2

Closed ChuanzhengWei closed 2 weeks ago

ChuanzhengWei commented 3 weeks ago

Hi,

I am currently utilizing the gapfiller feature of DEGAP to address gaps in sequence data and have several questions regarding its operation:

  1. What is the recommended length for sequences in the --seqleft and --seqright input files?
  2. In cases where a contig lacks a 5’ telomere, is it feasible to alter the left file to contain a repeat of CCCTAAA, use the contig for the right file, and set the flag to "right" to complete telomere assembly?
  3. I am also concerned about the duration of software operations as one of my tasks has been running for seven days without completion.

Could you please provide guidance on these issues? Your expertise and advice will be greatly appreciated. Thank you for your time and assistance.

Sincerely

Jianwei-Zhang commented 3 weeks ago

(1) For telomere elongation, you can use either the repeat sequence CCCTAAA as the left file or an empty file. In GapFiller mode, DEGAP typically halts the process at telomere regions due to unreliable alignment results or the inability to find appropriate elongation sequences.
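As an illustration of the repeat-sequence option, here is a minimal Python sketch that builds FASTA text containing tandem copies of CCCTAAA. The function name, record name, copy count, and line width are all assumptions made for the example; this thread does not state how many copies DEGAP needs.

```python
def telomere_repeat_fasta(unit="CCCTAAA", copies=100, name="telomere_repeat", width=70):
    """Return FASTA text holding `copies` tandem copies of `unit`.

    All defaults are illustrative assumptions, not DEGAP requirements.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    seq = unit * copies
    # Wrap the sequence to a fixed line width, as is conventional for FASTA.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Build a short example record and show it.
fasta_text = telomere_repeat_fasta(copies=20)
print(fasta_text, end="")
```

The returned text can be written to a file of your choosing and passed as the --seqleft input; an empty file is the alternative mentioned above.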

(2) If DEGAP takes an excessive amount of time to run, you can check the output sequence of the last round in the process folder. This will help determine whether the delay is caused by extension errors due to highly similar regions in the genome or by inaccurate stop sequences that fail to terminate correctly. Additionally, you can control the extension length by setting the --MaximumExtensionLength parameter.
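One way to inspect a round's output is to look at how long the extended sequence has become. The sketch below is a generic FASTA length counter in Python, not part of DEGAP; the function name is hypothetical, and the layout and file names of the process folder are not described in this thread, so locate the relevant FASTA file yourself.

```python
def fasta_lengths(lines):
    """Return {record_name: sequence_length} for FASTA-formatted lines.

    `lines` can be any iterable of strings, such as an open file handle.
    """
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# In-memory example; for a real file use: with open(path) as fh: fasta_lengths(fh)
example = [">round_output example", "ACGTACGT", "ACG"]
print(fasta_lengths(example))
```

Comparing these lengths across successive rounds may show whether the sequence keeps growing well beyond the expected gap size, which would be consistent with the extension errors or unmatched stop sequences described above.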

ChuanzhengWei commented 3 weeks ago

Thanks for your reply. I will try again according to your suggestion.

Best regards