Open JohnUrban opened 3 weeks ago
I can say that, since this feature request, I made the adjustment that I suggested, and in some cases allowing 25 kb, 50 kb, and/or 75 kb overlaps led to higher contiguity for ONT-UL assemblies, and 15-20 kb overlaps did the same for HiFi. (I cannot tell you whether the extra contiguity was accurate, though.)
In principle, it should be possible to increase, but this will require extensive testing. Is there evidence that you are getting better assemblies with increased minimum overlap?
When the coverage is high enough and the reads are long enough, I did see contiguity increase.
As for other metrics, if you don't mind waiting, I will report back anything I learn about them in the coming month or two.
As you know better than anyone, Flye sets an overlap length (capped at 10 kb) based on read N50, seemingly without considering the amount of coverage. So it sets the same overlap for 30X coverage as for 300X coverage.
I have 120X ultra-long nanopore and 600X HiFi, so I wanted to test the longer overlap cutoffs since I technically have far more coverage than needed for a great Flye assembly.
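To make the N50 vs. coverage point concrete, here is a minimal sketch in Python. It is not Flye's actual selection logic: `read_n50` is the standard N50 definition, while `capped_overlap` and its `min(n50, cap)` rule are hypothetical placeholders I made up, with only the 10 kb ceiling taken from the discussion above.

```python
def read_n50(lengths):
    """Return the read N50: the length L such that reads of length >= L
    contain at least half of all sequenced bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no reads given")


def capped_overlap(n50, cap=10_000):
    """Hypothetical rule (NOT Flye's): use the read N50 as the overlap,
    but never exceed the cap. Coverage is not an input at all."""
    return min(n50, cap)


def mean_coverage(lengths, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return sum(lengths) / genome_size
```

Under a rule of this shape, two read sets with the same N50 get the same overlap whether `mean_coverage` comes out at 30X or 300X, which is the behavior I am asking about.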
The title says it all.
I have high-accuracy nanopore data with a read N50 of >100 kb (representing >30-50X coverage), and I would like to try minimum overlaps of 25 kb and 50 kb, but I get this error:
It looks like this could be as simple as changing the min and max values in the argument parser around line 624 in flye/main.py, but I don't know how that will affect anything downstream that may assume a max of 10 kb.
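For anyone following along, this is roughly the kind of change I mean, written as a standalone sketch with Python's `argparse`. It is not copied from flye/main.py: the helper names `bounded_int` and `build_parser`, the program name, and the 1 kb lower bound are my own assumptions. Only the `--min-overlap` option and the 10 kb ceiling come from this issue.

```python
import argparse


def bounded_int(low, high):
    """Build an argparse type that accepts integers in [low, high]."""
    def parse(text):
        value = int(text)  # argparse reports a ValueError as a usage error
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                f"must be between {low} and {high}, got {value}")
        return value
    return parse


def build_parser(max_overlap=10_000):
    """Hypothetical parser; the 1 kb lower bound is an assumed value."""
    parser = argparse.ArgumentParser(prog="overlap-demo")
    parser.add_argument("--min-overlap",
                        type=bounded_int(1_000, max_overlap),
                        default=None,
                        help="minimum overlap between reads, in bases")
    return parser


if __name__ == "__main__":
    # Raising the ceiling is a one-argument change in this sketch.
    args = build_parser(max_overlap=100_000).parse_args(
        ["--min-overlap", "50000"])
    print(args.min_overlap)
```

With the default ceiling, `--min-overlap 50000` is rejected at parse time. Raising the ceiling lets the value through, but says nothing about whether downstream steps tolerate it.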
If there is a reason longer overlaps are not allowed, please let me know.
Many thanks.
(P.S. I will try messing around with the arg parser in the meantime.)