hzi-bifo / Haploflow

The program keeps outputting logs, but contigs.fa and Cov.tsv aren't updated anymore #24

Closed by llechoecho 8 months ago

llechoecho commented 8 months ago

Hi, I tested Haploflow on a 2.6 GB FASTQ file with the following command line:

haploflow --read-file sample.fastq --k 139 --out out --log log

I don't know why it keeps outputting a log that looks like this:

Calculating paths
Paths calculated
Graph undercutting threshold of 150 characters (0)
Graph 97055: 0 vertices remaining
Calculating paths
Paths calculated
Graph undercutting threshold of 150 characters (0)
Graph 97056: 0 vertices remaining
Calculating paths

In the output directory, contigs.fa and Cov.tsv have content, but their sizes no longer change. Is something wrong, and how can I fix it? Thanks.

AlphaSquad commented 8 months ago

Hi, Haploflow calculates contigs for each connected component of the assembly graph. Particularly with a high value of k (like 139), many of these graphs are searched for paths but then discarded by error correction, because the graph did not produce a sequence of at least 150 bp. It is unlikely that there is another "big" graph behind these many empty ones, and Haploflow will typically finish soon once it has reached this step.
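To check that a run in this phase is still advancing, one option is to summarise the log rather than watch the output files. The following is a minimal Python sketch, not part of Haploflow. It assumes the log lines look exactly like the excerpt quoted above ("Graph 97055: 0 vertices remaining"). The function name summarize_log is made up for illustration, and treating a report of 0 vertices as an "empty" graph is an interpretation of that excerpt.

```python
import re
from collections import Counter

# Matches lines such as "Graph 97055: 0 vertices remaining".
# The format is assumed from the log excerpt in this thread.
# "Graph undercutting threshold ..." lines do not match, because
# the pattern requires a number directly after "Graph ".
GRAPH_LINE = re.compile(r"Graph (\d+): (\d+) vertices remaining")


def summarize_log(text):
    """Return (highest graph index seen, Counter of empty/non_empty reports).

    Hypothetical helper: it only counts log lines and does not
    inspect Haploflow's internal state.
    """
    last_index = None
    counts = Counter()
    for match in GRAPH_LINE.finditer(text):
        index = int(match.group(1))
        vertices = int(match.group(2))
        # Track the largest graph index reported so far.
        if last_index is None or index > last_index:
            last_index = index
        # A report with 0 vertices remaining is counted as "empty".
        counts["empty" if vertices == 0 else "non_empty"] += 1
    return last_index, counts
```

With the command above, the log is written to the file given by --log (here, log), so the text can be read with open("log").read() and passed to summarize_log. If the highest graph index is larger on a second call a few minutes later, the program is still working through the remaining components even though contigs.fa and Cov.tsv have stopped growing.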

llechoecho commented 8 months ago

Got it. Thanks a lot.