lh3 / miniasm

Ultrafast de novo assembly for long noisy reads (though having no consensus step)
MIT License

miniasm prints no gfa if PAF file contains self-mapped read #78

Open vellamike opened 4 years ago

vellamike commented 4 years ago

If the PAF file contains self-mapped reads like this:

6cd9b4ce-556b-46f1-b3b4-bec47970595b    5929    0   5914    +   6cd9b4ce-556b-46f1-b3b4-bec47970595b    5929    0   5914    1974    0   255

Then miniasm does not print a GFA file. This is easy to work around by leaving these reads out of the PAF, but the behaviour is undocumented.
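For anyone who wants to try that workaround, here is a minimal sketch of dropping self-mapped records before running miniasm. It assumes an uncompressed, tab-separated PAF in which column 1 is the query name and column 6 is the target name. The function names `drop_self_overlaps` and `filter_paf` and the file names are made up for illustration and are not part of miniasm or minimap2.

```python
def drop_self_overlaps(lines):
    """Yield PAF lines, skipping records where the query and target read are the same."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Column 1 (index 0) is the query name, column 6 (index 5) the target name.
        # Lines with fewer than 6 fields are passed through untouched.
        if len(fields) >= 6 and fields[0] == fields[5]:
            continue
        yield line


def filter_paf(src_path, dst_path):
    """Copy src_path to dst_path without self-mapped records (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(drop_self_overlaps(src))
```

Calling `filter_paf("overlap.paf", "overlap.noself.paf")` and then passing the filtered file to miniasm would be one way to test whether the self-mapped records are what suppresses the GFA output.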

sjfleck commented 4 years ago

Hi, I'm having an issue similar to the one you're describing. I'm doing a de novo assembly for my species and I'm not exceeding my memory capacity or time limits, but I'm getting a .paf of 0 bytes. The weird thing is that I've successfully run this exact assembly before. This is what I'm doing:

minimap2 -x ava-ont -r 10000 -t 16 reads.fastq.gz reads.fastq.gz > overlap.paf
miniasm reads.fastq.gz overlap.paf.gz > reads.gfa

The one thing I'm noticing is that my .paf files are different from before, even though I don't know what I'm doing differently. About 6 months ago, my .paf file was only 1,631.2 GB, but when I run the same code with the same reads now, I get a .paf file that is 3,498.7 GB. I think my issue stems from minimap2 and not miniasm. You said this was an easy fix. Do you have a suggestion? Thank you for your time. -Steve

awkh88 commented 4 years ago

Hello! I have a similar problem to the above. minimap2 seems to produce a solid .paf, but then miniasm does not print anything to the .gfa. miniasm only loads 1 hit and 1 sequence in the first step.

Does anybody know what is going on? :) Do we just have to remove the self-mapped reads?
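One way to check whether self-mapped reads are involved is to count them in the PAF before running miniasm. The sketch below assumes a tab-separated PAF with the query name in column 1 and the target name in column 6. `summarize_paf` is a hypothetical helper, not a miniasm or minimap2 function, and it does not explain by itself why miniasm loads so few hits.

```python
def summarize_paf(lines):
    """Count PAF records, self-mapped records, and distinct read names."""
    records = 0
    self_hits = 0
    reads = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            # Not enough columns to hold both read names; ignore the line.
            continue
        records += 1
        query, target = fields[0], fields[5]
        reads.update((query, target))
        if query == target:
            self_hits += 1
    return {"records": records, "self_hits": self_hits, "reads": len(reads)}
```

For a plain-text file this could be called as `summarize_paf(open("overlap.paf"))`. A very low record or read count points at the overlap step rather than at miniasm.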

zhaoxvwahaha commented 1 year ago

@awkh88 Hi, did you solve this problem? I also got an empty GFA file.