ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0

--kbm-parts might cause incomplete slicing on kbm->reads #106

Closed: eee4017 closed this issue 5 years ago

eee4017 commented 5 years ago

Dear @ruanjue,

I found something strange in proc_alignments_core() and am not sure whether it is a bug. At wtdbg.h:1385, inside proc_alignments_core(), the slice size ic is the ceiling of size/num_index, where num_index is passed down from --kbm-parts:

ic = (g->kbm->bins->size + g->num_index - 1) / g->num_index;
ie = 0;
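To make the arithmetic concrete, here is a minimal sketch of the same ceiling division. The helper name ceil_div and the numbers used below are illustrative assumptions, not taken from wtdbg2.

```c
/* Ceiling division in the same form as the expression for ic above.
 * ceil_div is a hypothetical helper name, not part of wtdbg2. */
static unsigned long ceil_div(unsigned long size, unsigned long parts){
    return (size + parts - 1) / parts;
}

/* Number of bins by which parts slices of ceil_div(size, parts) bins
 * each would run past size; 0 when size divides evenly. */
static unsigned long overshoot(unsigned long size, unsigned long parts){
    return ceil_div(size, parts) * parts - size;
}
```

With size = 10 and 4 parts, ceil_div gives 3, so four slices cover 12 bins and overshoot returns 2. With size = 12 the division is exact and the overshoot is 0.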

At wtdbg.h:1453 there is a for loop that iterates over the slices. Because ic is the ceiling of size/num_index, ie is at least ic*num_index in the last slice, which can exceed the original size. kbm->bins->buffer is large enough that the out-of-range index does not cause a segmentation fault, but g->kbm->bins->buffer[ie - 1].ridx then evaluates to 0, so we get qb=qe=0 in the last slice. As a result, thread_mdbg_func does not run for that slice, because it is skipped when qb=qe=0.

    in = g->corr_mode? 1 : g->num_index;
        ...
    for(ii=0;ii<in;ii++){
        ib = ie;
        ie = ib + ic;
        while(ie > 0 && ie < g->kbm->bins->size && g->kbm->bins->buffer[ie - 1].ridx == g->kbm->bins->buffer[ie].ridx) ie ++;
        if(g->corr_mode == 0){
            qb = 0;
            qe = ie? g->kbm->bins->buffer[ie - 1].ridx : 0;
        }
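One way to keep ie in range is to clamp it to the number of bins before it is used as an index. The sketch below is only an illustration under stated assumptions: bin_t and slice_end are hypothetical names, and this is not necessarily what the linked fix commit does.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kbm bin; only the read index matters here. */
typedef struct { unsigned ridx; } bin_t;

/* Return the exclusive end of a slice that starts at ib and nominally
 * spans ic bins. The end is first clamped to size, then pushed forward
 * so that bins sharing one ridx are not split across two slices. */
static size_t slice_end(const bin_t *bins, size_t size, size_t ib, size_t ic){
    size_t ie = ib + ic;
    if(ie > size) ie = size; /* the bound check the quoted loop lacks */
    while(ie > 0 && ie < size && bins[ie - 1].ridx == bins[ie].ridx) ie++;
    return ie;
}
```

For 10 bins with ridx values 0,0,0,1,1,2,2,2,3,3 and ic = 3, successive calls end at 3, 8, 10 and 10. The fourth slice is empty (ib equals ie), so a caller would need to skip it explicitly, but buffer[ie - 1] always stays inside the array.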

Please let me know if I have misunderstood this part.

Thanks and best regards, Frank Lin

ruanjue commented 5 years ago

Thanks, it is a bug. See the fix: https://github.com/ruanjue/wtdbg2/commit/08c2e7b34fc252bcfc2c127aae14ef4abc0422e7