Closed kbseah closed 7 months ago
Hi, Thanks for reporting the bug. It seems that the output from -c 1 is the one that is wrong. I'm checking this and hopefully tomorrow it will be solved.
A.
What I found is that in the file running_dnoise.py, when de.cores == 1, each line needed to be transposed in the for loop where I create a dataframe with all the mothers into which the daughters are merged. Because it was not, these sequences were lost. If you check your outputs, you will see that the differences are in the first lines: in output1, the first three sequences (those identified as mothers) are missing.
I have changed this for both running with and without entropy. If the problem persists, please report it.
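To illustrate the kind of failure described above, here is a minimal pandas sketch. It is not the code from running_dnoise.py; the frame `mothers`, the record `record`, and the column names are hypothetical. It only shows one way a row that is not transposed can end up missing from the columns it belongs to.

```python
import pandas as pd

# Hypothetical stand-in for a table of "mother" sequences (not DnoisE's real data model).
mothers = pd.DataFrame({"id": ["seq_a"], "count": [10]})

# A single record held as a Series: its index holds the field names.
record = pd.Series({"id": "seq_b", "count": 7})

# Without transposing, concat treats the Series as a column, so its values
# land in a separate column and the "id"/"count" cells of the new rows are NaN.
broken = pd.concat([mothers, record])

# Transposing first turns the Series into a one-row frame whose columns match.
fixed = pd.concat([mothers, record.to_frame().T], ignore_index=True)

print(broken)
print(fixed)
```

In `broken`, the record's values do not appear under `id` or `count`, so any later step that reads those columns would silently skip it.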
expect the commit tomorrow
Thank you so much, Adrià
thanks Adria for looking into this! looking forward to the fix. could you please also make a new release on Github so that this can be propagated to Bioconda, too?
New release Done, thanks again for reporting the bug
The denoised sequences output by DnoisE are different when using one core (option `-c 1`) vs. using more than one core. The output appears to be identical for the values of `-c` other than one that I have tested.
The number of cores used should not affect the program output, so this is a bug that affects the program's correctness.
The bug appears to have been introduced somewhere between tags `v.1.2.0` and `v.1.3.0`:
- `dnoise` installed with `pip install .`, running `pandas=2.0.0`: output with `-c 1` different from others
- `python setup.py install`, running `pandas=2.0.0`: output with `-c 1` different from others
- `v.1.2.0`, installed with `python3 setup.py install`, running `pandas=1.5.3`: no difference in output with different values of `-c`
However, the denoised sequences output by DnoisE with >=2 cores do not appear to differ between versions: except for the sort order, the sequences are the same for the versions I have checked.
I have only tested the denoising of Fasta input files, but not the other usage modes of the program.
Reproducing the issue
I used the test file at `test-DnoisE/sample1000.fasta`. I installed each version of DnoisE to a venv, with different versions of `pandas` (see above) depending on the DnoisE version. Replace `$DNOISE` with either `dnoise` (v1.4.0) or `python3 src/DnoisE.py` (earlier versions), and run the following commands:
Compare the outputs
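Since the outputs can differ in sort order alone, an order-insensitive comparison is useful. The following is a minimal sketch of one, not the commands used in this report. The function names and the file names in the usage note are hypothetical, and it assumes plain FASTA output.

```python
from collections import Counter


def parse_fasta(lines):
    """Return a list of (header, sequence) tuples from an iterable of FASTA lines."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def same_records(records_a, records_b):
    """Compare two record lists as multisets, ignoring their order."""
    return Counter(records_a) == Counter(records_b)


def compare_files(path_a, path_b):
    """Order-insensitive comparison of two FASTA files."""
    with open(path_a) as fa, open(path_b) as fb:
        return same_records(parse_fasta(fa), parse_fasta(fb))
```

For example, `compare_files("out_c1.fasta", "out_c2.fasta")` (hypothetical file names) returns `False` if records are missing from one file and `True` if the files differ only in record order.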