wfondrie / mokapot

Fast and flexible semi-supervised learning for peptide detection in Python
https://mokapot.readthedocs.io
Apache License 2.0

max_workers issue #104

Closed wenbostar closed 9 months ago

wenbostar commented 10 months ago

The cross validation might not work as expected when using multiple threads.

(mokapot_test) mokapot -w 3 feature.pin
[INFO]
[INFO] === Analyzing Fold 1 ===
[INFO] === Analyzing Fold 2 ===
[INFO] === Analyzing Fold 3 ===
[INFO] Finding initial direction...
[INFO] Finding initial direction...
[INFO] Finding initial direction...
[INFO]  - Selected feature score with 21657 PSMs at q<=0.01.
[INFO]  - Selected feature score with 21657 PSMs at q<=0.01.
[INFO]  - Selected feature score with 21657 PSMs at q<=0.01.

When I set -w to 3, every fold always reports the same number of PSMs passing the q-value 0.01 cutoff, as shown above. When I set -w to 1, the numbers differ between folds. I also tried printing the training data for each iteration; judging by the first several rows, the data are identical across folds.
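
Comparing whole training sets by fingerprint is more reliable than looking at the first few rows. The following is only a sketch of that check and does not use mokapot's code or API: `make_folds`, `training_rows`, `fingerprint`, and `fold_fingerprints` are hypothetical helpers, the round-robin split is an assumption, and a thread pool stands in for whatever parallelism -w actually controls.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def make_folds(n_rows, n_folds):
    """Assign row indices to folds round-robin (hypothetical splitter, not mokapot's)."""
    return [list(range(i, n_rows, n_folds)) for i in range(n_folds)]


def training_rows(folds, held_out):
    """Rows used to train one fold's model: every row outside the held-out fold."""
    return sorted(i for k, fold in enumerate(folds) if k != held_out for i in fold)


def fingerprint(rows):
    """Short hash of the row indices, so training sets can be compared at a glance."""
    return hashlib.sha256(",".join(map(str, rows)).encode()).hexdigest()[:12]


def fold_fingerprints(n_rows, n_folds, max_workers):
    """Fingerprint each fold's training set inside a worker pool."""
    folds = make_folds(n_rows, n_folds)
    train_sets = [training_rows(folds, k) for k in range(n_folds)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so result k belongs to fold k.
        return list(pool.map(fingerprint, train_sets))


prints = fold_fingerprints(n_rows=30, n_folds=3, max_workers=3)
print(prints)
print("distinct training sets:", len(set(prints)) == len(prints))
```

If the folds are passed to the workers correctly, each fold has its own fingerprint and the result does not depend on the number of workers. Identical fingerprints across folds would correspond to the behavior reported here.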

wfondrie commented 10 months ago

I've successfully reproduced this, but don't see an obvious cause in the code. I'll have to do more digging.

Hopefully I'll be able to get a big new release out next week with a fix included.