[FIX] Replace lfdr scipy interpolation function with FITPACK

PyProphet / pyprophet

PyProphet: Semi-supervised learning and scoring of OpenSWATH results.

http://www.openswath.org

BSD 3-Clause "New" or "Revised" License


[FIX] Replace lfdr scipy interpolation function with FITPACK #41

Closed grosenberger closed 5 years ago

grosenberger commented 5 years ago

This PR provides a workaround for issue B reported by @bretttully in https://github.com/PyProphet/pyprophet/issues/38.

In this case, the target and decoy protein score distributions are completely separate, resulting in monotonic p-values from the empirical p-value function, which are not compatible with the spline fitting method used in lfdr. Replacing the function with a FITPACK wrapper produces NAs in this extreme case to indicate that the estimation of local FDR is not applicable.
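To make the edge case concrete, here is a minimal sketch of empirical p-values for fully separated score distributions. It is a simplified stand-in and not PyProphet's own p-value function: the name `empirical_pvalues`, the absence of a pseudocount, and the score values are all assumptions made for illustration.

```python
import numpy as np


def empirical_pvalues(target_scores, decoy_scores):
    """Hypothetical helper: fraction of decoys scoring at least as high as each target."""
    decoys = np.sort(np.asarray(decoy_scores, dtype=float))
    targets = np.asarray(target_scores, dtype=float)
    # Index of the first decoy >= target; everything from there on counts as ">=".
    n_greater_equal = decoys.size - np.searchsorted(decoys, targets, side="left")
    return n_greater_equal / decoys.size


# Made-up scores: every decoy scores below every target.
decoys = np.linspace(-3.0, -1.0, 50)
targets = np.linspace(2.0, 4.0, 50)

pvals = empirical_pvalues(targets, decoys)
print(np.unique(pvals))
```

In this simplified version every target ends up with the same p-value, so the p-values carry no spread from which a density could be fitted. That is the kind of input for which reporting NA for the local FDR is more honest than returning a number.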

bretttully commented 5 years ago

When you say that NaN will be returned, is this what would be reported in the final file? Or would these pqps be assigned an m-score of 0 to indicate there is no chance of a false discovery? Will this be consistent with other software that may be used downstream?

grosenberger commented 5 years ago

The q-value / FDR can be estimated as before and will be 0. The posterior error probability, on the other hand, will be NA because the estimation method is not applicable. If we want to replace it with 0 for downstream use, we would need to print a warning.
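If the replacement route were taken, it could look roughly like the sketch below. This is not part of the PR: `fill_missing_pep`, its `fill_value` parameter, and the warning text are hypothetical and only illustrate the "replace with 0 and warn" idea.

```python
import warnings

import numpy as np


def fill_missing_pep(pep, fill_value=0.0):
    """Hypothetical helper: replace NaN posterior error probabilities and warn about it."""
    pep = np.asarray(pep, dtype=float)
    missing = np.isnan(pep)
    if missing.any():
        warnings.warn(
            f"{int(missing.sum())} posterior error probabilities could not be "
            f"estimated and were set to {fill_value}.",
            RuntimeWarning,
        )
        # np.where returns a new array, so the caller's input is left untouched.
        pep = np.where(missing, fill_value, pep)
    return pep


# Made-up values: the middle entry stands for a case where lfdr returned NA.
cleaned = fill_missing_pep([0.01, np.nan, 0.2])
```

Keeping NA as the default and making the replacement an explicit step means downstream tools only see a 0 when someone has chosen to accept it.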

bretttully commented 5 years ago

At the moment, we aren't using the posterior error prob, so this looks fine for our purposes.

Out of curiosity, do you have plans to put cases like this into a test harness?

grosenberger commented 5 years ago

It would definitely be great to improve the test coverage of PyProphet. Right now, only the low-level implementation is really tested; for the more advanced workflows, only a few smoke tests exist. It would be great if this could be extended, but unfortunately my time is a bit restricted.