paucablop / chemotools

Integrate your chemometric tools with the scikit-learn API 🧪 🤖
https://paucablop.github.io/chemotools/
MIT License

Implement maximum entropy deconvolution #123

Open MothNik opened 3 months ago

MothNik commented 3 months ago

🚶➡️🏃 Proposed Enhancements

In some spectroscopic fields, such as UV/Vis or MIR spectroscopy of liquid systems, peaks can overlap very strongly. This limits the usefulness of many spectroscopic analysis techniques, e.g., multivariate fitting with reference spectra. Resolving the peaks better while keeping the noise suppressed would be a nice additional pre-processing step. In fact, derivative spectroscopy is only a workaround to achieve just this. This is easy to see, e.g., in the derivatives of the MIR spectra of some enzymes for protein analysis, taken from

Baldassarre, et al., Simultaneous Fitting of Absorption Spectra and Their Second Derivatives for an Improved Analysis of Protein Infrared Spectra, Molecules 2015, 20(7), 12599-12622

image

The second- and fourth-order derivatives reveal the overlapped peaks, but that is not easily achieved in practice, where noise limits the usefulness of differentiation.
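To make the trade-off concrete, here is a minimal sketch on synthetic data (not the spectra from the paper). It assumes two equal Gaussian bands that merge into a single maximum, made-up noise of standard deviation 0.01, and a plain finite-difference derivative without any smoothing. The helper names `count_peaks` and `second_derivative` are only illustrative.

```python
import numpy as np


def count_peaks(y, min_height):
    """Count strict local maxima of y that exceed min_height."""
    inner = y[1:-1]
    is_peak = (inner > y[:-2]) & (inner > y[2:]) & (inner > min_height)
    return int(np.count_nonzero(is_peak))


def second_derivative(y, dx):
    """Plain finite-difference second derivative, no smoothing."""
    return np.gradient(np.gradient(y, dx), dx)


# Two unit-width Gaussian bands, 1.9 widths apart: they merge into one maximum.
x = np.linspace(-10.0, 10.0, 401)
dx = x[1] - x[0]
clean = np.exp(-0.5 * (x + 0.95) ** 2) + np.exp(-0.5 * (x - 0.95) ** 2)

# Assumed noise level, chosen only for illustration.
noise_std = 0.01
rng = np.random.default_rng(0)
noisy = clean + rng.normal(scale=noise_std, size=x.size)

d2_clean = second_derivative(clean, dx)
d2_noisy = second_derivative(noisy, dx)

# The negated second derivative separates the two bands on noise-free data ...
peaks_raw = count_peaks(clean, 0.5 * clean.max())
peaks_d2 = count_peaks(-d2_clean, 0.5 * (-d2_clean).max())

# ... but differentiation amplifies the noise by a large factor.
noise_gain = np.std(d2_noisy - d2_clean) / noise_std
print(peaks_raw, peaks_d2, noise_gain)
```

On the noise-free signal the negated second derivative shows two bands where the spectrum itself shows one, while the noise in the unsmoothed derivative ends up larger than the derivative signal itself. That is why smoothing is normally unavoidable before differentiation.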

However, Lórenz-Fonfriá & Padrós, Maximum Entropy Deconvolution of Infrared Spectra: Use of a Novel Entropy Expression Without Sign Restriction, Applied Spectroscopy, 2005, Volume 59, Number 4, proposed a quite powerful deconvolution technique based on maximum entropy. It can achieve a similar peak resolution while being more resistant to noise. All in all, it circumvents the smoothing that would be mandatory as pre-processing before taking derivatives. From a resolution perspective, the results look promising (Figure A is sharpened to Figure C):

image

🧑‍💻 Implementation details

For this approach to work, weights have to be provided (this could be achieved with the functionality added for #44 and #120). The publication provides some implementation details on how to solve the underlying nonlinear optimization problem via a hand-crafted conjugate gradient method, but I think scipy.optimize.minimize offers more functionality to solve this in a graceful fashion:
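A rough sketch of how this could look with `scipy.optimize.minimize` is given below. It is not the algorithm from the publication. It makes several assumptions: a Gaussian broadening kernel, weights interpreted as inverse noise variances, and a generic sign-unrestricted entropy form, sum of `psi - 2m - f * asinh(f / (2m))` with `psi = sqrt(f**2 + 4m**2)`, as a stand-in that may differ from the paper's expression. All names (`broadening_matrix`, `neg_entropy`, `objective`, `maxent_deconvolve`) and the parameters `lam` and `m` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def broadening_matrix(x, sigma):
    """Dense convolution matrix for an assumed Gaussian band shape (rows sum to 1)."""
    diff = x[:, None] - x[None, :]
    kernel = np.exp(-0.5 * (diff / sigma) ** 2)
    return kernel / kernel.sum(axis=1, keepdims=True)


def neg_entropy(f, m):
    """Negative sign-unrestricted entropy (stand-in form) and its gradient."""
    psi = np.sqrt(f**2 + 4.0 * m**2)
    asinh = np.arcsinh(f / (2.0 * m))
    # Zero at f = 0 and convex; its derivative with respect to f is asinh(f / 2m).
    return np.sum(2.0 * m - psi + f * asinh), asinh


def objective(f, K, y, w, lam, m):
    """Weighted least-squares misfit plus lam times the negative entropy."""
    residual = y - K @ f
    reg, reg_grad = neg_entropy(f, m)
    value = 0.5 * np.sum(w * residual**2) + lam * reg
    grad = -K.T @ (w * residual) + lam * reg_grad
    return value, grad


def maxent_deconvolve(y, K, w, lam=1.0, m=1e-3):
    """Minimise the objective from a flat (all-zero) starting spectrum."""
    f0 = np.zeros_like(y)
    # jac=True tells SciPy that objective returns (value, gradient).
    result = minimize(objective, f0, args=(K, y, w, lam, m),
                      jac=True, method="L-BFGS-B")
    return result.x, result


# Synthetic test case: two narrow bands broadened by the assumed kernel.
x = np.linspace(-10.0, 10.0, 201)
truth = (np.exp(-0.5 * ((x + 0.8) / 0.3) ** 2)
         + np.exp(-0.5 * ((x - 0.8) / 0.3) ** 2))
K = broadening_matrix(x, sigma=1.0)

noise_std = 0.002
rng = np.random.default_rng(0)
y = K @ truth + rng.normal(scale=noise_std, size=x.size)
w = np.full_like(y, 1.0 / noise_std**2)  # weights as inverse noise variance

f_hat, result = maxent_deconvolve(y, K, w, lam=1.0, m=1e-3)
print(result.nit, result.fun)
```

Since the misfit and the negative entropy are both convex in this form, a quasi-Newton method with an analytic gradient seems like a reasonable first choice. How well the bands are actually resolved depends on `lam`, `m`, and the assumed kernel width, so those would need tuning on real spectra.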

This is already quite a deep dive 🤿 into optimization theory, and I hope I can visualise 📊 it better once the basic implementation is settled.