Implement maximum entropy deconvolution

🚶➡️🏃 Proposed Enhancements

In some spectroscopic fields, such as UV/Vis or MIR spectroscopy of liquid systems, peaks can overlap very strongly. This limits the usefulness of many spectroscopic analysis techniques, e.g., multivariate fitting with reference spectra. Resolving the peaks better while keeping the noise suppressed would be a nice additional pre-processing step. Derivative spectroscopy is really only a workaround to achieve just this. This is easy to see, e.g., in the derivatives of the MIR spectra of some enzymes for protein analysis taken from

Baldassarre, et al., Simultaneous Fitting of Absorption Spectra and Their Second Derivatives for an Improved Analysis of Protein Infrared Spectra, Molecules 2015, 20(7), 12599-12622

The second- and fourth-order derivatives reveal the overlapped peaks, but this is not easily achieved in practice, where noise limits the usefulness of differentiation.
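To put a rough number on the noise problem, here is a small sketch. It assumes white noise, unit point spacing, and plain finite differences without any smoothing; the noise level `sigma` is made up. Under these assumptions the difference stencils (1, -2, 1) and (1, -4, 6, -4, 1) amplify the noise standard deviation by factors of $\sqrt{6}$ and $\sqrt{70}$, respectively.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.01  # assumed noise standard deviation (made-up value)
noise = rng.normal(scale=sigma, size=100_000)

# np.diff with n=2 / n=4 applies the stencils (1, -2, 1) / (1, -4, 6, -4, 1).
gain_2nd = np.diff(noise, n=2).std() / sigma
gain_4th = np.diff(noise, n=4).std() / sigma

print(f"2nd derivative noise gain: {gain_2nd:.2f} (theory: {np.sqrt(6):.2f})")
print(f"4th derivative noise gain: {gain_4th:.2f} (theory: {np.sqrt(70):.2f})")
```

At the same time, the derivative of a broad band gets smaller with every differentiation, so the signal-to-noise ratio degrades even faster than these factors suggest.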

However, Lórenz-Fonfría & Padrós, Maximum Entropy Deconvolution of Infrared Spectra: Use of a Novel Entropy Expression Without Sign Restriction, Applied Spectroscopy, 2005, Volume 59, Number 4, proposed a quite powerful technique based on maximum entropy deconvolution. It can achieve a similar peak resolution while being more resistant to noise. All in all, it circumvents the smoothing that would be mandatory as a pre-processing step before taking derivatives. From a resolution perspective, the results look promising (Figure A is sharpened to Figure C):

🧑‍💻 Implementation details

For this approach to work, weights have to be provided (could be achieved with the functionality added for #44 and #120). The publication provides some implementation details on how to solve the underlying nonlinear optimization problem via a hand-crafted conjugate gradient method, but I think `scipy.optimize.minimize` offers more functionality to solve this in a graceful fashion:

- The problem incorporates a penalty weight $\lambda$, just like the Whittaker smoother. However, adjusting it to meet the proposed reduced chi-squared criterion requires evaluating multiple $\lambda$ values, which will be far more expensive than for the Whittaker smoother, which has a straightforward linear solution. Reformulating the problem to maximize the entropy under the constraint that the reduced chi-squared is roughly 1 should be more adequate and require only a single (but longer) optimization run.
- This would also allow formulating the Jacobian and Hessian of the system in a more straightforward way. Having this kind of derivative information is crucial for nonlinear optimization if we are aiming for speed. The Jacobian and the Hessian of the chi-squared constraint can then be formulated as sparse matrices/linear operators that define the convolution operations in a very efficient way. The Jacobian and Hessian of the entropy terms, in turn, can be computed stand-alone, without having to consider the weighted sum of squared residuals term.
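A minimal sketch of this constrained formulation could look as follows. Several things here are assumptions for illustration and not taken from the publication: the entropy is a generic sign-unrestricted form, $S(x) = \sum_i \left[\psi_i - 2m - x_i \operatorname{asinh}(x_i / 2m)\right]$ with $\psi_i = \sqrt{x_i^2 + 4m^2}$; the line shape is a Gaussian; the convolution is a dense matrix for brevity; and all function names as well as the parameter `m` are hypothetical. The "roughly 1" criterion is written as the inequality $\chi^2_{red} \le 1$, which keeps the feasible set convex.

```python
import numpy as np
from scipy.optimize import NonlinearConstraint, minimize


def gaussian_kernel_matrix(n, sigma):
    """Dense matrix K such that K @ x convolves x with a normalized Gaussian."""
    idx = np.arange(n)
    K = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma) ** 2)
    return K / K.sum(axis=1, keepdims=True)


# Negative entropy -S(x) and its derivatives. ASSUMPTION: generic
# sign-unrestricted entropy, not necessarily the one from the publication.
def neg_entropy(x, m):
    psi = np.sqrt(x**2 + 4.0 * m**2)
    return -np.sum(psi - 2.0 * m - x * np.arcsinh(x / (2.0 * m)))


def neg_entropy_grad(x, m):
    return np.arcsinh(x / (2.0 * m))


def neg_entropy_hess(x, m):
    return np.diag(1.0 / np.sqrt(x**2 + 4.0 * m**2))


def make_chi2(K, y, w):
    """Reduced chi-squared of the re-convolved estimate, with derivatives."""
    n = y.size
    KtWK = (K.T * w) @ K  # constant because chi-squared is quadratic in x

    def fun(x):
        return np.sum(w * (K @ x - y) ** 2) / n

    def jac(x):
        return ((2.0 / n) * (K.T @ (w * (K @ x - y))))[None, :]

    def hess(x, v):
        # v holds the Lagrange multiplier of the single constraint
        return v[0] * (2.0 / n) * KtWK

    return fun, jac, hess


def maxent_deconvolve(y, K, w, m=1e-2, maxiter=500):
    """Maximize the entropy subject to reduced chi-squared <= 1."""
    fun, jac, hess = make_chi2(K, y, w)
    constraint = NonlinearConstraint(fun, -np.inf, 1.0, jac=jac, hess=hess)
    return minimize(
        neg_entropy, x0=y.copy(), args=(m,), method="trust-constr",
        jac=neg_entropy_grad, hess=neg_entropy_hess,
        constraints=[constraint], options={"maxiter": maxiter},
    )


# Made-up example: two overlapping bands, blurred by K, plus white noise.
rng = np.random.default_rng(0)
n, noise = 80, 0.01
grid = np.arange(n)
truth = (np.exp(-0.5 * ((grid - 35) / 1.5) ** 2)
         + 0.7 * np.exp(-0.5 * ((grid - 44) / 1.5) ** 2))
K = gaussian_kernel_matrix(n, sigma=4.0)
y = K @ truth + rng.normal(scale=noise, size=n)
w = np.full(n, 1.0 / noise**2)  # inverse-variance weights

result = maxent_deconvolve(y, K, w)
chi2 = make_chi2(K, y, w)[0]
print("converged:", result.success, "| reduced chi-squared:", chi2(result.x))
```

The entropy derivatives are purely element-wise (diagonal Hessian), while the constraint derivatives only involve products with `K` and `K.T`. For real spectra, the dense `K` could be swapped for a sparse matrix or a `scipy.sparse.linalg.LinearOperator`, with the Hessian code adapted accordingly.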

This is already a quite deep dive 🤿 into optimization theory, and I hope I can visualise 📊 it in a better way once the basic implementation is settled.
