NMRLipids / MATCH

GNU General Public License v2.0

Quality measure for form factor from simulations #65

Closed ohsOllila closed 1 year ago

ohsOllila commented 5 years ago

Currently, the fitness code https://github.com/NMRLipids/MATCH/blob/master/scripts/NMRL3_analysis/analysis_NMRL3.py rates the Berger POPC simulation without cholesterol as fairly good https://github.com/NMRLipids/NmrLipidsCholXray/blob/master/FIGS/FFfitness-eps-converted-to.pdf, even though the form factor minima do not coincide with experiments: https://github.com/NMRLipids/NmrLipidsCholXray/blob/master/FIGS/FormFactors-eps-converted-to.pdf (only the 50% cholesterol and cholesterol-free systems were calculated with the corrected code; the others should be recalculated). This is probably because the locations of the minima have a lower weight in the fitness code than the ratios of peak heights at the maxima. I am not sure whether this is reasonable.

One option would be to use equation (3) from the SIMtoEXP publication, but it requires error bars for the experimental form factor. Based on discussions with Georg Pabst, it seems that these errors are not always easy to define. Maybe we could formulate a similar equation without error bars?

hsantila commented 5 years ago

You are right. In a sense, the peak height ratios are in different "units" than the peak locations, which causes them to contribute to the fitness with a different weight.

It all boils down to what one wants to prioritize when measuring the similarity: the peak locations, the peak heights, or the locations of the minima.
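To make the choice concrete, here is a minimal sketch of a measure that looks only at the locations of the minima. It is not the method used in analysis_NMRL3.py; the function names `minima_locations` and `minima_mismatch`, the parameter `n_minima`, and the synthetic curves are all hypothetical. It also assumes smooth curves, so noisy experimental data would need smoothing first.

```python
import numpy as np

def minima_locations(q, f_abs):
    """Return the q values at interior local minima of |F(q)|."""
    q = np.asarray(q, dtype=float)
    f_abs = np.asarray(f_abs, dtype=float)
    # A point counts as a local minimum if it lies below both neighbours.
    is_min = (f_abs[1:-1] < f_abs[:-2]) & (f_abs[1:-1] < f_abs[2:])
    return q[1:-1][is_min]

def minima_mismatch(q_sim, f_sim, q_exp, f_exp, n_minima=2):
    """Mean absolute difference between the first n_minima minima locations."""
    m_sim = minima_locations(q_sim, f_sim)[:n_minima]
    m_exp = minima_locations(q_exp, f_exp)[:n_minima]
    n = min(len(m_sim), len(m_exp))
    if n == 0:
        raise ValueError("no minima found in at least one of the curves")
    return float(np.mean(np.abs(m_sim[:n] - m_exp[:n])))

# Synthetic toy curves (not real form factors), only to exercise the functions.
q = np.linspace(0.05, 0.6, 551)
f_exp = np.abs(np.cos(10.0 * q))
f_sim = np.abs(np.cos(10.5 * q))
mismatch = minima_mismatch(q, f_sim, q, f_exp)
print(f"mean minima mismatch: {mismatch:.4f}")
```

A measure like this is independent of the y-scale, so it needs no scaling of the experimental data, but it ignores the peak heights entirely.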

In SIMtoEXP the quality is assessed over all points, with each point weighted by the inverse of its experimental uncertainty. We, however, don't have the error bars and should probably make an informed choice about which parts of the curve to prioritize.
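As a sketch of what this could look like (written from the description here, not quoted from the SIMtoEXP publication, and with all symbols being assumptions), a pointwise measure with generic weights $w_i$ over $N_q$ experimental points is

$$
D = \frac{1}{N_q - 1} \sum_{i=1}^{N_q} w_i \left( |F_s(q_i)| - k\,|F_e(q_i)| \right)^2 ,
\qquad
k = \frac{\sum_i w_i\, |F_s(q_i)|\, |F_e(q_i)|}{\sum_i w_i\, |F_e(q_i)|^2} ,
$$

where $F_s$ and $F_e$ are the simulated and experimental form factors and $k$ is the scaling factor that minimizes $D$. Weighting by experimental uncertainty corresponds to $w_i = 1/\Delta F_e(q_i)^2$. Without error bars, one would have to choose the weights by hand, for example $w_i = 1$ everywhere or larger $w_i$ near the minima.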

Reminder: if we choose a measure that depends on the y-values, the experimental data must first be fitted (scaled) to the simulated data.
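A minimal sketch of such a pre-fit, assuming a single multiplicative scaling factor found by unweighted least squares (the function names `scale_factor` and `unweighted_mismatch` are hypothetical, and this is not the code in analysis_NMRL3.py):

```python
import numpy as np

def scale_factor(f_sim, f_exp):
    """Least-squares k that minimises sum((f_sim - k * f_exp)**2)."""
    f_sim = np.asarray(f_sim, dtype=float)
    f_exp = np.asarray(f_exp, dtype=float)
    return float(np.sum(f_sim * f_exp) / np.sum(f_exp ** 2))

def unweighted_mismatch(q_sim, f_sim, q_exp, f_exp):
    """Scale the experiment to the simulation, then average squared residuals.

    q_sim must be increasing, because np.interp requires it.
    Returns (k, mismatch).
    """
    # Put the simulated |F(q)| on the experimental q grid.
    f_sim_on_exp = np.interp(q_exp, q_sim, np.abs(f_sim))
    f_exp_abs = np.abs(np.asarray(f_exp, dtype=float))
    k = scale_factor(f_sim_on_exp, f_exp_abs)
    resid = f_sim_on_exp - k * f_exp_abs
    # Every point gets the same weight, since no error bars are available.
    return k, float(np.sum(resid ** 2) / (len(f_exp_abs) - 1))

# Synthetic check: an "experiment" that is the simulation on an arbitrary scale.
q = np.linspace(0.05, 0.6, 200)
f_sim = np.abs(np.cos(10.0 * q))
f_exp = 0.5 * f_sim
k, mismatch = unweighted_mismatch(q, f_sim, q, f_exp)
print(k, mismatch)
```

With equal weights, the large-amplitude lobes at low q dominate the sum, so a shift in the minima may still be penalized only weakly.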

ohsOllila commented 1 year ago

This discussion has moved elsewhere; see the current status at http://nmrlipids.blogspot.com/2022/09/nmrlipids-databank-form-factor-quality.html. Therefore, I will close this issue.