jjhelmus / nmrglue

A module for working with NMR data in Python
BSD 3-Clause "New" or "Revised" License
211 stars 86 forks source link

Fix: Bruker ppm scale and intensity scale. #43

Closed kfritzsc closed 8 years ago

kfritzsc commented 8 years ago

Hello nmrglue community:

In this bruker-ppm_scale branch the fileio.bruker module will use information from Bruker 'procs' files when available. If the 'procs' information is available the spectrum read by bruker.read_pdata, bruker.read or bruker.read_lowmem can be referenced correctly by using the updated bruker.guess_udic and fileiobase.uc_from_udic. Please see 'examples/bruker_processed_1d/bruker_processed_1d.py' for an example. I will keep the example dataset accessible (which is referenced properly) here until these issues are resolved. This pull-request in part addresses issue #13 and I think it is a first poke at issue #21.

Also, the example given should re-open issue #28. To match the fid/ser data the 'bruker.read_pdata' must be scaled with scale_data=True. The scale_data option should always be set to True it is a bug not to scale the data.

Description Unless I am misunderstanding something, the information to correctly and reliably reference Bruker spectra must be read from the procs file.

This is my first contribution to nmrglue. Please let me know how I can improve this patch.

jjhelmus commented 8 years ago

@kfritzsc Great pull request! Thanks for contributing this to nmrglue. The new function and modification to the existing code seem reasonable to me.

I have very little experience with Bruker spectrometers and even less with referencing spectra collected on these instruments. Your analysis seems well though out and is correct in your example.

I think the only question before merging is the default value for the scale_data parameter to bruker.read_pdata. On the one hand a number of things can effect the scaling of NMR data, the amplifier gain, number of transients, the definition of the FFT used, etc and therefore matching the intensity of spectra between instruments and processing software can be challenging. On the hand your example shows that the intensity of data processed in nmrglue and by Bruker's software can be made to match.

Another consideration is that if the data is not scaling each data point requires 4-bytes of memory (stored as int32) where as to maintain the same precision scaled data must be stored in 8-bytes (float64, float32 only has 24 bits of precision). So scaling requires twice as much memory but allow for phasing and other operation with less change of overflow as @atomman pointed out in #28.

I'm fine with either default as long as the users can still override the choice. Since I don't use Bruker data I'll leave leave the choice to those who use such data on a more regular basis.

jjhelmus commented 8 years ago

One final note, the final reversing of data1 in the example is not needed if fft_positive is used in place of fft. Perhaps Bruker is using this definition of the FFT for their processing?

kfritzsc commented 8 years ago

@jjhelmus that is good news, thank you for your feedback!

Bruker read_pdata scale_data option I had not considered the memory increase. I now agree that scale_data should be an option. I still think that the default of scale_data should be True.

I think that most users will not mind the memory increases: I imagine most using bruker.read_pdata will be working with 1D or 2D datasets. If memory is a concern, as it is with high dimensional datasets, I think most people would choose to work with the raw fid data and use bruker.read (otherwise they will be transferring unnecessarily large datasets from computer-to-computer). I think this default will prevent more casual users from making errors that have scientific repercussions. With the option, more experienced users concerned about memory can load the data as ints and then be responsible for scaling the data properly if necessary.

Bruker fft If I use fft_positive I can remove the call to rev. I have change the code.

To answer exactly what Bruker does will take a little more investigation but I offer the following remarks: Bruker has a paramater in the 'procs' dictionary called 'REVERSE' I almost always have this set to True. If set this value to False and use Bruker (TopSpin v3) fft routines I get a spectrum with an incorrect ppm scale (it is reversed :) ) and the zeroth order phase is off by 180 degrees. In the nmrglue example I had to use a phase 180 degrees from the the Bruker value regardless if I used 'fft' followed by 'rev' or if I use fft_positive. The differences in these fft conventions does not make sense to me.

At minimum however we can be sure that the final example is correct, the spectrum is of an n-alkane and as expected the largest peak resonants at 32.8 ppm (on a neat TMS scale).

jjhelmus commented 8 years ago

I agree with your recommendation for the scale_data parameter, changing it to True makes sense. I think with that change this PR should be good to merge.

The signs of the FFT and phase correction get confusing, the direction of the rotating frame (clockwise vs counterclockwise) the sign of the gyromagnetic ratio, if you use a left-hand or right-hand coordinate system and a few other ideas can come into play. Luckily, most of these are just matters of convention and can be ignored once you figure out which one gives you the correct answer.

kfritzsc commented 8 years ago

I think with that change this PR should be good to merge.

Just to make sure we are on the same page: I made this change in b00d395.

jjhelmus commented 8 years ago

Sorry about that, missed that one, merging. Thanks for your work on this @kfritzsc!