Closed kfritzsc closed 8 years ago
@kfritzsc Great pull request! Thanks for contributing this to nmrglue. The new function and modification to the existing code seem reasonable to me.
I have very little experience with Bruker spectrometers and even less with referencing spectra collected on these instruments. Your analysis seems well though out and is correct in your example.
I think the only question before merging is the default value for the scale_data
parameter to bruker.read_pdata
. On the one hand a number of things can effect the scaling of NMR data, the amplifier gain, number of transients, the definition of the FFT used, etc and therefore matching the intensity of spectra between instruments and processing software can be challenging. On the hand your example shows that the intensity of data processed in nmrglue and by Bruker's software can be made to match.
Another consideration is that if the data is not scaling each data point requires 4-bytes of memory (stored as int32) where as to maintain the same precision scaled data must be stored in 8-bytes (float64, float32 only has 24 bits of precision). So scaling requires twice as much memory but allow for phasing and other operation with less change of overflow as @atomman pointed out in #28.
I'm fine with either default as long as the users can still override the choice. Since I don't use Bruker data I'll leave leave the choice to those who use such data on a more regular basis.
One final note, the final reversing of data1 in the example is not needed if fft_positive
is used in place of fft
. Perhaps Bruker is using this definition of the FFT for their processing?
@jjhelmus that is good news, thank you for your feedback!
Bruker read_pdata scale_data option I had not considered the memory increase. I now agree that scale_data should be an option. I still think that the default of scale_data should be True.
I think that most users will not mind the memory increases: I imagine most using bruker.read_pdata
will be working with 1D or 2D datasets. If memory is a concern, as it is with high dimensional datasets, I think most people would choose to work with the raw fid data and use bruker.read
(otherwise they will be transferring unnecessarily large datasets from computer-to-computer). I think this default will prevent more casual users from making errors that have scientific repercussions. With the option, more experienced users concerned about memory can load the data as ints and then be responsible for scaling the data properly if necessary.
Bruker fft
If I use fft_positive
I can remove the call to rev
. I have change the code.
To answer exactly what Bruker does will take a little more investigation but I offer the following remarks: Bruker has a paramater in the 'procs' dictionary called 'REVERSE' I almost always have this set to True.
If set this value to False and use Bruker (TopSpin v3) fft routines I get a spectrum with an incorrect ppm scale (it is reversed :) ) and the zeroth order phase is off by 180 degrees. In the nmrglue example I had to use a phase 180 degrees from the the Bruker value regardless if I used 'fft' followed by 'rev' or if I use fft_positive
. The differences in these fft conventions does not make sense to me.
At minimum however we can be sure that the final example is correct, the spectrum is of an n-alkane and as expected the largest peak resonants at 32.8 ppm (on a neat TMS scale).
I agree with your recommendation for the scale_data
parameter, changing it to True makes sense. I think with that change this PR should be good to merge.
The signs of the FFT and phase correction get confusing, the direction of the rotating frame (clockwise vs counterclockwise) the sign of the gyromagnetic ratio, if you use a left-hand or right-hand coordinate system and a few other ideas can come into play. Luckily, most of these are just matters of convention and can be ignored once you figure out which one gives you the correct answer.
I think with that change this PR should be good to merge.
Just to make sure we are on the same page: I made this change in b00d395.
Sorry about that, missed that one, merging. Thanks for your work on this @kfritzsc!
Hello nmrglue community:
In this bruker-ppm_scale branch the fileio.bruker module will use information from Bruker 'procs' files when available. If the 'procs' information is available the spectrum read by
bruker.read_pdata
,bruker.read
orbruker.read_lowmem
can be referenced correctly by using the updatedbruker.guess_udic
andfileiobase.uc_from_udic
. Please see 'examples/bruker_processed_1d/bruker_processed_1d.py' for an example. I will keep the example dataset accessible (which is referenced properly) here until these issues are resolved. This pull-request in part addresses issue #13 and I think it is a first poke at issue #21.Also, the example given should re-open issue #28. To match the fid/ser data the 'bruker.read_pdata' must be scaled with
scale_data=True
. The scale_data option should always be set to True it is a bug not to scale the data.Description Unless I am misunderstanding something, the information to correctly and reliably reference Bruker spectra must be read from the
procs
file.fileiobase.unit_conversion scheme
I think the 'acqus' file(s) must also be parsed. So now if 'acqus' files are in the normal place or given in a list they are read into the dict. (There is many other ways to reference the data using the 'procs' data without the 'acqus' data but thenunit_conversion
would have to be more complicated.)This is my first contribution to nmrglue. Please let me know how I can improve this patch.