Dynamically handle payload configurations and sensor calibrations

richardsc commented 3 years ago

This is a redo of an issue that originally only referred to the DFO east coast gliders which have an integrated SBE43 oxygen sensor instead of the Rinko AROD, but really it should be generalized to include a way of configuring pyglider to handle any glider payload configuration through the setup in the deployment.yml (the current version is hard-coded to assume GP-CTD+FLBBCD+AROD only).

If this should perhaps be split into more issues (e.g. to handle the general case vs the SBE43-specific one) just let me know.

A complication with the SBE43 is that the data stream from the sensor is raw frequency, and requires calculations using the calibration coefficients to turn it into a usable concentration. Below is an example data set from one of our recent missions, and I will also post details of the raw-to-calibrated data conversion in a follow-up comment.

sea021m059_sbe43_example_data.zip

richardsc commented 3 years ago

Calibration details for the above instrument (screenshot from the cal sheet):

and cut/paste from the PDF:

COEFFICIENTS:
Soc = 3.0636e-004 (adj)
Foffset = -785.65
Tau20 = 1.02
A = -4.3009e-003
B = 1.5565e-004
C = -1.9609e-006
E nominal = 0.036

NOMINAL DYNAMIC COEFFICIENTS
D1 = 1.92634e-4
D2 = -4.64803e-2
H1 = -3.300000e-2
H2 = 5.00000e+3
H3 = 1.45000e+3

F = instrument output (Hz); T = temperature (°C); S = salinity (PSU); K = temperature (°K)
Oxsol(T,S) = oxygen saturation (ml/l); P = pressure (dbar)

Oxygen (ml/l) = Soc * (F + Foffset) * (1.0 + A * T + B * T^2 + C * T^3 ) * Oxsol(T,S) * exp(E * P / K)

It should be obvious that this calculation also requires a function for Oxsol(T,S), which I think is a fairly standard thing, but I'm not an O2 expert so I'll look into that more to see what we're doing ATM.
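For reference, a minimal Python sketch of the cal-sheet equation above. The function name and dictionary layout are illustrative, not part of pyglider. Oxsol(T,S) is deliberately left as an input to be supplied by whatever solubility routine ends up being used. The dynamic coefficients (Tau20, D1, D2, H1 to H3) do not appear in the quoted equation, so they are not used here.

```python
import math

# Coefficients copied from the cal sheet quoted above.
SBE43_CAL = {
    "Soc": 3.0636e-4,
    "Foffset": -785.65,
    "A": -4.3009e-3,
    "B": 1.5565e-4,
    "C": -1.9609e-6,
    "E": 0.036,
}


def sbe43_oxygen(freq_hz, temp_c, pres_dbar, oxsol_ml_l, cal=SBE43_CAL):
    """Oxygen (ml/l) from SBE43 raw frequency, following the cal-sheet equation.

    oxsol_ml_l is Oxsol(T, S) in ml/l, supplied by the caller; it is not
    computed here.
    """
    temp_k = temp_c + 273.15  # K in the cal-sheet equation
    # Cubic temperature correction: 1 + A*T + B*T^2 + C*T^3
    temp_term = 1.0 + cal["A"] * temp_c + cal["B"] * temp_c**2 + cal["C"] * temp_c**3
    # Pressure correction: exp(E * P / K)
    pres_term = math.exp(cal["E"] * pres_dbar / temp_k)
    return cal["Soc"] * (freq_hz + cal["Foffset"]) * temp_term * oxsol_ml_l * pres_term
```

This version works on scalars; for whole time series the same expression should carry over to arrays by swapping `math.exp` for `numpy.exp`.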

jklymak commented 3 years ago

In the deployment.yml there is an example of a longitude that requires a conversion function, which needs to be supplied in utils.py. However, this just does a y = convert(x) type conversion. There is no mechanism for a more complicated calibration at the L0 stage (i.e. one where convert looks up any parameters, or uses other variables in the data stream).
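A rough sketch of what a y = convert(x) mechanism amounts to. Everything here is hypothetical: the helper `nmea_to_deg`, the assumption that longitude arrives as NMEA-style DDMM.mmm, the `CONVERSIONS` registry, and the entry field names are for illustration only and are not taken from pyglider.

```python
def nmea_to_deg(x):
    """Convert NMEA-style DDMM.mmm to decimal degrees (hypothetical helper)."""
    deg = int(x / 100.0)  # int() truncates toward zero, so the sign is preserved
    minutes = x - deg * 100.0
    return deg + minutes / 60.0


# Registry of single-argument conversions, looked up by the name given in the yaml.
CONVERSIONS = {"nmea_to_deg": nmea_to_deg}

# Stand-in for one variable entry in the yaml; the field names are illustrative.
longitude_entry = {"source": "NAV_LONGITUDE", "conversion": "nmea_to_deg"}


def convert_column(entry, raw):
    """Apply the entry's named conversion to every value in one raw column."""
    func = CONVERSIONS[entry["conversion"]]
    return [func(value) for value in raw[entry["source"]]]


raw = {"NAV_LONGITUDE": [-12345.0, -12330.0]}
lon = convert_column(longitude_entry, raw)
```

Because `func` only ever receives the values of one column, it has no access to temperature, pressure, or calibration coefficients, which is the limitation described above.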

If it were me, I would not do this calibration step at the L0 stage. In my philosophy, L0 is just what comes out of the glider, which in this case is raw frequency. If that's what you get, I'd say that is what the L0 conversion should give you. At the L1 level or higher you can post-process to do the calibration. If you want that done early on, you would just add a subroutine to your processDeployment.py driver file that does that step on the NetCDF output. We could even make such a step a utility, and it could look up the cals in the yml file.

callumrollo commented 3 years ago

I'm interested in assisting with this. I take it that sensor payload is specified on a per-mission basis in deploymentRealtime.yml? The data from this would then be used by functions like seaexplorer.raw_to_rawnc which is currently hardcoded. Is this correct?

As a first goal, we could at least read the sensor configs from the yaml. Calibrations could come later, if they are deemed appropriate for L0 products.

Happy to start hacking on this. Out of interest, how was this yaml setup decided on? Is there a common format it can be expected to follow?

richardsc commented 3 years ago

I'd love to see this make progress, I just have not had the time to a) learn the Python required, and b) do it myself 😆 .

Happy to help of course, including testing on a variety of different payload configurations.

jklymak commented 3 years ago

The yaml file is just something I made up, based loosely on similar files used by the Rutgers group. I think we could readily handle different instruments here.

However, as noted above, I am not terribly in favour of changing the data at the L0 translate stage. If you need to convert raw data to data with units, I suggest a second stage that just operates on the netcdf timeseries files. Given that, I think all you are specifying are the fields supplied by the SeaExplorer and the attributes to give them at the L0 level.

callumrollo commented 3 years ago

Cool. The SeaExplorers at VOTO have their sensors specified in a file called seapayload.cfg which is not well standardised. I'll work to transform this into a yaml like yours, then use that yaml for detecting sensors to be read from the csv files.

I think processing different instruments should be a fairly simple procedure. We'll just need to test it out on lots of payloads to make sure it covers the bases.

jklymak commented 3 years ago

That sounds great. Totally open to major surgery on this part of things - I only had one config to work with, but if we need to break some eggs to make it more flexible, now is the time to do it, rather than when it gets more widely used.

callumrollo commented 2 years ago

All data entered by the user in the glider_devices section of the yaml is now added to the netcdfs as an attribute as of PR #23 . This section would be a good place to put stuff like calibration constants for processing after data have been output by pyglider. I think we're agreed that any conversion should happen outside of the main pyglider flow. Hopefully we can now provide all the data needed to perform such a conversion at a later point.

@richardsc perhaps we could add an example where a user specifies calibration constants in the yaml?
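One possible shape for such an example, sketched in Python. The dict stands in for the parsed deployment yaml (roughly what `yaml.safe_load` would return). The `oxygen` device key, the `calibration` sub-key, and the `get_calibration` helper are all hypothetical; the coefficient values are the SBE43 ones from the cal sheet earlier in this thread.

```python
# Stand-in for the parsed deployment yaml. The "oxygen" device key and the
# "calibration" sub-key are hypothetical; only "glider_devices" comes from pyglider.
deployment = {
    "glider_devices": {
        "oxygen": {
            "make": "Sea-Bird",
            "model": "SBE43",
            "calibration": {
                "Soc": 3.0636e-4,
                "Foffset": -785.65,
                "A": -4.3009e-3,
                "B": 1.5565e-4,
                "C": -1.9609e-6,
                "E": 0.036,
            },
        },
    },
}

# Coefficients the cal-sheet oxygen equation needs.
REQUIRED = ("Soc", "Foffset", "A", "B", "C", "E")


def get_calibration(deployment, device, required=REQUIRED):
    """Return the required calibration coefficients for one device as floats."""
    cal = deployment["glider_devices"][device].get("calibration", {})
    missing = [key for key in required if key not in cal]
    if missing:
        raise KeyError(f"{device} is missing calibration coefficients: {missing}")
    # float() guards against a value having been read in as a string.
    return {key: float(cal[key]) for key in required}


cal = get_calibration(deployment, "oxygen")
```

A post-processing step outside the main pyglider flow could then pass `cal` to whatever function implements the raw-to-calibrated conversion.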

c-proof / pyglider

Dynamically handle payload configurations and sensor calibrations #3