TURB max value too low.

aodn / imos-toolbox

Graphical tool for QC'ing and NetCDF'ing oceanographic datasets

GNU General Public License v3.0


TURB max value too low. #632

Open sspagnol opened 4 years ago

sspagnol commented 4 years ago

Just to highlight that I think the TURB max values in imosParameters.txt (listed below) are too low.

TURB,                1, sea_water_turbidity,                                                                      1,             ,              ,                                  U, 999999.0, 0.0,      4.0,      float
TURBF,               0, sea_water_turbidity_in_FTU,                                                               1,             ,              ,                                  U, 999999.0, 0.0,      4.0,      float

Might be a workflow issue on my part? When I loaded a dataset with TURB values up to 1000 NTU, imosToolbox displayed the full range of the data, but on export it set the max to 4.0.
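To make the numbers concrete, here is a minimal sketch of reading the limits out of one of the lines quoted above. It assumes the last four comma-separated fields are fill value, valid min, valid max and netCDF type; that reading is inferred from the quoted lines and the 4.0 maximum mentioned here, not from the toolbox source. `parse_parameter_line` is a hypothetical helper, not a toolbox function.

```python
# Sketch only: field positions are inferred from the two quoted lines,
# not taken from the toolbox source.
TURB_LINE = (
    "TURB,                1, sea_water_turbidity,"
    "                     1,             ,              ,"
    "                                  U, 999999.0, 0.0,      4.0,      float"
)


def parse_parameter_line(line):
    """Split one comma-separated parameter line and read its trailing fields."""
    fields = [field.strip() for field in line.split(",")]
    return {
        "name": fields[0],
        "fill_value": float(fields[-4]),  # assumed meaning of this column
        "valid_min": float(fields[-3]),   # assumed meaning of this column
        "valid_max": float(fields[-2]),   # assumed meaning of this column
        "netcdf_type": fields[-1],
    }


turb = parse_parameter_line(TURB_LINE)
print(turb["valid_min"], turb["valid_max"])  # 0.0 4.0
```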

ocehugo commented 4 years ago

@sspagnol, so there are two issues:

1. In your opinion, the maximum allowed value for TURB should be increased.
2. When opening the GUI, the limits in the imosParameters.txt file are not obeyed.

(1) I think a use case where TURB > 4.0 is still valid is a strong argument for increasing this global limit. However, increasing the TURB limit above 4 would need a wider discussion, since this value was agreed on some time ago.

(2) This is expected behaviour: the RAW data is shown first. To clip the data to the limits described in imosParameters.txt you have to run the QC first (click on qc and select imosGlobalRangeQC). Let me know if imosGlobalRangeQC is not being applied to TURB (it should be).

PS: You can indeed change the limits in the imosParameters.txt file, but the result would not be IMOS compliant.

petejan commented 4 years ago

Most of the FLNTUS instruments we use on SOTS moorings have a measurement range of 0-25 NTU. In the QC report we selected a range of 0-2000 AD counts, which is 12.5 NTU in our case; we considered this well above anything we would expect at SOTS.

If you think measurements above 4 NTU are valid for your site, then the limit should be increased. Having a 'standard' which is below what we're measuring in the ocean does not sound like a good idea to me, much like the NOAA ozone hole measurements.

ocehugo commented 4 years ago

My point here is: if we need to change a global limit, it is better to discuss what the new limit should be.

> Most of the FLNTUS instruments we use on SOTS moorings have a measurement range of 0-25 NTU. In the QC report we selected a range of 0-2000 AD counts, which is 12.5 NTU in our case; we considered this well above anything we would expect at SOTS.

One more reason to increase the current limit: two use cases where the current limit is not appropriate. But again, should we set the limit to 25, to 12.5, etc.?

> If you think measurements above 4 NTU are valid for your site, then the limit should be increased.

I agree, but this is the kind of thing that needs to be communicated so we can update the global limits, otherwise, why use global limits?

> Having a 'standard' which is below what we're measuring in the ocean does not sound like a good idea to me, much like the NOAA ozone hole measurements.

If the current limit is not enough, we update it. We just need to talk to each other to change to a new value that is good for everybody so we avoid recurrent changes.

petejan commented 4 years ago

I would set the global limit to the instrument range (or just below it). On SOTS we decided that it was just as good to use 12.5 here as 25.

Why use a global limit? Because that is the limit where the instrument (the Wet-LABS FLNTUS, in our case) is no longer recording correct data. If the instrument records 20 NTU and the data provider thinks that this is a valid ocean value, then 20 NTU should be reported.

ocehugo commented 4 years ago

> I would set the global limit to the instrument range (or just below it)

I agree. However, the clipping is global across locations and instruments, because it's applied at the variable level (TURB). At best, it should be slightly below the nominal range of all instruments and all regions: max(max_realistic_value_of_all_facilities_and_all_regions). At worst, it should be max(max_nominal_value_of_all_instruments).
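As a rough illustration of those two candidate limits, the sketch below takes the maximum over per-instrument and per-site values. The 25 NTU and 12.5 NTU figures come from this thread; the other entries and all dictionary keys are hypothetical placeholders.

```python
# Illustrative numbers only. 25 NTU and 12.5 NTU are quoted in this thread;
# the other entries and all dictionary keys are hypothetical.
nominal_max_ntu = {
    "FLNTUS": 25.0,
    "hypothetical high-range sensor": 1000.0,
}
realistic_max_ntu = {
    "SOTS": 12.5,
    "hypothetical turbid site": 200.0,
}

# "At worst": the largest nominal range of any instrument.
worst_case_limit = max(nominal_max_ntu.values())
# "At best": the largest realistic value over all facilities and regions.
best_case_limit = max(realistic_max_ntu.values())

print(worst_case_limit, best_case_limit)  # 1000.0 200.0
```

Either way, a single variable-level limit ends up set by the most extreme instrument or site, which is why it cannot be tight for everyone.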

I would not recommend changing imosParameters.txt to do regional clipping or to limit ranges in cherry-picked deployments. It is not traceable or reproducible, it changes the netCDF valid_min/max attributes, and it creates merge conflicts with new updates. The RegionalRangeQC machinery exists for this purpose.

petejan commented 4 years ago

For me this is the trap that NOAA fell into with Ozone measurements.

If your instrument can measure from 0 to 25 NTU, then this should be the range (instrument based range). If you have a reading of say 20 (above the current global range of 4) and the sensor functioned correctly, then what should we say about the ocean turbidity?

Two other points,

- Zero (0) is a difficult cut-off: due to noise the value can be small and positive or negative, so a cut-off of zero ends up biasing the average.
- Does the global range end up setting the netCDF attributes valid_max and valid_min? These should be set to the instrument range, not the expected ocean range. Values outside the expected ocean range could maybe be flagged. If valid_max/valid_min are set, most tools will substitute the fill_value outside this range.

Part of the issue is that there are no instrument-level ranges for parameters, so one ends up using the global range.
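The zero cut-off point can be checked with a few made-up numbers. The sketch below assumes a sensor in water whose true turbidity is 0, with noise symmetric around that value; the readings are hypothetical.

```python
from statistics import mean

# Hypothetical readings: true value 0, noise symmetric around it.
readings = [-0.5, -0.25, 0.0, 0.25, 0.5]

# Rejecting everything below a cut-off of zero keeps only the upper half.
kept = [r for r in readings if r >= 0.0]

print(mean(readings))  # 0.0
print(mean(kept))      # 0.25
```

The average of the retained values is pushed above the true value, which is the bias described in the first point.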

ocehugo commented 4 years ago

I just checked, and TURB is not clipped regionally anywhere. Moreover, we are probably only using FLNTUS from SBE (all files at AODN containing a TURB variable have either SBE19 or Wet Labs entries as instruments). It's not straightforward to correctly track the sensor used, since we only track the top-level instrument name.

Is anyone aware of other sensors across IMOS for TURB (in the past or in the roadmap)? If not, I may reference this discussion to increase the limit to 25.

> If your instrument can measure from 0 to 25 NTU, then this should be the range (instrument based range). If you have a reading of say 20 (above the current global range of 4) and the sensor functioned correctly, then what should we say about the ocean turbidity?

Yep, IMO the best solution is clipping by instrument model range, followed by a regional range. However, this is not the current state of affairs, which is clipping to a pre-defined range (defined by a group/someone) at the top of the QC execution by imosGlobalRangeQC.m. The existing codebase is not ready for clipping by instrument model. For that, we need new code plus a verified db of instrument makers, names and nominal ranges for all variables. We should add this together with the uncertainties issue.

> Two other points,
>
> - Zero (0) is a difficult cut-off: due to noise the value can be small and positive or negative, so a cut-off of zero ends up biasing the average.

Not applicable. The idea is that any correction to the data is done at the pre-processing step. The QC is done afterwards, so the data is assumed to be in the valid range of the variable.

> - Does the global range end up setting the netCDF attributes valid_max and valid_min? These should be set to the instrument range, not the expected ocean range. Values outside the expected ocean range could maybe be flagged. If valid_max/valid_min are set, most tools will substitute the fill_value outside this range.

The imosParameters.txt sets the valid_min/max attributes for all variables. Hence, the valid_min/max are "static" and not "dynamic". I like the idea of having a "dynamic" min/max, but I'm not sure if all software out there will like this, particularly regarding aggregation of netcdf or visualisation tools.
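A small sketch of why the choice matters downstream. It assumes a reader that replaces anything outside valid_min/valid_max with the fill value, as described earlier in the thread; both functions are hypothetical helpers, and the sample values are made up. The 0.0, 4.0 and 999999.0 figures are the TURB entries quoted at the top of this issue.

```python
FILL_VALUE = 999999.0  # fill value from the quoted TURB line


def mask_outside_valid_range(values, valid_min, valid_max, fill_value=FILL_VALUE):
    """Mimic a reader that substitutes the fill value outside the valid range."""
    return [v if valid_min <= v <= valid_max else fill_value for v in values]


def dynamic_valid_range(values):
    """A 'dynamic' range: taken from the data actually present in the file."""
    return min(values), max(values)


turb = [0.5, 3.9, 20.0, 150.0]  # made-up NTU samples

# "Static" range from imosParameters.txt: two samples are lost to the reader.
masked = mask_outside_valid_range(turb, 0.0, 4.0)
print(masked)  # [0.5, 3.9, 999999.0, 999999.0]

# "Dynamic" range: nothing is masked, but every file carries different attributes.
print(dynamic_valid_range(turb))  # (0.5, 150.0)
```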

petejan commented 4 years ago

I know of only 3 instruments that measure TURB, and all the CSIRO-deployed ones that I know of are based on a WetLABS FLNTUS or FLBB2 (the WetLABS WQM uses the same hardware as a FLNTUS, and the SBE19 has an externally connected FLNTUS but logs the data internally).

So the ranges applied to data are:

- netCDF valid maximum/minimum (which takes its values from imosParameters.txt)
- Global Range (which applies the value from imosParameters.txt)
- Regional Range (from imosRegionalRangeQC.txt)

Missing is:

- a per-instrument range

I would suggest that the netCDF range should be larger than anything ever possible by all instruments ever (-1 to 100 may be good for TURB, for instance), as different tools handle these differently downstream, and some tools delete data outside this range. Global range and regional range should only flag the data good/../bad. The instrument range should come from the file parser and manufacturer, and should also flag the data as bad on parsing. We may have to think about how best to implement this, as RAW data coming from the parser is not flagged at this stage.
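One way to read this proposal is as a layered check in which only flags change and no sample is deleted. The sketch below is an assumption-laden illustration: the flag labels, the function name and the regional range are hypothetical, and only the 0-25 NTU instrument range and the -1 to 100 netCDF range come from this thread.

```python
NETCDF_VALID_RANGE = (-1.0, 100.0)  # wide range suggested above for TURB
INSTRUMENT_RANGE = (0.0, 25.0)      # FLNTUS range quoted in this thread
REGIONAL_RANGE = (0.0, 12.5)        # hypothetical regional range


def flag_sample(value, instrument_range, regional_range):
    """Return a flag label; the value itself is never modified or removed."""
    lo, hi = instrument_range
    if not lo <= value <= hi:
        return "bad"            # outside what the instrument can measure
    lo, hi = regional_range
    if not lo <= value <= hi:
        return "probably_bad"   # measurable, but unexpected for the region
    return "good"


samples = [3.0, 20.0, 30.0]
flags = [flag_sample(v, INSTRUMENT_RANGE, REGIONAL_RANGE) for v in samples]
print(flags)  # ['good', 'probably_bad', 'bad']

# All samples stay inside the wide netCDF range, so no reader would mask them.
print(all(NETCDF_VALID_RANGE[0] <= v <= NETCDF_VALID_RANGE[1] for v in samples))  # True
```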

The TURB and CPHL ranges were changed by,

https://github.com/aodn/imos-toolbox/commit/fe333fc07e856eb4139d440bb77e92afea48850c#diff-433f598e61e7d38a65e9d9bceb8cf026

Which is probably why @sspagnol raised the issue.

ocehugo commented 4 years ago

> and some tools delete data outside this range.

Exactly - valid_[min, max, range] is the standard way to clip ranges.

We should even drop the global limit if we get a compulsory instrument (and/or regional) clip at the PP/QC level.

For example, and generically, we can't predict that RAW data will fall within a certain range (say [-1, 100]). Maybe the data needs some recentering (the raw range is [-10, 10]). If a global limit is applied earlier, a bias is generated. Hence, for RAW data, the correct way is to not use the attributes at all (or to use dynamic ones based on the actual range).

When doing PP/QC, the limit should be applied in the order Instrument->regional limits since the recentering/pre-processing corrects the variable for valid ranges. Since the instrument/regional is bounded below anyway, a top-level global limit is not even necessary (or the instrument limit is promoted to be global).

sspagnol commented 4 years ago

> We should even drop the global limit if we get a compulsory instrument (and/or regional) clip at the PP/QC level.

If I understand correctly, I agree with dropping the global limits QC, so no clipping on import using imosParameters.txt. So what gets applied when you click the 'qc data' button?

> I would suggest that the netCDF range should be larger than anything ever possible by all instruments ever (-1 to 100 may be good for TURB, for instance)

We do have a 1000 NTU sensor on NRSDAR to capture the few occasions that have data at the 100-200 NTU level; mostly, anything above 400 NTU is spiky stuff.

> The instrument range should come from the file parser and manufacturer, and should also flag the data as bad on parsing. We may have to think about how best to implement this, as RAW data coming from the parser is not flagged at this stage.

Just checking that you are thinking of something like:

For Wetlabs ECO NTU, we can use the dev file to calculate max NTU value = (16380 - offset)*scale [this needs to be checked, as I am never sure what setting the instrument to ASV1/2/4 does], then maybe round up to the nearest 10?

For TURB in an SBE cnv file, if it has the xml header included, then search for things like

#   <sensor Channel="5" >
#     <!-- A/D voltage 1, Turbidity Meter, WET Labs, ECO-NTU -->
#     <TurbidityMeter SensorID="67" >
#       <SerialNumber>NTUS-438</SerialNumber>
#       <CalibrationDate>04/01/17</CalibrationDate>
#       <ScaleFactor>188.459000</ScaleFactor>
#       <!-- Dark output -->
#       <DarkVoltage>0.037000</DarkVoltage>
#     </TurbidityMeter>
#   </sensor>

and max NTU value = (5.0 - offset)*scale.
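A minimal sketch of that calculation for the cnv header above. It assumes the header lines are prefixed with `#` exactly as shown, that DarkVoltage is the offset and ScaleFactor the scale, and that the channel tops out at 5.0 V, all as proposed in this comment. The function names are hypothetical. The same `max_ntu` formula would cover the ECO dev-file case with 16380 counts in place of 5.0 V.

```python
import math
import xml.etree.ElementTree as ET

# The header fragment quoted above, including the leading '#' of each cnv line.
CNV_HEADER = """\
#   <sensor Channel="5" >
#     <!-- A/D voltage 1, Turbidity Meter, WET Labs, ECO-NTU -->
#     <TurbidityMeter SensorID="67" >
#       <SerialNumber>NTUS-438</SerialNumber>
#       <CalibrationDate>04/01/17</CalibrationDate>
#       <ScaleFactor>188.459000</ScaleFactor>
#       <!-- Dark output -->
#       <DarkVoltage>0.037000</DarkVoltage>
#     </TurbidityMeter>
#   </sensor>
"""


def max_ntu(max_output, offset, scale):
    """Largest reportable NTU: (max output - offset) * scale."""
    return (max_output - offset) * scale


def turbidity_max_from_cnv_header(header, max_volts=5.0):
    """Strip the '#' prefixes, parse the XML, and apply the formula."""
    xml_text = "\n".join(line.lstrip("#") for line in header.splitlines())
    meter = ET.fromstring(xml_text).find("TurbidityMeter")
    scale = float(meter.findtext("ScaleFactor"))
    dark = float(meter.findtext("DarkVoltage"))
    return max_ntu(max_volts, dark, scale)


raw_max = turbidity_max_from_cnv_header(CNV_HEADER)
rounded_max = math.ceil(raw_max / 10) * 10  # round up to the nearest 10

print(round(raw_max, 2), rounded_max)  # 935.32 940
```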

And if parsers cannot provide max NTU value, use imosParameters.txt (TURB min=0, max=1000) as netcdf valid_min and valid_max?

ocehugo commented 4 years ago

> If I understand correctly, I agree with dropping the global limits QC, so no clipping on import using imosParameters.txt. So what gets applied when you click the 'qc data' button?

FYI - It's just an idea ATM. If you click the 'qc data' button and run the defaults, you run imosGlobalRangeQC. The idea above is to have a substitute called imosInstrumentRangeQC that will clip based on instrument range. The only thing I'm skeptical about here is that, IMO, clipping to instrument range should be a manufacturer software problem, not ours. Hence, it may have limited usability in the long term.

By using an InstrumentRange table, we are being more granular and solving 2 problems: (a) "the global range is not wide enough for my instrument/region", and (b) the impact of updating a particular instrument's range is more contained.

PS: This can be very simple in the beginning and fallback to imosParameters limits in any non-handled case.
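A sketch of what such a lookup with fallback might look like. The table contents, the key layout and the function name are hypothetical. The (0.0, 4.0) global range is the TURB entry from imosParameters.txt quoted at the top of this issue, and 0-25 NTU is the FLNTUS range mentioned earlier in the thread.

```python
# Global ranges as they stand in imosParameters.txt (TURB entry from this issue).
GLOBAL_RANGES = {"TURB": (0.0, 4.0)}

# Hypothetical instrument table keyed by (instrument, variable).
INSTRUMENT_RANGES = {("WET Labs FLNTUS", "TURB"): (0.0, 25.0)}


def lookup_range(instrument, variable):
    """Prefer the instrument-specific range; fall back to the global one."""
    return INSTRUMENT_RANGES.get((instrument, variable), GLOBAL_RANGES[variable])


print(lookup_range("WET Labs FLNTUS", "TURB"))  # (0.0, 25.0)
print(lookup_range("unknown sensor", "TURB"))   # (0.0, 4.0)
```

Updating one instrument's entry then only affects deployments of that instrument, while anything not in the table behaves as it does today.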

> For Wetlabs ECO NTU, we can use the dev file to calculate max NTU value = (16380 - offset)*scale [this needs to be checked, as I am never sure what setting the instrument to ASV1/2/4 does], then maybe round up to the nearest 10?

You are touching on a related but different problem: the fact that some limits, coefficients or scales needed to fix/adjust the data are present within the instrument files. Ideally, we would obtain all the information from the files themselves, but this is not the case ATM. Moreover, sometimes the data in the instrument may be outdated and/or another coefficient/adjustment is required.

Regarding instrument limits, the idea is a file with limits from the spec sheets of instrument makers (as for uncertainties). We need to do something similar for uncertainties, so maybe it's worth doing instrument ranges too. There are also a lot of .txt files around with limits, coefficients, etc. that would be nice to have in a centralized file organized by instrument.