USEPA / EPATADA

This R package can be used to compile and evaluate Water Quality Portal (WQP) data for samples collected from surface water monitoring sites on streams and lakes. It can be used to create applications that support water quality programs and help states, tribes, and other stakeholders efficiently analyze the data.
https://usepa.github.io/EPATADA/
Creative Commons Zero v1.0 Universal

Research underlying issues using the TADA::idCensoredData function "Conflict between Condition and Limit" flag #227

Open cristinamullin opened 1 year ago

cristinamullin commented 1 year ago

Research question: Are the mismatches being found a result of this hierarchy/WQP implementation or are they user submission errors?

Is your feature request related to a problem? Please describe.

The idCensoredData function in CensoredDataSuite.R flags rows where the detection condition and the detection limit type do not match:

    cens$TADA.CensoredData.Flag = ifelse(
      cens$TADA.Detection_Type %in% c("Non-Detect", "Over-Detect", "Other") &
        cens$TADA.Limit_Type %in% c("Non-Detect", "Over-Detect", "Other") &
        !cens$TADA.Detection_Type == cens$TADA.Limit_Type,
      "Conflict between Condition and Limit",
      cens$TADA.CensoredData.Flag
    )
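To make the flag's behavior concrete, here is a minimal sketch that applies the same condition to a toy data frame. The three TADA column names come from the quoted ifelse(); the row values and the helper names `known`, `both_known`, and `mismatch` are invented for illustration and are not part of the package.

```r
# Toy rows (invented); only the column names come from idCensoredData.
cens <- data.frame(
  TADA.Detection_Type = c("Non-Detect", "Non-Detect", "Over-Detect", "Other", NA),
  TADA.Limit_Type = c("Non-Detect", "Over-Detect", "Non-Detect", "Other", "Non-Detect"),
  TADA.CensoredData.Flag = NA_character_,
  stringsAsFactors = FALSE
)

known <- c("Non-Detect", "Over-Detect", "Other")

# Same test as the quoted ifelse(), split into named steps.
# %in% returns FALSE for NA, so rows with a missing class are never flagged.
both_known <- cens$TADA.Detection_Type %in% known & cens$TADA.Limit_Type %in% known
mismatch <- both_known & cens$TADA.Detection_Type != cens$TADA.Limit_Type

cens$TADA.CensoredData.Flag <- ifelse(
  mismatch,
  "Conflict between Condition and Limit",
  cens$TADA.CensoredData.Flag
)

print(cens)
```

Only rows 2 and 3 are flagged here: both fields are classified, but the classes disagree.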

Describe the solution you'd like

WQP includes a hierarchy for delivering detection limit data that looks only at Result Detection Limit Type, not at the combination of the detection limit and detection condition fields.

| Hierarchy | Result Detection Limit Type | Upper (+), lower (-), or other (0) limit |
| -- | -- | -- |
| 1 | Practical Quantitation Limit | - |
| 2 | Lower Quantitation Limit | - |
| 3 | Sample-Specific Quantitation Limit | - |
| 4 | Estimated Quantitation Limit | - |
| 5 | Contract Quantitation Limit | - |
| 6 | Minimum Reporting Level | - |
| 7 | Reporting limit | - |
| 8 | Sample-specific min detect conc | - |
| 9 | Laboratory Reporting Level | - |
| 10 | Lower Reporting Limit | - |
| 11 | Sample Detection Limit | - |
| 12 | Lower limit of detection | - |
| 13 | Instrument Detection Level | - |
| 14 | Estimated Detection Level | - |
| 15 | Method Detection Level | - |
| 16 | Measurement Uncertainty | 0 |
| 17 | Long Term Method Detection Level | - |
| 18 | Interim Reporting Level | - |
| 19 | Daily detection limit | - |
| 20 | Blank-adjusted method detect limit | - |
| 21 | Contract Detection Limit | - |
| 22 | Upper Quantitation Limit | + |
| 23 | Upper Reporting Limit | + |
| 24 | Upper Calibration Limit | + |
| 25 | Field Holding Time Limit | 0 |
| 26 | Laboratory Holding Time Limit | 0 |
| 27 | Drinking Water Maximum | 0 |
| 28 | Systematic Uncertainty | 0 |
| 29 | Statistical Uncertainty | 0 |
| 30 | Water Quality Standard or Criteria | 0 |
| 31 | Specified in workplan | 0 |
| 32 | Taxonomic Loss Threshold | 0 |
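For analysis in R, the hierarchy can be held as a small lookup table. The sketch below copies the 32 limit types and their +/-/0 directions from the table; the `limit_class()` helper and the mapping of "-", "+", and "0" onto the "Non-Detect", "Over-Detect", and "Other" classes are assumptions made for illustration, not confirmed TADA or WQP behavior.

```r
# Limit type names in hierarchy order (rank 1 first), copied from the table.
limit_types <- c(
  "Practical Quantitation Limit", "Lower Quantitation Limit",
  "Sample-Specific Quantitation Limit", "Estimated Quantitation Limit",
  "Contract Quantitation Limit", "Minimum Reporting Level", "Reporting limit",
  "Sample-specific min detect conc", "Laboratory Reporting Level",
  "Lower Reporting Limit", "Sample Detection Limit", "Lower limit of detection",
  "Instrument Detection Level", "Estimated Detection Level",
  "Method Detection Level", "Measurement Uncertainty",
  "Long Term Method Detection Level", "Interim Reporting Level",
  "Daily detection limit", "Blank-adjusted method detect limit",
  "Contract Detection Limit", "Upper Quantitation Limit",
  "Upper Reporting Limit", "Upper Calibration Limit",
  "Field Holding Time Limit", "Laboratory Holding Time Limit",
  "Drinking Water Maximum", "Systematic Uncertainty", "Statistical Uncertainty",
  "Water Quality Standard or Criteria", "Specified in workplan",
  "Taxonomic Loss Threshold"
)

# Direction per rank: "-" lower limit, "+" upper limit, "0" other.
direction <- rep("-", length(limit_types))
direction[c(16, 25:32)] <- "0"
direction[22:24] <- "+"

hierarchy <- data.frame(
  rank = seq_along(limit_types),
  limit_type = limit_types,
  direction = direction,
  stringsAsFactors = FALSE
)

# Hypothetical helper: translate a limit type into a censoring class.
# The direction-to-class mapping is an assumption; unknown types give NA.
limit_class <- function(x) {
  d <- hierarchy$direction[match(x, hierarchy$limit_type)]
  unname(c("-" = "Non-Detect", "+" = "Over-Detect", "0" = "Other")[d])
}

limit_class(c("Method Detection Level", "Upper Reporting Limit"))
```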

Describe alternatives you've considered

If this is a WQP hierarchy implementation issue, one solution may be to update the hierarchy to consider the combination of the detection condition and limit fields. Alternatively, WQP could serve all limits as JSON within the cell. For TADA, if this is a hierarchy issue, we may want to consider joining all detection limits as part of TADAdataRetrieval and adding logic that tells the function to look for specific combinations if they are available (i.e., implementing the hierarchy for the combination of detection condition and limit fields within TADA, which could then be transferable to WQP after testing).

Alternatively, if this is a user submission error issue, WQX/WQP teams may be able to set requirements/QAQC checks for valid combinations using the logic we have already developed for TADA.

Additional context

WQXDetectionLimitsBestPracticesGuide.docx: https://usepa.sharepoint.com/:w:/r/sites/WaterQualityPortal/_layouts/15/Doc.aspx?sourcedoc=%7B6FA18F0E-8941-4788-AF00-8F83B4D079BB%7D&file=WQXDetectionLimitsBestPracticesGuide.docx&nav=eyJjIjozNjQwOTc0OTJ9&_DSL=1&action=default&mobileredirect=true

cristinamullin commented 1 year ago

@ehinman Any thoughts on this?

ehinman commented 1 year ago

Definitely something worth evaluating in the data. My hunch is that most of the conflicts with left-censored (condition) <--> right-censored (limit type) are user errors, since the right-censored limit types all sit lower in the hierarchy (ranks 22-24) than the left-censored limit types and would be skipped over if a left-censored limit type was also populated. In situations where it's right-censored (condition) <--> left-censored (limit type), we'd need to do some digging into all the limit types provided. If the condition is right-censored and ANY lower limits were provided, I think those would take precedence as the limit type populated in the result phys chem profile.
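One way to start evaluating this in the data would be to count the two conflict directions separately. This is a rough sketch on invented rows; the column names follow the idCensoredData snippet quoted in the issue, and the `conflict_direction` column and its labels are hypothetical.

```r
# Invented rows; "Non-Detect" is treated as left-censored and
# "Over-Detect" as right-censored (an assumption about the TADA classes).
dat <- data.frame(
  TADA.Detection_Type = c("Non-Detect", "Over-Detect", "Non-Detect", "Over-Detect", "Other"),
  TADA.Limit_Type = c("Over-Detect", "Non-Detect", "Non-Detect", "Over-Detect", "Non-Detect"),
  stringsAsFactors = FALSE
)

# Hypothetical label that separates the two directions; other rows stay NA.
dat$conflict_direction <- ifelse(
  dat$TADA.Detection_Type == "Non-Detect" & dat$TADA.Limit_Type == "Over-Detect",
  "left condition / right limit",
  ifelse(
    dat$TADA.Detection_Type == "Over-Detect" & dat$TADA.Limit_Type == "Non-Detect",
    "right condition / left limit",
    NA_character_
  )
)

# Counts per direction; the second group is the one that would need
# a closer look at every limit type submitted for the result.
print(table(dat$conflict_direction))
```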

We could use TADA to select the most parsimonious condition-limit type combination if multiple limit types are provided. I think in this case, it might be wise to commit to trusting the condition over the limit types provided, and only drilling down so far as to whether they represent left- or right-censored data. A tricky example: if the detection condition is "Not Detected" but the limit types provided are "Reporting limit" and "Practical Quantitation Limit", which do you choose? The condition (left-censored) matches both limit types (which describe left-censored limits). Perhaps then we could rely on the hierarchy table posted above to choose the "preferred" limit type?
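A rough sketch of that idea: trust the condition, keep only the limit types whose direction agrees with it, then break ties with the hierarchy rank. `pick_limit()` is a hypothetical helper, not an existing TADA function, and it assumes the condition has already been classified as "Non-Detect", "Over-Detect", or "Other" (for example, "Not Detected" as "Non-Detect").

```r
# Limit types in hierarchy order (rank 1 first), copied from the issue's table.
limit_types <- c(
  "Practical Quantitation Limit", "Lower Quantitation Limit",
  "Sample-Specific Quantitation Limit", "Estimated Quantitation Limit",
  "Contract Quantitation Limit", "Minimum Reporting Level", "Reporting limit",
  "Sample-specific min detect conc", "Laboratory Reporting Level",
  "Lower Reporting Limit", "Sample Detection Limit", "Lower limit of detection",
  "Instrument Detection Level", "Estimated Detection Level",
  "Method Detection Level", "Measurement Uncertainty",
  "Long Term Method Detection Level", "Interim Reporting Level",
  "Daily detection limit", "Blank-adjusted method detect limit",
  "Contract Detection Limit", "Upper Quantitation Limit",
  "Upper Reporting Limit", "Upper Calibration Limit",
  "Field Holding Time Limit", "Laboratory Holding Time Limit",
  "Drinking Water Maximum", "Systematic Uncertainty", "Statistical Uncertainty",
  "Water Quality Standard or Criteria", "Specified in workplan",
  "Taxonomic Loss Threshold"
)

hierarchy <- data.frame(
  rank = seq_along(limit_types),
  limit_type = limit_types,
  stringsAsFactors = FALSE
)

# Assumed mapping: upper limits (22-24) are right-censored, ranks 16 and
# 25-32 are "Other", and every remaining rank is a left-censored lower limit.
hierarchy$class <- ifelse(
  hierarchy$rank %in% 22:24, "Over-Detect",
  ifelse(hierarchy$rank %in% c(16, 25:32), "Other", "Non-Detect")
)

# Hypothetical helper: among the limit types reported for one result, keep
# those that agree with the condition and return the best-ranked one.
pick_limit <- function(condition_class, reported) {
  h <- hierarchy[hierarchy$limit_type %in% reported &
                   hierarchy$class == condition_class, ]
  if (nrow(h) == 0) {
    return(NA_character_)  # no reported limit agrees with the condition
  }
  h$limit_type[which.min(h$rank)]
}

# The tricky example: both limits are left-censored, so rank decides.
pick_limit("Non-Detect", c("Reporting limit", "Practical Quantitation Limit"))
```

For the tricky example this returns "Practical Quantitation Limit" (rank 1 beats rank 7). A right-censored condition paired only with lower limits returns NA, which marks the cases that would still need manual review.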