USEPA / EPATADA

This R package can be used to compile and evaluate Water Quality Portal (WQP) data for samples collected from surface water monitoring sites on streams and lakes. It can be used to create applications that support water quality programs and help states, tribes, and other stakeholders efficiently analyze the data.
https://usepa.github.io/EPATADA/
Creative Commons Zero v1.0 Universal

Unit conversions - mass/mass vs mass/volume #505

Open hillarymarler opened 1 month ago

hillarymarler commented 1 month ago

Found in closed issue #33 (Review HarmonizeData Functionality):

There are also a few more conversions that could happen at the units level. A common dimensionality conversion is mass/mass vs. mass/volume; we can handle it using constants for fresh water (not adjusted for salinity). We may want to do this conversion only for certain waterbody types (freshwater), or we can accept a small level of imprecision and apply it to all waterbody types. It is generally not going to make a big difference.

For example, convert all UG/KG to UG/L (1 to 1) for media type water.
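A minimal base R sketch of the 1-to-1 idea, assuming a fresh water density of 1 kg/L. The data frame and its column names (`media`, `value`, `unit`) are hypothetical and are not the TADA schema:

```r
# Assumption: fresh water density of 1 kg/L, not adjusted for salinity or temperature.
water_density_kg_per_l <- 1.0

# Hypothetical example results; column names are illustrative only.
results <- data.frame(
  media = c("Water", "Water", "Sediment"),
  value = c(12.5, 3.0, 40.0),
  unit  = c("UG/KG", "UG/L", "UG/KG"),
  stringsAsFactors = FALSE
)

# Convert mass/mass to mass/volume only where the media type is water.
to_convert <- results$media == "Water" & results$unit == "UG/KG"
results$value[to_convert] <- results$value[to_convert] * water_density_kg_per_l
results$unit[to_convert] <- "UG/L"

print(results)
```

The sediment row is left in UG/KG because the density assumption only makes sense for water samples.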

Originally posted by @cristinamullin in https://github.com/USEPA/EPATADA/issues/33#issuecomment-1097207934

wokenny13 commented 1 month ago

Are there other examples of conversions that should be considered?

Is the conversion for salinity adjustment something that would be of interest to convert?

hillarymarler commented 1 month ago

I don't have any experience with salinity adjustment - might be a good topic for next team meeting - whether it should be incorporated in this function or not.

I know I have seen some other examples (mass/mass to mass/volume) when I was testing TADA_CreateUnitRef but don't recall the units. Running TADA_CreateUnitRef on a few random data sets should help generate a list. I can do that or you can; what is your preference?

wokenny13 commented 1 month ago

I can work on running some random data sets. I think it would be good practice for me to run RandomTestingData and familiarize myself with TADA_CreateUnitRef and the harmonize data functionality.

hillarymarler commented 1 month ago

Sounds like a good plan! I think the Shepherdstown example data may also contain some of the mass/mass units.

wokenny13 commented 1 month ago

Are we looking to convert all mediaType = Freshwater, where ResultMeasure.measureUnitCode is expressed as ug/kg, to UG/L in the TADA.Target.ResultMeasure.MeasureUnitCode?

Are we looking to convert any other units? If a ResultMeasure.measureUnitCode is any mass/mass such as mg/kg, ng/g etc, is it supposed to get converted to a UG/KG unit in all cases? Or is this not always the case and sometimes a ResultMeasure.measureUnitCode unit isn't being referenced and can't be converted? Is the goal to find cases when a mass/mass isn't converted to UG/KG but needs to be converted to a mass/volume? And then convert those to UG/KG, and from there, convert those UG/KG units for freshwater to UG/L?

wokenny13 commented 1 month ago

Is there any sort of indicator or ref table on which waterbody types would tie to freshwater vs saltwater?

hillarymarler commented 1 month ago

> Are we looking to convert all mediaType = Freshwater, where ResultMeasure.measureUnitCode is expressed as ug/kg, to UG/L in the TADA.Target.ResultMeasure.MeasureUnitCode?

> Are we looking to convert any other units? If a ResultMeasure.measureUnitCode is any mass/mass such as mg/kg, ng/g etc, is it supposed to get converted to a UG/KG unit in all cases? Or is this not always the case and sometimes a ResultMeasure.measureUnitCode unit isn't being referenced and can't be converted? Is the goal to find cases when a mass/mass isn't converted to UG/KG but needs to be converted to a mass/volume? And then convert those to UG/KG, and from there, convert those UG/KG units for freshwater to UG/L?

The goal would be to convert results to the target unit for each parameter (where possible). So if the user-specified unit is MG/L, but the unit in the TADA df is UG/KG, the result should be converted to MG/L and the TADA.ResultMeasure.MeasureUnitCode updated accordingly. Right now, the conversion factors between mass/mass and mass/volume aren't included in the ref files where the conversion factors and coefficients come from. (Or maybe some are, but not all. I haven't looked at it in depth.)
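As a rough illustration of converting to a target unit through a reference table, here is a base R sketch. The `unit_ref` table, its column names, and its factors are assumptions for illustration (the mass/mass rows assume 1 kg of water per litre); they are not the contents of the actual TADA ref files:

```r
# Hypothetical reference table: one row per source unit, with the factor
# needed to reach the target unit. Mass/mass rows assume 1 kg/L.
unit_ref <- data.frame(
  from_unit   = c("UG/KG", "MG/KG", "UG/L"),
  target_unit = c("MG/L", "MG/L", "MG/L"),
  factor      = c(0.001, 1, 0.001),
  stringsAsFactors = FALSE
)

# Hypothetical results; "NTU" has no row in unit_ref and must stay unchanged.
results <- data.frame(
  value = c(250, 2, 500, 7),
  unit  = c("UG/KG", "MG/KG", "UG/L", "NTU"),
  stringsAsFactors = FALSE
)

# Look up each result's unit; NA means no conversion is available.
idx <- match(results$unit, unit_ref$from_unit)
matched <- !is.na(idx)

# Start from the original values, then overwrite the rows that can be converted.
results$target_value <- results$value
results$target_unit <- results$unit
results$target_value[matched] <- results$value[matched] * unit_ref$factor[idx[matched]]
results$target_unit[matched] <- unit_ref$target_unit[idx[matched]]

print(results)
```

Rows without a matching reference entry keep their original value and unit, which mirrors the "where possible" caveat.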

> Is there any sort of indicator or ref table on which waterbody types would tie to freshwater vs saltwater?

I don't know of one, but this is a good question for someone on the WQX team. Maybe @cefergus knows?

I know from previous WQX conversations that not all orgs are consistent in selecting values for waterbody types (they may code all locations the same way), so we might want to think about whether users should have an option to select whether results are adjusted for salinity. This would also require a pairing function to find salinity results that match the result requiring conversion. I've got a draft of a pairing function started (using hardness/temp/pH) in the module 3 branch. So maybe it makes sense to create a first draft that does not incorporate salinity and then, once the initial conversion steps are worked out, figure out how to incorporate the paired results/salinity piece.
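One way the pairing step could look, sketched in base R. This is not the draft pairing function from the module 3 branch; matching on site and date, and all column names, are assumptions made for illustration:

```r
# Hypothetical results that would need a mass/mass to mass/volume conversion.
results <- data.frame(
  site  = c("A", "A", "B"),
  date  = as.Date(c("2024-05-01", "2024-05-02", "2024-05-01")),
  value = c(4, 6, 9),
  unit  = "UG/KG",
  stringsAsFactors = FALSE
)

# Hypothetical salinity measurements from the same monitoring locations.
salinity <- data.frame(
  site         = c("A", "B"),
  date         = as.Date(c("2024-05-01", "2024-05-01")),
  salinity_ppt = c(0.2, 31.5),
  stringsAsFactors = FALSE
)

# Left join: keep every result and attach salinity where site and date match.
paired <- merge(results, salinity, by = c("site", "date"), all.x = TRUE)

# Flag results with no paired salinity; these could fall back to the
# unadjusted fresh water assumption.
paired$has_salinity <- !is.na(paired$salinity_ppt)

print(paired)
```

A real pairing function would likely need a tolerance window on time and depth rather than an exact date match.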

wokenny13 commented 1 month ago

What are your thoughts on where the conversion factors between mass/mass and mass/volume should take place?

Are we looking to include additional rows in the WQXunitRef for this? Would this result in a one to many match?

Or are we looking to write an if statement for all cases, e.g. if (UG/KG, or mass/mass in general, for freshwater) then convert to UG/L?

Would we want to write some code to handle any mass/mass conversion, not found in WQXunitRef, to UG/KG then convert to UG/L when appropriate? So for example if there is a parameter that gets expressed by an organization as UG/MG, which doesn't seem to have a conversion in the WQXunitRef table, be able to handle the conversion of that UG/MG, to UG/KG, then to UG/L if appropriate?

hillarymarler commented 1 month ago

> What are your thoughts on where the conversion factors between mass/mass and mass/volume should take place?

In TADA_ConvertResultUnits

> Are we looking to include additional rows in the WQXunitRef for this? Would this result in a one to many match?

We have additional rows for TADA-specific unit conversions in EPATADA\inst\extdata\TADAPriorityCharConvertRef.csv. So additional unit conversions can be added.

> Or are we looking to write an if statement for all cases, e.g. if (UG/KG, or mass/mass in general, for freshwater) then convert to UG/L?

I am not sure. We can talk this over with the whole team.

> Would we want to write some code to handle any mass/mass conversion, not found in WQXunitRef, to UG/KG then convert to UG/L when appropriate? So for example if there is a parameter that gets expressed by an organization as UG/MG, which doesn't seem to have a conversion in the WQXunitRef table, be able to handle the conversion of that UG/MG, to UG/KG, then to UG/L if appropriate?

We could add those rows to EPATADA\inst\extdata\TADAPriorityCharConvertRef.csv. Would that work?
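To make the chained conversion concrete (mass/mass unit to UG/KG, then UG/KG to UG/L), here is a base R sketch. The lookup vector and the function `convert_mass_mass_to_ug_l` are hypothetical; the factors follow from unit definitions (1 UG/MG = 1e6 UG/KG, 1 NG/G = 1 UG/KG), and the last step assumes fresh water at 1 kg/L:

```r
# Factors that take a mass/mass unit to UG/KG (from unit definitions).
mass_mass_to_ug_kg <- c(
  "UG/KG" = 1,
  "MG/KG" = 1e3,
  "NG/G"  = 1,
  "UG/MG" = 1e6
)

# Hypothetical helper: mass/mass -> UG/KG -> UG/L.
# Assumption: density defaults to 1 kg/L (fresh water, no salinity adjustment).
convert_mass_mass_to_ug_l <- function(value, unit, density_kg_per_l = 1) {
  factor <- mass_mass_to_ug_kg[unit]
  if (anyNA(factor)) {
    stop("No mass/mass factor for unit(s): ",
         paste(unique(unit[is.na(factor)]), collapse = ", "))
  }
  unname(value * factor * density_kg_per_l)
}

print(convert_mass_mass_to_ug_l(2, "UG/MG"))
print(convert_mass_mass_to_ug_l(c(1, 5), c("MG/KG", "NG/G")))
```

The same factors could instead be stored as extra rows in a reference CSV, which keeps the conversion logic in one lookup rather than in code.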

cefergus commented 4 weeks ago

I'm late to the game but catching up.

> Is there any sort of indicator or ref table on which waterbody types would tie to freshwater vs saltwater?

Hmm, the only WQX domain table I'm aware of that might have that information is MonitoringLocationType. But I think Hillary is correct that organizations may not be consistent in entering that information. We could double check with Adam though. It also looks like there are many categories (e.g., Canal Drainage, Estuary, Lake, Intertidal), and we would need to decide which ones belong in "freshwater" vs "marine" or whatever the classes would be. https://www.epa.gov/waterdata/storage-and-retrieval-and-water-quality-exchange-domain-services-and-downloads#domain
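If the team did settle on classes, the mapping could be a simple lookup. The base R sketch below is purely illustrative: the class assigned to each category is a placeholder rather than a decision, and the category strings are taken from the comment above and may not match the WQX domain values exactly:

```r
# Placeholder classification; every assignment here is an assumption to be
# reviewed, not an agreed mapping.
water_class_lookup <- c(
  "Canal Drainage" = "freshwater",
  "Lake"           = "freshwater",
  "Estuary"        = "marine",
  "Intertidal"     = "marine"
)

# Hypothetical helper: returns "unclassified" for any type not in the lookup,
# so unexpected or inconsistently coded locations are easy to spot.
classify_location_type <- function(location_type) {
  cls <- unname(water_class_lookup[location_type])
  cls[is.na(cls)] <- "unclassified"
  cls
}

print(classify_location_type(c("Lake", "Estuary", "Spring")))
```

Keeping an explicit "unclassified" bucket avoids silently applying a fresh water constant to locations that were never reviewed.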

wokenny13 commented 4 weeks ago

Discussed in 8/16 TADA team meeting: should we give the user a TRUE/FALSE argument that controls whether mass/mass units are converted to mass/volume?

Include this in TADA_ConvertResultUnits?

hillarymarler commented 3 weeks ago

Maybe that param should be included in both TADA_CreateUnitRef and TADA_ConvertResultUnits? Because if the user does not supply their own ref, then TADA_CreateUnitRef is used in TADA_ConvertResultUnits to create the ref.
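A base R sketch of how one flag could be passed through both steps. The functions `create_unit_ref_sketch` and `convert_result_units_sketch` and the argument `convert_mass_mass` are hypothetical stand-ins; they do not reflect the real signatures of TADA_CreateUnitRef or TADA_ConvertResultUnits:

```r
# Hypothetical stand-in for building the reference table.
# Mass/mass rows (assuming 1 kg/L) are added only when the flag is TRUE.
create_unit_ref_sketch <- function(convert_mass_mass = FALSE) {
  ref <- data.frame(
    from_unit   = c("UG/L", "MG/L"),
    target_unit = c("MG/L", "MG/L"),
    factor      = c(0.001, 1),
    stringsAsFactors = FALSE
  )
  if (isTRUE(convert_mass_mass)) {
    extra <- data.frame(
      from_unit = "UG/KG", target_unit = "MG/L", factor = 0.001,
      stringsAsFactors = FALSE
    )
    ref <- rbind(ref, extra)
  }
  ref
}

# Hypothetical stand-in for the conversion step. When no ref is supplied,
# the flag is forwarded to the ref builder.
convert_result_units_sketch <- function(df, ref = NULL, convert_mass_mass = FALSE) {
  if (is.null(ref)) {
    ref <- create_unit_ref_sketch(convert_mass_mass = convert_mass_mass)
  }
  idx <- match(df$unit, ref$from_unit)
  matched <- !is.na(idx)
  df$value[matched] <- df$value[matched] * ref$factor[idx[matched]]
  df$unit[matched] <- ref$target_unit[idx[matched]]
  df
}

df <- data.frame(value = c(100, 100), unit = c("UG/KG", "UG/L"),
                 stringsAsFactors = FALSE)
flag_off <- convert_result_units_sketch(df)
flag_on <- convert_result_units_sketch(df, convert_mass_mass = TRUE)
print(flag_off)
print(flag_on)
```

With this layout, a user-supplied ref takes precedence and the flag only matters when the ref is generated internally.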