osparcomm / HARSAT

Harmonized Regional Seas Assessment Tool
https://osparcomm.github.io/HARSAT/

update treatment of auxiliary variables #479

Closed RobFryer closed 2 months ago

RobFryer commented 2 months ago

#478

The checks for valid sex codes for biota should now work for any auxiliary variable. Most variables are just checked against valid ICES codes; imposex and EROD have more specific checks as before. It will be a bigger job to ensure the other biota checks will work for any auxiliary variable. For now I have ensured they will work for WTMEA and AGMEA.

#81

A related issue is that 'new' auxiliary variables aren't necessarily merged with the data correctly. Some auxiliary variables are merged by sample and matrix, whereas others are just merged by sample. This was hard-wired and has now been made more flexible by introducing a new control variable auxiliary. This is a list with (currently) just one element, by_matrix, which takes the default values c("DRYWT%", "LIPIDWT%") for biota and "all" for sediment and water. The values can be modified with the control argument of read_data. Thus, by default, dry weight and lipid weight measurements are matched with chemical concentrations in the same tissue (matrix), but all other auxiliary variables in biota are matched at the sample level. For sediment (and water), all auxiliary variables (e.g. aluminium and organic carbon measurements) are matched with chemical concentrations in the same grain size fraction.
There are still other elements that are hard-wired in merge_auxiliary and these will be dealt with in a later pull request.
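
The matching rule described above can be sketched in base R. This is only an illustration of the idea, not the code in merge_auxiliary: the toy data frames, the column names sample, matrix, determinand and value, the matrix codes and the helper merge_aux_sketch are all hypothetical. Only the by_matrix defaults for biota are taken from the description above.

```r
# Hypothetical toy data: two concentrations from one sample, in different tissues
conc <- data.frame(
  sample = c("S1", "S1"),
  matrix = c("LI", "MU"),
  determinand = c("CD", "HG"),
  value = c(0.8, 0.1),
  stringsAsFactors = FALSE
)

# Hypothetical auxiliary measurements for the same sample
aux <- data.frame(
  sample = c("S1", "S1", "S1"),
  matrix = c("LI", "MU", "WO"),
  determinand = c("DRYWT%", "DRYWT%", "AGMEA"),
  value = c(30, 22, 4),
  stringsAsFactors = FALSE
)

# Default for biota as described in this pull request
by_matrix <- c("DRYWT%", "LIPIDWT%")

# Hypothetical helper: adds one column per auxiliary variable to conc,
# matching on sample and matrix, or on sample alone, depending on by_matrix
merge_aux_sketch <- function(conc, aux, by_matrix) {
  out <- conc
  for (v in unique(aux$determinand)) {
    a <- aux[aux$determinand == v, , drop = FALSE]
    use_matrix <- identical(by_matrix, "all") || v %in% by_matrix
    if (use_matrix) {
      key_conc <- paste(conc$sample, conc$matrix, sep = "|")
      key_aux <- paste(a$sample, a$matrix, sep = "|")
    } else {
      key_conc <- conc$sample
      key_aux <- a$sample
    }
    # match() returns NA where no auxiliary record shares the key
    out[[v]] <- a$value[match(key_conc, key_aux)]
  }
  out
}

merged <- merge_aux_sketch(conc, aux, by_matrix)
print(merged)
```

With the biota default, each concentration picks up the dry weight from its own tissue, while AGMEA is attached to every record in the sample. Passing "all" instead would require AGMEA to share the matrix as well, so in this toy example it would be left missing.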

RobFryer commented 2 months ago

@annelaerkes Please could you review and, if happy, merge.
Please could you rerun the updated code on your data and see if it now works. But first you will need to convert AGE to AGMEA. If it doesn't work, then please send me the data, reference tables, top level script, etc. and I'll have a look. Thanks!

annelaerkes commented 2 months ago

@RobFryer I ran the updated package and get the following error when running create_timeseries():

Error in data.frame(conc, from, to, drywt, drywt_censoring, lipidwt, lipidwt_censoring, :
  arguments imply differing number of rows: 8599, 1, 0

I will send you scripts and data
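
For context, base R raises this message when data.frame() is given columns whose lengths cannot be recycled to a common number of rows. The sketch below reproduces the same class of error with made-up vectors. The names and values are hypothetical and this is not the failing HARSAT call. It suggests, but does not confirm, that one of the arguments in the real call had length zero.

```r
# Hypothetical inputs: one full-length column, one scalar, one empty vector
conc <- c(1.2, 3.4, 5.6)   # one value per row
from <- "2020"             # length 1, which data.frame() can recycle
drywt <- numeric(0)        # length 0, which cannot be recycled

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    data.frame(conc, from, drywt)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# A simple diagnostic: list the arguments that are empty
lens <- lengths(list(conc = conc, from = from, drywt = drywt))
bad <- names(lens)[lens == 0]
print(bad)
```

Checking the lengths of the individual arguments in this way is one option for finding which variable is empty before the data frame is built.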