iobis / env-data

ENV-DATA related issues and documentation

OBIS-ENV-DATA reviewers #5

Open albenson-usgs opened 7 years ago

albenson-usgs commented 7 years ago

As nodes move towards using OBIS-ENV-DATA as their new standard, will there be people who can review the outputs to make sure they conform to the standard correctly?

albenson-usgs commented 7 years ago

For instance, I just finished converting a dataset that is already in OBIS to Event Core with eMoF. Is there someone that can look at it for me and make sure it looks ok?

https://www1.usgs.gov/obis-usa/ipt/resource?r=noaa_coralreefmonitoring_lpipercentcover

Daphnisd commented 7 years ago

I looked at the dataset in some detail, but I’m not sure I fully understand it (the methodology section is a bit vague). samplingEffort says 25 x 4 m transect for 15 min, preceded by some sort of code that I don’t understand. georeferenceProtocol says there is a 50 x 50 m sampling frame. So I conclude that a transect was periodically conducted in each sampling frame to estimate the macroalgae cover, and that each event in your dataset represents a full transect.

Sampling effort and protocol would go in the eMoF. It will take some thinking to define all the characteristics necessary to document the methodology.

I think you can have parameters like:

You have some non-species names in the dataset recorded as occurrences (Bare Substrate, MacroOtherCalcareous, MacroOtherFleshy, Other, Turf Algae Free of Sediment, Turf Algae with Sediment). You want to specify both whether they were present in the transect and their percentage cover. I think we need a discussion on how to deal with those. Possibly we could create different parameters like “percentage cover of Bare Substrate” in the eMoF, linked to the event. Things like “Turf Algae Free of Sediment” can possibly stay as an occurrence; it’s biotic information and the record can be linked to WoRMS (to Biota). In that case we’ll still need a discussion on what to put as the scientific name.
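To make the idea concrete, here is a rough sketch of how the split could look. It is only an illustration: the input field names `category` and `percentCover` and the sample rows are made up, and treating only “Bare Substrate” as abiotic is an assumption, since the other categories still need discussion.

```python
# Assumption: only "Bare Substrate" is treated as abiotic here; how to handle
# the other non-species categories is still open.
ABIOTIC_CATEGORIES = {"Bare Substrate"}


def split_cover_records(records):
    """Split raw cover records into occurrences and event-level eMoF rows."""
    occurrences, event_emof = [], []
    for rec in records:
        if rec["category"] in ABIOTIC_CATEGORIES:
            # Abiotic cover becomes a measurement linked to the event only,
            # so occurrenceID is left empty.
            event_emof.append({
                "eventID": rec["eventID"],
                "occurrenceID": "",
                "measurementType": f"percentage cover of {rec['category']}",
                "measurementValue": rec["percentCover"],
            })
        else:
            # Biotic categories stay as occurrence records.
            occurrences.append(rec)
    return occurrences, event_emof


# Hypothetical sample rows, not taken from the actual dataset.
records = [
    {"eventID": "transect-1", "category": "Bare Substrate", "percentCover": 12.5},
    {"eventID": "transect-1", "category": "Turf Algae Free of Sediment", "percentCover": 30.0},
]
occurrences, event_emof = split_cover_records(records)
```

With this input, the Bare Substrate row ends up as an event-level measurement and the turf algae row stays in the occurrence list.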

About the eMoF, I wonder if there is any point in adding 0 % values. If you have occurrenceStatus = absent, the coverage value in the eMoF should be 0. If this is not the case, something is wrong. Omitting these 0’s might help keep your file size low. You will also need to assign measurementTypeIDs and measurementUnitIDs. For percentage cover I would suggest http://vocab.nerc.ac.uk/collection/P01/current/SDBIOL10/ and http://vocab.nerc.ac.uk/collection/P06/current/UPCT/. I don’t think Percent/100m^2 is a valid unit, so I don’t think we’ll manage to add it to a controlled vocabulary. The /100m^2 part can be stored separately in the “Area sampled of the bed” parameter.
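The absent/0 rule is easy to check automatically. Below is a minimal sketch, assuming the occurrence and eMoF tables have been read into lists of dictionaries keyed by Darwin Core term names; the function names and the sample rows are hypothetical, and only the two vocabulary URIs come from the suggestion above.

```python
PERCENT_COVER_TYPE_ID = "http://vocab.nerc.ac.uk/collection/P01/current/SDBIOL10/"
PERCENT_UNIT_ID = "http://vocab.nerc.ac.uk/collection/P06/current/UPCT/"


def find_inconsistent_absences(occurrences, emof):
    """Return occurrenceIDs that are marked absent but have a non-zero cover."""
    absent_ids = {
        occ["occurrenceID"]
        for occ in occurrences
        if occ["occurrenceStatus"].lower() == "absent"
    }
    return [
        row["occurrenceID"]
        for row in emof
        if row["occurrenceID"] in absent_ids
        and row["measurementTypeID"] == PERCENT_COVER_TYPE_ID
        and float(row["measurementValue"]) != 0.0
    ]


def drop_zero_cover(emof):
    """Drop percent-cover rows whose value is 0 (optional, to reduce file size)."""
    return [
        row for row in emof
        if not (row["measurementTypeID"] == PERCENT_COVER_TYPE_ID
                and float(row["measurementValue"]) == 0.0)
    ]


def cover_row(occurrence_id, value):
    """Build one percent-cover eMoF row with the suggested type and unit IDs."""
    return {
        "occurrenceID": occurrence_id,
        "measurementTypeID": PERCENT_COVER_TYPE_ID,
        "measurementValue": value,
        "measurementUnitID": PERCENT_UNIT_ID,
    }


# Hypothetical sample rows, not taken from the actual dataset.
occurrences = [
    {"occurrenceID": "occ-1", "occurrenceStatus": "present"},
    {"occurrenceID": "occ-2", "occurrenceStatus": "absent"},
    {"occurrenceID": "occ-3", "occurrenceStatus": "absent"},
]
emof = [cover_row("occ-1", "40"), cover_row("occ-2", "5"), cover_row("occ-3", "0")]

inconsistent = find_inconsistent_absences(occurrences, emof)
trimmed = drop_zero_cover(emof)
```

Here `occ-2` would be flagged because it is absent but has a cover of 5, and `drop_zero_cover` removes only the `occ-3` row.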

You’ll also need a P01 for the parameter rugosity. As it is a value but doesn’t have a unit, you could use the P06 term http://vocab.nerc.ac.uk/collection/P06/current/UUUU/ as its measurementUnitID.

albenson-usgs commented 7 years ago

Thank you for reviewing this for me Daphnis. I appreciate it.

I agree the information preceding the transect information in samplingEffort isn't very helpful. I do think it's important to combine what's in that field with what's in samplingProtocol, which should help give a clearer picture of what the methodology was. This would be further enhanced if we could link to a protocol database. It's something we are working on at USGS, but it's not easy. This is what we have so far for that particular method: https://www.monitoringresources.org/Document/Method/Details/5495

I haven't used the NERC vocabulary server because I don't understand how to work with it. Is there training or documentation somewhere?

I agree that it's slightly redundant to have occurrenceStatus = absent and eMoF = 0 but I think it's worth keeping for now since including absence data is not yet a routine occurrence.