iobis / obis-issues

Repository for all OBIS related issues and feature requests

emof difficult to work with for dataset visualizations #169

Open albenson-usgs opened 4 years ago

albenson-usgs commented 4 years ago

Luc Mehl at Axiom has shared some pain points with me about trying to visualize data structured in the OBIS-ENV-DATA format, specifically issues with the emof table. For the MBON Portal, Axiom was asked to build a visualization for the Puerto Rico Long-Term Coral Reef Monitoring Program Database Compilation. For their purposes they need everything in a flat file.

From Luc: I think the horizontal vs. vertical issue is just about accessing the data. Vertical is fine for a database: write some SQL, get what you want, churn. But browsers and JavaScript need flat relationships, i.e. horizontal. The logic is more complicated in a vertical structure: (1) Is there a weight measurement type? (2) Loop through every row to see if it is the weight type. (3) Extract the value if it is. (4) Use the occurrenceID to associate the value with the occurrence. (5) Use the eventID to associate the occurrence with an event. (6) Show on map.

The MoF table allows the provider to enter any measurement type and any measurement value. This generic solution provides a great container, but it isn't easy to automatically read data from that container. What would work for direct machine ingest, visualization, and analysis would be a column for every possible measurement type (a horizontal structure with individualCount, individualWeight, etc.), but that defeats the generic convenience of MoF's vertical structure.
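To make the vertical vs. horizontal distinction concrete, here is a minimal sketch of pivoting MoF rows into one flat record per occurrence in a single pass. The rows and their values are made up for illustration, and `pivot_emof` is a hypothetical helper, not part of any OBIS tool; only the column names (occurrenceID, measurementType, measurementValue) come from the MoF structure discussed above.

```python
from collections import defaultdict

# Hypothetical eMoF rows in vertical (long) form; values are invented for illustration.
emof_rows = [
    {"occurrenceID": "occ-1", "measurementType": "individualCount", "measurementValue": "3"},
    {"occurrenceID": "occ-1", "measurementType": "individualWeight", "measurementValue": "12.5"},
    {"occurrenceID": "occ-2", "measurementType": "individualCount", "measurementValue": "1"},
]


def pivot_emof(rows):
    """Turn one row per measurement into one dict per occurrence (horizontal form)."""
    wide = defaultdict(dict)
    for row in rows:
        # Each measurementType becomes a key on the occurrence's flat record.
        wide[row["occurrenceID"]][row["measurementType"]] = row["measurementValue"]
    return dict(wide)


flat = pivot_emof(emof_rows)
```

Because the pivot happens once, the front end can read `flat["occ-1"]["individualWeight"]` directly instead of scanning every row per lookup. Occurrences without a given measurement simply lack that key.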

pieterprovoost commented 4 years ago

Thanks Abby. I'm not sure what to make of this. What does their data flow look like? Are they getting the data from IPT, from our API, or through R? Unless they are fetching data from the API directly into their frontend (which is probably a bad idea), there is going to be some data transformation anyway. So why not just convert it to the form they need at that stage? Event core with MoF is structured like a relational database. This is just an efficient way to store data, and it should readily transform into a flat table with a few joins. There's certainly no need to do costly lookups on the fly like those outlined above.
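As a rough illustration of the "few joins" idea, the sketch below loads tiny event, occurrence, and emof tables into an in-memory SQLite database and flattens them with two joins plus conditional aggregation. The table layout and sample values are assumptions made for the example; in particular, it assumes every MoF row links to an occurrence, whereas real datasets may also attach measurements to events.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Hypothetical miniature dataset; identifiers and values are invented for illustration.
con.executescript("""
CREATE TABLE event (eventID TEXT PRIMARY KEY, decimalLatitude REAL, decimalLongitude REAL);
CREATE TABLE occurrence (occurrenceID TEXT PRIMARY KEY, eventID TEXT, scientificName TEXT);
CREATE TABLE emof (occurrenceID TEXT, measurementType TEXT, measurementValue TEXT);
INSERT INTO event VALUES ('ev-1', 18.0, -67.0);
INSERT INTO occurrence VALUES ('occ-1', 'ev-1', 'Acropora palmata');
INSERT INTO emof VALUES ('occ-1', 'individualCount', '3');
INSERT INTO emof VALUES ('occ-1', 'individualWeight', '12.5');
""")

# Join occurrence -> event for coordinates, then pivot MoF rows into columns.
flat_rows = con.execute("""
SELECT o.occurrenceID,
       o.scientificName,
       e.decimalLatitude,
       e.decimalLongitude,
       MAX(CASE WHEN m.measurementType = 'individualCount'
                THEN m.measurementValue END) AS individualCount,
       MAX(CASE WHEN m.measurementType = 'individualWeight'
                THEN m.measurementValue END) AS individualWeight
FROM occurrence o
JOIN event e ON e.eventID = o.eventID
LEFT JOIN emof m ON m.occurrenceID = o.occurrenceID
GROUP BY o.occurrenceID, o.scientificName, e.decimalLatitude, e.decimalLongitude
""").fetchall()
```

The measurement types still have to be listed in the query, so this works best when the visualization targets a known set of measurements for one dataset.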

I'm happy to help you out, but I'm not sure what I can do. If there are specific API endpoints we can provide to aid the visualization, then please let me know. Note that the occurrence API already provides a format which is rather flat: the occurrence records and all higher-level events are already flattened, and only the MoF records are nested. See this call for example. It can be made into a web map with just a few lines of JavaScript, with no joining or matching required. Of course, there are some limitations to the number of records that can be fetched in one go, as with any API.
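For a sense of how little work is left once the records are mostly flat, here is a sketch that turns one such record into a GeoJSON point feature. The record is hand-written, and its field names (`mof`, `measurementType`, `measurementValue`, `decimalLatitude`, `decimalLongitude`) are assumptions about the response shape rather than a copy of real API output; `to_feature` is a hypothetical helper.

```python
import json

# Hand-written record with nested MoF entries; field names are assumed, values invented.
record = {
    "id": "occ-1",
    "scientificName": "Acropora palmata",
    "decimalLatitude": 18.0,
    "decimalLongitude": -67.0,
    "mof": [
        {"measurementType": "individualCount", "measurementValue": "3"},
    ],
}


def to_feature(rec):
    """Lift nested MoF entries into flat properties on a GeoJSON point feature."""
    skip = ("mof", "decimalLatitude", "decimalLongitude")
    props = {key: value for key, value in rec.items() if key not in skip}
    for m in rec.get("mof", []):
        props[m["measurementType"]] = m["measurementValue"]
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {
            "type": "Point",
            "coordinates": [rec["decimalLongitude"], rec["decimalLatitude"]],
        },
        "properties": props,
    }


feature = to_feature(record)
geojson = json.dumps({"type": "FeatureCollection", "features": [feature]})
```

The resulting string can be handed to most web mapping libraries as a GeoJSON layer. The same lifting step is only a few lines in JavaScript as well.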