fiboa / extensions

A list of extensions for the specification and a guide to create new extensions
Apache License 2.0

Extensions & patterns for Time Series data #19

Open cholmes opened 1 month ago

cholmes commented 1 month ago

There are a number of potential extensions for time series data, for example:

Each may be worth its own extension, so we can break them out as we get closer. But one 'meta' question is how we handle / model these in fiboa. Our default way is to just hang things directly off the geometry, i.e. in the same GeoParquet file. But many of these would result in tons of columns, and it's not clear you'd always need the geometry. So it could be good to get a sample dataset of a big time series and figure out how to use fiboa but make it more of a 'reference'. Hopefully the flexibility of not differentiating between collection and feature level attributes introduced in https://github.com/fiboa/specification/pull/39 will help, but it seems like we'd ideally have a way to validate Parquet files that just have a reference to a geometry instead of including it directly.

m-mohr commented 1 month ago

In "old" database days you'd normalize into two tables, e.g.

geo.parquet: id, geometry, area, perimeter
time.parquet: id, geo_id, datetime, value1, value2, ...

This wouldn't be a single file, but on the other hand the approach has proven to work well in the database world. So I'm wondering whether we should split the files. Is this something tooling can handle easily?
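To make the two-table idea concrete, here is a minimal sketch in plain Python. The lists of dicts stand in for the rows of geo.parquet and time.parquet; the column names come from the layout above, while the sample values and the helper name `join_on_geo_id` are made up for illustration and are not part of fiboa.

```python
# Stand-ins for the rows of geo.parquet (one row per field geometry).
geo_rows = [
    {"id": "field-1", "geometry": "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))", "area": 1.0, "perimeter": 4.0},
    {"id": "field-2", "geometry": "POLYGON ((0 0, 2 0, 2 1, 0 1, 0 0))", "area": 2.0, "perimeter": 6.0},
]

# Stand-ins for the rows of time.parquet (many rows per field, no geometry).
time_rows = [
    {"id": 1, "geo_id": "field-1", "datetime": "2024-05-01T00:00:00Z", "value1": 0.41},
    {"id": 2, "geo_id": "field-1", "datetime": "2024-06-01T00:00:00Z", "value1": 0.63},
    {"id": 3, "geo_id": "field-2", "datetime": "2024-05-01T00:00:00Z", "value1": 0.38},
]


def join_on_geo_id(geo_rows, time_rows):
    """Attach geometry and area to each observation via its geo_id (hypothetical helper)."""
    # Index the geometry table by its id so each lookup is O(1).
    geo_by_id = {row["id"]: row for row in geo_rows}
    joined = []
    for obs in time_rows:
        geo = geo_by_id[obs["geo_id"]]  # raises KeyError on a dangling reference
        joined.append({**obs, "geometry": geo["geometry"], "area": geo["area"]})
    return joined


joined = join_on_geo_id(geo_rows, time_rows)
```

The geometry is stored once per field and only pulled in when a consumer actually needs it, which is the point of splitting the files.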

This doesn't cater for geometry changes over time, though, unless you create two independent entries in geo.parquet and/or add another independent identifier that is stable over time.
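One possible way to picture the "stable identifier" option, sketched in plain Python: each geometry version gets its own row, and a second identifier ties the versions of the same field together. The column names `field_id` and `valid_from` and the helper `geometry_at` are assumptions made up for this sketch; nothing here is defined by fiboa.

```python
from datetime import date

# Two versions of the same field: the boundary changed at the start of 2024.
# `field_id` is the hypothetical identifier that stays stable over time;
# `id` identifies one specific geometry version.
geo_versions = [
    {"id": "g1", "field_id": "field-1", "valid_from": date(2022, 1, 1),
     "geometry": "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))"},
    {"id": "g2", "field_id": "field-1", "valid_from": date(2024, 1, 1),
     "geometry": "POLYGON ((0 0, 2 0, 2 1, 0 1, 0 0))"},
]


def geometry_at(field_id, when, geo_versions):
    """Return the geometry version of `field_id` in effect on `when`, or None."""
    candidates = [
        v for v in geo_versions
        if v["field_id"] == field_id and v["valid_from"] <= when
    ]
    if not candidates:
        return None
    # The most recent version that started on or before `when` wins.
    return max(candidates, key=lambda v: v["valid_from"])
```

With this layout, time series rows could reference either a specific version (`id`) or the stable `field_id`, depending on whether the observation is tied to one exact boundary.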

cholmes commented 2 weeks ago

Yeah, I definitely lean towards two files. I don't think there's tooling for pure parquet that handles this well, but I think we can just make some tooling that helps. I think the main thing is to just work out how to do the references, and how you can validate a fiboa file that has a reference instead of a geometry. And potentially figure out some of the corner cases like geometry changes over time.
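As a rough illustration of what validating a reference could mean, the sketch below checks that every `geo_id` in the time series rows points at an existing `id` in the geometry rows. This is not how any existing fiboa validator works; the function name `find_dangling_references` and the sample rows are assumptions for the example, and the lists of dicts stand in for the two Parquet files.

```python
def find_dangling_references(geo_rows, time_rows):
    """Return the ids of time series rows whose geo_id matches no geometry row."""
    known_ids = {row["id"] for row in geo_rows}
    return [obs["id"] for obs in time_rows if obs["geo_id"] not in known_ids]


# Stand-in for geo.parquet: only one field is known.
geo_rows = [
    {"id": "field-1", "geometry": "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))"},
]

# Stand-in for time.parquet: the second row references a missing geometry.
time_rows = [
    {"id": 1, "geo_id": "field-1", "datetime": "2024-05-01T00:00:00Z", "value1": 0.41},
    {"id": 2, "geo_id": "field-7", "datetime": "2024-05-01T00:00:00Z", "value1": 0.52},
]

dangling = find_dangling_references(geo_rows, time_rows)
```

A file with a non-empty result would fail this check; a real validator would also need to decide where the referenced geometry file is located, which is the open question in this thread.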