noi-techpark / bdp-core

Open Data Hub / Timeseries Core
https://opendatahub.com

As Open Data Hub Mobility, I would like the fields contained in the station table to also be stored in the metadata table, so that we can keep track of changes in the stations' metadata #269

Closed rcavaliere closed 11 months ago

rcavaliere commented 1 year ago

We have the following use case to address. For the Data Collector https://github.com/noi-techpark/bdp-commons/tree/main/data-collectors/environment-a22 (air quality sensors of the A22), the positions of the sensors are planned to change periodically, because the low-cost sensors in use need to be calibrated periodically. These sensors will therefore be at different positions over time. Since it is important to know exactly what was measured at which point, we need to store all the relevant metadata so that it can be matched with the collected measurements.

[Attached image: 230313_IntercalibrazioneSensori_1]

rcavaliere commented 1 year ago

Results of today's meeting with @clezag: we change paradigm and consider each "physical" station a station in our data model. The current stationcode, which is the ID of the "mobile" sensor, becomes part of the metadata set. This way, a data scientist can easily check the time series of a given monitoring point, even if it was monitored by different monitoring units (which can be checked from the metadata record).
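
A minimal sketch of how such a lookup could work, assuming each metadata record of a monitoring point carries a validity start time and the ID of the unit installed there. The names `MetadataRecord`, `valid_from`, `sensor_id` and `sensor_at` are hypothetical and are not taken from the bdp-core schema; the example data is invented.

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class MetadataRecord:
    """One entry in a station's metadata history (hypothetical shape)."""
    valid_from: datetime  # when this metadata version became active
    sensor_id: str        # ID of the mobile unit installed at the point


def sensor_at(history: list[MetadataRecord], when: datetime) -> Optional[str]:
    """Return the ID of the unit installed at the station at time `when`.

    `history` must be sorted by `valid_from` in ascending order.
    Returns None if `when` precedes the first record.
    """
    starts = [rec.valid_from for rec in history]
    idx = bisect_right(starts, when) - 1  # last record starting at or before `when`
    return history[idx].sensor_id if idx >= 0 else None


# Invented example: one fixed monitoring point served by two units over time.
history = [
    MetadataRecord(datetime(2023, 1, 1), "unit-A"),
    MetadataRecord(datetime(2023, 4, 1), "unit-B"),
]
```

With this shape, the time series stays attached to the fixed monitoring point, and the unit that produced a given measurement is resolved from the metadata history by timestamp.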

clezag commented 1 year ago

@rcavaliere I've done some work on the core this week and implemented a simple form of this feature. I have not put it into testing yet, however, as it will affect the actual metadata in the database.

My current (naive) solution would be to have a top level "station" object within the metadata, like this:

{
    "station": {
        "code": xxx,
        "name": xxx,
        "origin": xxx,
        "type": xxx,
        "active": xxx,
        "available": xxx,
        "coordinate": {
            "x": 1234.123,
            "y": 456.456,
            "srid": 4125
        }
    },

    "other metadata field": xxx
    ...
}

This would mirror exactly what the station API returns today (minus the "s" prefix).
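
As a rough sketch of that mapping, the nested object could be derived from a flat station record by dropping the leading "s" from each field name. The s-prefixed field list below is an assumption derived from the fields named above, and `to_station_metadata` is a hypothetical helper, not part of bdp-core.

```python
# Assumed s-prefixed station fields, derived from the "station" object above.
STATION_FIELDS = (
    "scode", "sname", "sorigin", "stype",
    "sactive", "savailable", "scoordinate",
)


def to_station_metadata(record: dict) -> dict:
    """Build the nested "station" object from a flat, s-prefixed record."""
    station = {
        key[1:]: record[key]  # strip the prefix, e.g. "scode" -> "code"
        for key in STATION_FIELDS
        if key in record      # skip fields the record does not carry
    }
    return {"station": station}
```

Fields outside the assumed list (for example `smetadata` itself) are left out, so the nested object never contains a copy of the metadata it is stored in.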

The problem I see with this is that it will be included in every single station request from now on. In all cases except metadata history calls, it will just be duplicate information, as you will see something like:

"scode": 
"sname":
"origin":
...
"smetadata":{
    "station": {
        "code": xxx,
        "name": xxx,
        "origin": xxx
        ...
    }
}

Was your intention to display this only for metadata history calls, or always?

rcavaliere commented 1 year ago

@clezag yes, this was actually the idea. I understand your point, but when making a request to the API one can use a filter to avoid these duplicate fields. This feature would be most useful when retrieving historical metadata, since at present we do not keep a history of the fields in the station table.
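
One possible reading of "filter", sketched on the client side: drop the nested "station" object from `smetadata` once the response has been parsed. This assumes the response shape shown earlier in the thread; `drop_station_duplicate` is a hypothetical helper and does not describe any filter the API itself provides.

```python
def drop_station_duplicate(station: dict) -> dict:
    """Return a copy of a station record without smetadata["station"]."""
    cleaned = dict(station)  # shallow copy, the input is not modified
    metadata = cleaned.get("smetadata")
    if isinstance(metadata, dict):
        # Keep every metadata field except the mirrored station fields.
        cleaned["smetadata"] = {
            key: value for key, value in metadata.items() if key != "station"
        }
    return cleaned
```

For metadata history calls the nested object would be kept as-is, since there it is the only place where earlier values of the station fields are visible.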

clezag commented 11 months ago

Since we will not be "moving" the stations, but instead tracking the sensor ID in the metadata, there is no longer any need to track changes to the station table. For this specific use case, just having the metadata history exposed should be enough.

For future reference, there is a (largely untested) implementation of station history (replacing metadata history) in the two feature branches:

As of now, these branches will not be merged into main or developed any further.