noi-techpark / bdp-commons

GNU Affero General Public License v3.0

As STA I would like to enrich parking metadata so that I can use these additional fields on suedtirolmobil.info #544

Closed rcavaliere closed 1 year ago

rcavaliere commented 2 years ago

STA is not happy with the metadata available from the different parking data providers, since it differs considerably from one provider to another. STA could "standardize" this metadata, in particular the (multilingual) names, in its own systems for its own applications, but the idea is to make this additional metadata available to the whole ODH community. The long-term solution is to manage this through the Data Browser. The short-term solution is to manage this through a shared Google file, as for the Bluetooth metadata: https://docs.google.com/spreadsheets/d/10qBMw28HFWJZen6_CBKT1ln5s8XdygcoC3buXas_8SA/edit?usp=sharing This user story covers the implementation of this short-term solution.

To do list:

rcavaliere commented 1 year ago

@dulvui let me know if the user story is clear; otherwise tell me if you need more details.

dulvui commented 1 year ago

@rcavaliere Should the new names have their own field, like "sta_name", or "standard_name" to make it more general?

I think the best solution here would be to create a new Google Spreadsheet data collector (dc) that only does a station sync and adds the new names to the stations. Then, when the Data Browser is online, we can simply shut down that dc and don't have to change every parking dc or the writer. What do you think?

rcavaliere commented 1 year ago

@dulvui yes, there should be an additional field in the Google spreadsheet in which an organization like STA can add a "standard" name. I would not change the data we receive yet, but simply add additional metadata through this shared file.

dulvui commented 1 year ago

@rcavaliere After looking at the writer code, I saw that this metadata enrichment is not as trivial as I thought. The problem is that the writer creates a new metadata entry in the database every time the metadata changes, so the parking data collectors would overwrite the enriched metadata created by the new data collector, and vice versa.

So I would have to change all 3 parking data collectors (5 if ParkingSensors should be changed too) so that they fetch the enriched metadata from the spreadsheet before syncing the stations with ODH.

Should I change all data collectors, or do you have a better idea?
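
To illustrate the overwrite problem described above, here is a minimal sketch of a writer that stores a new metadata version whenever the incoming metadata differs from the latest one. It is a simplified model, not the actual writer code; the function name, the station fields and the example values are all made up for illustration.

```python
def versions_written(syncs):
    """Simplified model (assumption): a new metadata version is stored
    whenever the incoming metadata differs from the latest stored one."""
    history = []
    for metadata in syncs:
        if not history or history[-1] != metadata:
            history.append(metadata)
    return history


# Hypothetical values: what a provider API and the spreadsheet might deliver.
provider = {"name": "P1 Centro", "capacity": 120}
enrichment = {"standard_name": "Parkhaus Zentrum"}
merged = {**provider, **enrichment}

# Separate enrichment collector: the two collectors alternate and each sync
# replaces what the other one wrote, so every sync creates a new version.
separate_history = versions_written([provider, merged, provider, merged])

# Enrichment inside the parking collector: the merged metadata is sent on
# every sync, so only the first sync creates a version.
merged_history = versions_written([merged, merged, merged, merged])

print(len(separate_history), len(merged_history))
```

Under these assumptions the separate collector produces one metadata version per sync, while merging before the sync produces a single stable version.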

dulvui commented 1 year ago

@rcavaliere The data collectors parking-offstreet-meranobolzano and parking-tn now have the new metadata enrichment functionality running in the testing environment. At the moment only parking-tn works, because parking-offstreet-meranobolzano has problems accessing the Merano API endpoint on testing. You can see the enriched metadata of parking-tn here; currently the values are only test-en, test-de and test-it: https://mobility.api.opendatahub.testingmachine.eu/v2/flat%2Cnode/ParkingStation?limit=200&offset=0&where=sorigin.eq.FBK&shownull=false&distinct=true

I created two spreadsheets, one for testing and one for production, and each contains a MeranoBolzano sheet and a Trento sheet. You can simply change the name fields, and at the next station sync the values will be updated.
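
For reference, the testing query linked above can be rebuilt per origin with a small helper. This is only a sketch: the function name `parking_station_url` is hypothetical, and the parameters are copied from the URL in this thread rather than from the API documentation.

```python
from urllib.parse import quote, urlencode

# Base URL of the testing API, taken from the link in this thread.
BASE = "https://mobility.api.opendatahub.testingmachine.eu/v2"


def parking_station_url(origin, limit=200, offset=0):
    """Build the ParkingStation query URL for one origin (hypothetical helper)."""
    # The representation "flat,node" is percent-encoded in the path.
    representation = quote("flat,node", safe="")
    query = urlencode({
        "limit": limit,
        "offset": offset,
        "where": f"sorigin.eq.{origin}",
        "shownull": "false",
        "distinct": "true",
    })
    return f"{BASE}/{representation}/ParkingStation?{query}"


print(parking_station_url("FBK"))
```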

rcavaliere commented 1 year ago

@dulvui nice progress! I created an additional column in the testing sheet, but it has not been picked up, since I don't see the additional field in the data we expose through our API. How does the update process work? What frequency did you set for the update of this metadata?

dulvui commented 1 year ago

@rcavaliere The frequency depends on the parking data collector itself. Parking merano-bolzano is set to once an hour and trento to once a day, but I can easily increase the frequency if needed.

The fields that get synchronized are defined in the config of the data collectors, so at the moment, if you add a column, I have to add the column name to the config manually. I could make this dynamic, so that it always matches all columns. I have now added your new column to the config, and it will be synced at the next station sync at midnight tonight.

rcavaliere commented 1 year ago

@dulvui ok, clear. I would suggest making this dynamic so we don't need any manual configuration.

dulvui commented 1 year ago

@rcavaliere The enriched metadata creation is now dynamic: when a column is added or removed, the metadata is automatically adapted at the next station sync.
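
As a rough sketch of what "dynamic" means here, the snippet below turns every column of a sheet into a metadata field, so no column names need to be configured. It is not the data collector's actual code: the sheet is represented as CSV text, and the `id` column and the `name_*` column names are assumptions for illustration.

```python
import csv
import io


def enrichment_by_station(sheet_csv, id_column="id"):
    """Map each station id to the non-empty values of all other columns."""
    reader = csv.DictReader(io.StringIO(sheet_csv))
    result = {}
    for row in reader:
        station_id = (row.get(id_column) or "").strip()
        if not station_id:
            continue  # skip rows without a station id
        result[station_id] = {
            column: value.strip()
            for column, value in row.items()
            # ignore the id column, overflow cells and empty cells
            if column and column != id_column and value and value.strip()
        }
    return result


# Hypothetical sheet content; the test-* values mirror the ones mentioned above.
sheet = "id,name_en,name_de,name_it\n1,test-en,test-de,test-it\n2,Garage North,,\n"
enriched = enrichment_by_station(sheet)

# A newly added column is picked up without any configuration change.
sheet_with_new_column = "id,name_en,standard_name\n1,test-en,Central Garage\n"
enriched_new = enrichment_by_station(sheet_with_new_column)
```

Because the field names come from the header row, removing a column simply drops that field from the enriched metadata at the next sync.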

rcavaliere commented 1 year ago

@dulvui excellent! I see this working for the ParkingStations with origin = FBK. I have also tried this on Merano-Bolzano, but it does not seem to update, e.g.: https://mobility.api.opendatahub.testingmachine.eu/v2/flat%2Cnode/ParkingStation?limit=200&offset=0&where=sorigin.eq.FAMAS&shownull=false&distinct=true

dulvui commented 1 year ago

@rcavaliere Yes, there is still the problem with the Merano parking endpoint on our Docker testing machines. I will try to make it work somehow today.

rcavaliere commented 1 year ago

@dulvui ok, I have now checked in the testing environment and it works fine. We can go to production.

dulvui commented 1 year ago

@rcavaliere It is now running in production. The station synchronization is set to once every hour.