noi-techpark / bdp-elaborations

GNU Affero General Public License v3.0

As an AI expert, I would like the Open Data Hub to provide simple traffic elaborations, so that I can use them to automatically check the quality of the available data and take it into account in automatic computations #29

Closed rcavaliere closed 6 months ago

rcavaliere commented 9 months ago

CISMA and u-hopper need precomputed daily statistics of the traffic volumes, to simplify their data quality checks on the individual traffic stations. This would also add value for third parties who want to use simpler and more compact elaborated data. The elaborations to be automatically computed should be the following:

Regarding the naming of these new virtual stations: the name should carry no reference to the direction of travel, e.g. just SEZIONE DI RILEVAMENTO KM 107.0 - S.FLORIANO, without the "(direzione sud)" suffix.

clezag commented 7 months ago

@rcavaliere Not sure if I understood this right. Do we have to aggregate each single data type over a period of 1 day and then also sum up the total number of vehicles, or is only the total needed and not the sums per vehicle type?

rcavaliere commented 7 months ago

@clezag we need the single totals per vehicle type, and also the total! Just these categories: nr. buses, nr. heavy vehicles and nr. light vehicles (should be the sum of all the records of the day related to these data types). Plus: total_nr_vehicles = nr. buses + nr. heavy vehicles + nr. light vehicles
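A minimal sketch of the aggregation described above, not the code actually used in bdp-elaborations. It assumes each record is a `(station_id, data_type, timestamp, value)` tuple; the function name `daily_totals` and the record layout are hypothetical, while the data type names follow the ones used in this thread.

```python
from collections import defaultdict
from datetime import datetime

# Data type names as used in this thread; the record layout is an assumption.
BASE_TYPES = ("Nr. Light Vehicles", "Nr. Heavy Vehicles", "Nr. Buses")
TOTAL_TYPE = "Nr. Vehicles"


def daily_totals(records):
    """Sum records per station, day and vehicle type, and add the overall total.

    records: iterable of (station_id, data_type, timestamp, value) tuples.
    Returns a dict keyed by (station_id, day, data_type).
    """
    sums = defaultdict(int)
    for station_id, data_type, timestamp, value in records:
        if data_type in BASE_TYPES:
            sums[(station_id, timestamp.date(), data_type)] += value

    # total = buses + heavy vehicles + light vehicles, per station and day
    totals = defaultdict(int)
    for (station_id, day, _data_type), value in sums.items():
        totals[(station_id, day, TOTAL_TYPE)] += value

    result = dict(sums)
    result.update(totals)
    return result
```

Records of any other data type are ignored, so only the three base categories contribute to the total.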

clezag commented 7 months ago

@rcavaliere everything OK then, just what I implemented. I've got the single data type sums running in testing now, working on the total and combining stations

clezag commented 7 months ago

@rcavaliere For the virtual stations, do they have the same station type as children (TrafficSensor), or do we make them something like CombinedTrafficSensor?

rcavaliere commented 7 months ago

@clezag I would leave the same stationType. We have the parentID field to differentiate the different types of stations

clezag commented 7 months ago

@rcavaliere Data is now available in testing. I've done a quick smoke check and it seems to be at least not completely off. I've also enabled the new virtual stations "TrafficDirection" on analytics in testing.

The sum data type I've named "Nr. Vehicles" to be in line with the existing base data type names. As with the station type we can easily change this still, so feel free to give feedback on this.

One thing I'm not sure about is the 'Nr. Equivalent Vehicles' data type. What are these? Will it not be confusing if they are missing in the overall sum?

I've also excluded inactive stations from the elaboration, that's why on analytics many if not most virtual stations and sums seem missing. Again, I can change this quickly if you think it's better to include all stations, even if they are not updated anymore.

rcavaliere commented 7 months ago

@clezag very good work! The type "Nr. Equivalent Vehicles" is a particular elaboration that weights heavy vehicles differently from light vehicles; there is a specific formula to be applied. Let me check with CISMA whether we also need this kind of elaboration for such an aggregation. For the rest, everything looks OK. For inactive stations, do you mean "active = FALSE", or did you consider any other criteria?

clezag commented 7 months ago

@rcavaliere yes, I'm just checking for the active flag when requesting stations to elaborate.

In the meantime I've also added metadata for the parent stations
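A rough sketch of how the active-flag filter and the roll-up to parent stations could look. This is an illustration only: the dictionary layout, the field names `active` and `parent_id`, and the function `combine_by_parent` are assumptions, not the project's actual data model or code.

```python
from collections import defaultdict


def combine_by_parent(stations, daily_values):
    """Sum the daily values of active child stations into their parent station.

    stations: dict station_id -> {"active": bool, "parent_id": str or None}
              (hypothetical layout)
    daily_values: dict (station_id, day, data_type) -> value
    Returns a dict keyed by (parent_id, day, data_type).
    """
    combined = defaultdict(int)
    for (station_id, day, data_type), value in daily_values.items():
        station = stations.get(station_id)
        # Skip unknown stations, inactive stations and stations without a parent.
        if not station or not station["active"] or station["parent_id"] is None:
            continue
        combined[(station["parent_id"], day, data_type)] += value
    return dict(combined)
```

With this filter, a parent whose children are all inactive gets no values at all, which matches the observation that many virtual stations appear to be missing on analytics.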

rcavaliere commented 6 months ago

@clezag as separately discussed with CISMA, let's add this additional elaboration for the Nr. Equivalent Vehicles, calculated as follows: Nr. Equivalent Vehicles = Nr. Light Vehicles + 2.5 × Nr. Heavy Vehicles + 2.5 × Nr. Buses

clezag commented 6 months ago

@rcavaliere already done. I've summed up all the Equivalent Vehicles instead of applying the formula again, but from the checks I've done it looks correct.

I've extracted this data directly from the DB; the last two columns are the total and the equivalent vehicles calculated in SQL as a check: check.csv
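Why summing the existing values matches the formula: the formula is linear, so adding up per-record equivalent values gives the same result as applying the formula to the daily sums. The sketch below illustrates this with made-up numbers; it assumes the per-record "Nr. Equivalent Vehicles" values were produced with the same weights, which this thread does not confirm. The function name `equivalent_vehicles` is hypothetical.

```python
def equivalent_vehicles(light, heavy, buses):
    """Nr. Light Vehicles + 2.5 * Nr. Heavy Vehicles + 2.5 * Nr. Buses."""
    return light + 2.5 * heavy + 2.5 * buses


# Illustrative (light, heavy, buses) counts for two records of the same day.
records = [(10, 2, 1), (4, 0, 2)]

# Approach 1: sum the per-record equivalent values.
summed_equivalents = sum(equivalent_vehicles(*r) for r in records)

# Approach 2: apply the formula once to the daily sums per vehicle type.
daily_light = sum(r[0] for r in records)
daily_heavy = sum(r[1] for r in records)
daily_buses = sum(r[2] for r in records)
formula_on_sums = equivalent_vehicles(daily_light, daily_heavy, daily_buses)

print(summed_equivalents, formula_on_sums)  # 26.5 26.5
```

The two approaches only diverge if the source values were rounded or computed with different weights, which is what a cross-check against the DB would reveal.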

rcavaliere commented 6 months ago

@clezag perfect! You can release this into production, just ensure that the visibility of the new elaborations is the same as for the traffic stations they refer to

clezag commented 6 months ago

@rcavaliere Released in production