Open JGreenlee opened 6 months ago
@JGreenlee interesting. The reason that we had that type of structure was from the "count every trip" project to add uncertainty to the metrics. And the reason the "count every trip" project had that structure, IIRC, was so that we could get a feature (like `distance`) and see the distribution across modes without having to iterate over sections. So if you wanted to get the primary mode, for example, you could do something like `trip['count'].idxmax()`.
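For illustration, here is a minimal sketch of the feature-keyed layout described above, assuming a summary that maps each feature to a per-mode dict. The mode labels, numbers, and helper names are made up for the example. With pandas the lookup would be the `idxmax()` call mentioned above; the plain-Python equivalent is shown here:

```python
# Hypothetical feature-keyed summary: feature -> {mode: value}.
# Mode labels and values are illustrative, not taken from real data.
section_summary = {
    "count": {"WALKING": 1, "CAR": 2},
    "distance": {"WALKING": 300.0, "CAR": 1800.0},
    "duration": {"WALKING": 240.0, "CAR": 420.0},
}

def primary_mode(summary, feature="count"):
    """Return the mode with the largest value for the given feature."""
    per_mode = summary[feature]
    return max(per_mode, key=per_mode.get)

def distribution(summary, feature="distance"):
    """Return each mode's share of the feature total, without iterating over sections."""
    per_mode = summary[feature]
    total = sum(per_mode.values())
    return {mode: value / total for mode, value in per_mode.items()}

print(primary_mode(section_summary))
print(distribution(section_summary))
```

The point of this layout is that one feature's distribution across modes is a single dict lookup.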
Having said that, transforming between the structures is not that hard (I think). I would suggest:
`{'vehicle': ...}`, cannot be a key
If the confirmed trip is a dict, it would have a property `ble_modes_summary` whose value is an array of objects, each object representing a mode. The object contains 'vehicle' with vehicle info, alongside 'count', 'distance', and 'duration'.
To get the primary mode, we could use the `max` function on `ble_modes_summary` with 'distance' (or 'count') as the key.

    confirmed_trip = {
        "ble_modes_summary": [
            {
                "vehicle": {
                    "value": "vehicle1",
                    # ... (remaining vehicle info elided)
                },
                "count": 1,
                "distance": 800,
            },
            {
                "vehicle": {
                    "value": "vehicle2",
                },
                "count": 2,
                "distance": 1300,
            },
        ]
    }
    # pick the mode object with the largest distance
    primary_mode = max(confirmed_trip['ble_modes_summary'], key=lambda x: x['distance'])
    print('primary vehicle is ' + primary_mode['vehicle']['value'])

    primary vehicle is vehicle2

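As a rough sketch of how the two structures could be converted into each other, assuming the feature names `count`, `distance` and `duration` and using the vehicle's `value` as the per-mode key. Both function names are hypothetical, not existing server code:

```python
def modes_to_feature_keyed(modes_summary, features=("count", "distance", "duration")):
    """Convert a list of per-mode objects into feature -> {mode key: value} dicts."""
    result = {feature: {} for feature in features}
    for mode in modes_summary:
        key = mode["vehicle"]["value"]
        for feature in features:
            if feature in mode:  # tolerate mode objects that omit a feature
                result[feature][key] = mode[feature]
    return result

def feature_keyed_to_modes(summary):
    """Convert feature -> {mode key: value} dicts back into a list of per-mode objects.

    Only the key survives as vehicle info; any richer vehicle data would have
    to come from elsewhere.
    """
    modes = {}
    for feature, per_mode in summary.items():
        for key, value in per_mode.items():
            modes.setdefault(key, {"vehicle": {"value": key}})[feature] = value
    return list(modes.values())

ble_modes_summary = [
    {"vehicle": {"value": "vehicle1"}, "count": 1, "distance": 800},
    {"vehicle": {"value": "vehicle2"}, "count": 2, "distance": 1300},
]
feature_keyed = modes_to_feature_keyed(ble_modes_summary)
print(feature_keyed["distance"])
print(feature_keyed_to_modes(feature_keyed))
```

The reverse direction loses any vehicle details beyond the key, which is one argument for keeping the vehicle info inside the array-of-modes form.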
ok, I think that there are only a couple more questions before we go ahead with this:
`emission/tests` and on a couple of real dataset snapshots)

    df = pd.json_normalize(confirmed_trips)
    df.columns

will have `ble_section_summary.E_CAR` in the old method and `ble_modes_summary.vehicle.baseMode.CAR` in the new one, so maybe not a huge deal wrt grouping or other post-processing.
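To make the grouping question concrete, here is a small sketch of one possible post-processing step over the array-of-modes structure: summing distance per base mode across trips. The trips, the `baseMode` values and the function name are invented for the example, and it uses plain Python rather than pandas:

```python
from collections import defaultdict

# Hypothetical confirmed trips using the proposed array-of-modes summary.
confirmed_trips = [
    {"ble_modes_summary": [
        {"vehicle": {"value": "vehicle1", "baseMode": "CAR"}, "count": 1, "distance": 800},
        {"vehicle": {"value": "vehicle2", "baseMode": "E_CAR"}, "count": 2, "distance": 1300},
    ]},
    {"ble_modes_summary": [
        {"vehicle": {"value": "vehicle1", "baseMode": "CAR"}, "count": 1, "distance": 500},
    ]},
]

def distance_by_base_mode(trips):
    """Sum the distance of every mode object, grouped by the vehicle's base mode."""
    totals = defaultdict(float)
    for trip in trips:
        for mode in trip.get("ble_modes_summary", []):
            totals[mode["vehicle"]["baseMode"]] += mode["distance"]
    return dict(totals)

print(distance_by_base_mode(confirmed_trips))
```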
Is your proposal to only change this for the ble modes, or for the cleaned and inferred modes as well? I would prefer to have the same structure for all the `*summary` entries, although of course that will make the migration take longer. And it would also take more effort to generate the probability distributions above.
@JGreenlee do you have thoughts on what the same structure would look like for the cleaned and inferred section summaries?

    {
      "count": {
        "car_jacks_mazda3": 1,
        ...,
      },
      "distance": {
        "car_jacks_mazda3": 20184.92261045545,
        ...,
      },
      "duration": {
        "car_jacks_mazda3": 1772.7775580883026,
        ...,
      },
      "vehicles": {
        "car_jacks_mazda3": {
          "value": "car_jacks_mazda3",
          "bluetooth_major_minor": ["dfc0:fff0"],
          "text": "Jack's Mazda 3",
          "baseMode": "CAR",
          "met_equivalent": "IN_VEHICLE",
          "kgCo2PerKm": 0.16777,
          "vehicle_info": {
            "type": "car",
            "license": "JHK ****",
            "make": "Mazda",
            "model": "3",
            "year": 2014,
            "color": "red",
            "engine": "ICE",
            "mpg": 33
          }
        },
        ...,
      }
    }

The current implementation of `ble_sensed_summary` on e-mission-server mimics the format of `cleaned_section_summary` and `inferred_section_summary`; it looks like this:
We'd like to know specifically what vehicle it was; instead of just `"CAR"`, we want to know it was `"car_jacks_mazda3"`. So we talked about having 2 versions of the summary.
But if, for example, we wanted to calculate the carbon footprint based on the car's MPG, we'd still have to cross-reference with the dynamic config to find the vehicle that matches `car_jacks_mazda3`.
As an alternative, what if we use a different structure that will allow us to have 1 unified summary (an array of modes / "mode summary")? Then we can include vehicle information in the summary.
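A minimal sketch of what embedding the vehicle in the summary could buy us, assuming each mode object carries the vehicle's `kgCo2PerKm` and that `distance` is in meters (both are assumptions). It uses `kgCo2PerKm` rather than MPG to keep the arithmetic to one line, and the function name is hypothetical:

```python
# Hypothetical unified "mode summary" with the vehicle info embedded.
mode_summary = [
    {
        "vehicle": {
            "value": "car_jacks_mazda3",
            "baseMode": "CAR",
            "kgCo2PerKm": 0.16777,
        },
        "count": 1,
        "distance": 20184.92261045545,  # assumed to be meters
        "duration": 1772.7775580883026,
    },
]

def footprint_kg(modes_summary):
    """Estimate kg CO2 per vehicle directly from the summary, with no config lookup."""
    return {
        mode["vehicle"]["value"]: mode["distance"] / 1000 * mode["vehicle"]["kgCo2PerKm"]
        for mode in modes_summary
    }

print(footprint_kg(mode_summary))
```

No cross-reference with the dynamic config is needed here, because everything the calculation uses travels with the trip.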