e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

BLE Summary with specific vehicles #1073

Open JGreenlee opened 6 months ago

JGreenlee commented 6 months ago

The current implementation of ble_sensed_summary on e-mission-server mimics the format of cleaned_section_summary and inferred_section_summary; it looks like this:

{
  "count": {
    "CAR": 1
  },
  "distance": {
    "CAR": 20184.92261045545
  },
  "duration": {
    "CAR": 1772.7775580883026
  }
}

We'd like to know specifically which vehicle it was: instead of just "CAR", we want to know it was "car_jacks_mazda3". So we talked about having 2 versions of the summary, with the second one keyed by vehicle:

{
  "count": {
    "car_jacks_mazda3": 1
  },
  "distance": {
    "car_jacks_mazda3": 20184.92261045545
  },
  "duration": {
    "car_jacks_mazda3": 1772.7775580883026
  }
}

But, if for example we wanted to calculate the carbon footprint based on the car's MPG, we'd still have to cross-reference with the dynamic config to find the vehicle that matches car_jacks_mazda3.
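The cross-reference described above could look roughly like the sketch below. The `vehicle_config` list and its field names are assumptions borrowed from the vehicle entry discussed in this thread, not the actual dynamic config schema, and the distance is assumed to be in meters.

```python
# Hypothetical excerpt of the dynamic config; the real schema may differ.
vehicle_config = [
    {"value": "car_jacks_mazda3", "baseMode": "CAR", "kgCo2PerKm": 0.16777},
]

# Summary keyed by vehicle, in the feature-first format.
ble_sensed_summary = {
    "count": {"car_jacks_mazda3": 1},
    "distance": {"car_jacks_mazda3": 20184.92261045545},  # assumed meters
    "duration": {"car_jacks_mazda3": 1772.7775580883026},
}

def footprint_kg(summary, config):
    """Sum kg CO2 over vehicles, looking each one up in the config by key."""
    by_value = {v["value"]: v for v in config}
    total = 0.0
    for vehicle_key, meters in summary["distance"].items():
        vehicle = by_value[vehicle_key]  # the extra cross-reference step
        total += (meters / 1000) * vehicle["kgCo2PerKm"]
    return total

total_kg = footprint_kg(ble_sensed_summary, vehicle_config)
```

The lookup through `by_value` is the step that a summary carrying its own vehicle information would make unnecessary.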

As an alternative, what if we use a different structure that allows us to have 1 unified summary (an array of modes, or a "mode summary")? Then we can include the vehicle information in the summary itself.


[
  {
    "vehicle": {
      "value": "car_jacks_mazda3",
      "bluetooth_major_minor": ["dfc0:fff0"],
      "text": "Jack's Mazda 3",
      "baseMode":"CAR",
      "met_equivalent":"IN_VEHICLE",
      "kgCo2PerKm": 0.16777,
      "vehicle_info": {
        "type": "car",
        "license": "JHK ****",
        "make": "Mazda",
        "model": "3",
        "year": 2014,
        "color": "red",
        "engine": "ICE",
        "mpg": 33
      }
    },
    "count": 1,
    "distance": 20184.92261045545,
    "duration": 1772.7775580883026
  }
]
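With this array structure, a footprint calculation would not need a config lookup. A minimal sketch, assuming distances are in meters and that `kgCo2PerKm` is present on every entry (neither is confirmed here):

```python
mode_summaries = [
    {
        "vehicle": {"value": "car_jacks_mazda3", "kgCo2PerKm": 0.16777},
        "count": 1,
        "distance": 20184.92261045545,  # assumed to be meters
        "duration": 1772.7775580883026,
    }
]

# The emission factor travels with each entry, so one pass is enough.
total_kg = sum(
    (m["distance"] / 1000) * m["vehicle"]["kgCo2PerKm"] for m in mode_summaries
)
```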

shankari commented 6 months ago

@JGreenlee interesting. The reason we had that type of structure comes from the "count every trip" project, which added uncertainty to the metrics. And the reason the "count every trip" project had that structure, IIRC, was so that we could take a feature (like distance) and see its distribution across modes without having to iterate over sections. So if you wanted the primary mode, for example, you could do something like trip['count'].idxmax().
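To illustrate that lookup without pandas: with the feature-keyed structure, finding the primary mode is a single `max` over one inner dict. This sketch uses plain dicts and made-up values (the "WALKING" entry is purely illustrative); `max(..., key=...)` plays the role that `idxmax()` would play on a pandas Series.

```python
# Feature-keyed summary: each feature maps mode -> value (values are made up).
trip = {
    "count": {"CAR": 1, "WALKING": 2},
    "distance": {"CAR": 20184.9, "WALKING": 350.0},
}

# The mode with the largest value for a given feature.
primary_by_count = max(trip["count"], key=trip["count"].get)
primary_by_distance = max(trip["distance"], key=trip["distance"].get)
```

Note that the answer depends on which feature is used: here the mode with the most sections is not the one with the most distance.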

Having said that, transforming between the structures is not that hard (I think). I would suggest:

JGreenlee commented 6 months ago

If the confirmed trip is a dict, it would have a property ble_modes_summary whose value is an array of objects, each object representing a mode. The object contains 'vehicle' with vehicle info, alongside 'count', 'distance', and 'duration'.

To get the primary mode, we could use the max function on ble_modes_summary with 'distance' (or 'count') as the key.

confirmed_trip = {
  "ble_modes_summary": [
    {
      "vehicle": {
        "value": "vehicle1",
        # ... other vehicle fields
      },
      "count": 1,
      "distance": 800,
    },
    {
      "vehicle": {
        "value": "vehicle2",
      },
      "count": 2,
      "distance": 1300,
    },
  ]
}

# Pick the entry with the largest distance as the primary mode
primary_mode = max(confirmed_trip['ble_modes_summary'], key=lambda x: x['distance'])
print('primary vehicle is ' + primary_mode['vehicle']['value'])
# prints: primary vehicle is vehicle2

shankari commented 6 months ago

ok, I think that there are only a couple more questions before we go ahead with this:

import pandas as pd
df = pd.json_normalize(confirmed_trips)
df.columns

This will have ble_section_summary.E_CAR in the old method and ble_modes_summary.vehicle.baseMode.CAR in the new one, so it may not be a huge deal wrt grouping or other post-processing.

Is your proposal to only change this for the ble modes, or for the cleaned and inferred modes as well? I would prefer to have the same structure for all the *summary entries, although of course that will make the migration take longer. And it would also take more effort to generate the probability distributions above.

@JGreenlee do you have thoughts on what the same structure would look like for the cleaned and inferred section summaries?

JGreenlee commented 5 months ago

{
  "count": {
    "car_jacks_mazda3": 1,
    ...,
  },
  "distance": {
    "car_jacks_mazda3": 20184.92261045545,
    ...,
  },
  "duration": {
    "car_jacks_mazda3": 1772.7775580883026,
    ...,
  },
  "vehicles": {
    "car_jacks_mazda3": {
      "value": "car_jacks_mazda3",
      "bluetooth_major_minor": ["dfc0:fff0"],
      "text": "Jack's Mazda 3",
      "baseMode":"CAR",
      "met_equivalent":"IN_VEHICLE",
      "kgCo2PerKm": 0.16777,
      "vehicle_info": {
        "type": "car",
        "license": "JHK ****",
        "make": "Mazda",
        "model": "3",
        "year": 2014,
        "color": "red",
        "engine": "ICE",
        "mpg": 33
      }
    },
    ...,
  }
}
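Converting this combined structure into the array-of-modes form is a small transformation. The following is a sketch only: `to_mode_array` is a hypothetical helper, and it assumes every key in "vehicles" also appears under each feature.

```python
def to_mode_array(summary):
    """Turn a feature-keyed summary with a 'vehicles' map into a list of per-mode dicts."""
    features = [k for k in summary if k != "vehicles"]
    return [
        {"vehicle": info, **{f: summary[f][key] for f in features}}
        for key, info in summary["vehicles"].items()
    ]

# Trimmed-down instance of the structure above (vehicle fields shortened).
summary = {
    "count": {"car_jacks_mazda3": 1},
    "distance": {"car_jacks_mazda3": 20184.92261045545},
    "duration": {"car_jacks_mazda3": 1772.7775580883026},
    "vehicles": {
        "car_jacks_mazda3": {"value": "car_jacks_mazda3", "baseMode": "CAR"},
    },
}

modes = to_mode_array(summary)
```

Keeping the feature-keyed dicts as the stored form preserves lookups like the per-feature maximum, while the array form can be derived on demand.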