e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

Change dashboard to support user inputs #688

Closed shankari closed 2 years ago

shankari commented 2 years ago

Halting work on https://github.com/e-mission/e-mission-docs/issues/680 to make a higher priority change. Feedback from the CanBikeCO program admins was that the dashboard would be more impactful than the current polar bear gamification in motivating change.

However, in order to use the dashboard for CanBikeCO, we need to actually take the user labels into account. This should handle some of the issues reported in https://github.com/e-mission/e-mission-docs/issues/476, notably https://github.com/e-mission/e-mission-docs/issues/476#issuecomment-860470624, which should allow La Rochelle to reintroduce the user labeling and the common trips!! (@PatGendre)

shankari commented 2 years ago

Recap of existing design and future design considerations on the phone:

NOTE: If the deployer supports user inputs but the user hasn't labeled anything, we should make sure to return one N/A bucket from the server for this to work.

shankari commented 2 years ago

Server design considerations

Let's start with the server to see if the rest of this is feasible.

The metrics code currently is already configurable wrt the key that it reads for the analysis results.

    section_df = esda.get_data_df(eac.get_section_key_for_analysis_results(),

However, it assumes the retrieved entries are sections and that we need to use the sensed_mode field from the section for the grouping.

        mode_grouped_df = section_group_df.groupby('sensed_mode')

Since we are never going to call this new field sensed_mode, one obvious change is to make the grouping field also configurable.

But what kinds of entries should we retrieve and how should we specify the sensed mode?

  1. One potential fix is to create a confirmed_section for each confirmed_trip that has the confirmed mode user input.
  2. Another would be to continue consuming confirmed_trip, and to expand the user inputs using expand_userinputs in emission/storage/decorations/trip_queries.py

(2) seems like the easier fix, since we can then kick the can of "how to create a confirmed section" down the road a bit more and not deal with it while making a high-priority fix.
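A minimal sketch of option (2), assuming simplified confirmed_trip entries. The entry layout, the `mode_confirm` label name, and both helper functions are illustrative assumptions, not the server's actual code; the "N/A" bucket follows the note earlier in this thread about unlabeled trips.

```python
from collections import defaultdict

# Hypothetical, simplified stand-ins for confirmed_trip entries;
# real entries carry many more fields.
confirmed_trips = [
    {"distance": 1200.0, "user_input": {"mode_confirm": "bike"}},
    {"distance": 5400.0, "user_input": {"mode_confirm": "drove_alone"}},
    {"distance": 800.0, "user_input": {"mode_confirm": "bike"}},
    {"distance": 3000.0, "user_input": {}},  # trip the user has not labeled
]


def expand_user_inputs(trip):
    """Flatten the nested user_input dict into top-level fields."""
    flat = {k: v for k, v in trip.items() if k != "user_input"}
    flat.update(trip.get("user_input", {}))
    return flat


def sum_distance_by(trips, group_field, missing_label="N/A"):
    """Sum trip distance per value of a configurable grouping field."""
    totals = defaultdict(float)
    for trip in map(expand_user_inputs, trips):
        totals[trip.get(group_field, missing_label)] += trip["distance"]
    return dict(totals)


# The grouping field is a parameter instead of a hard-coded 'sensed_mode'.
distance_by_mode = sum_distance_by(confirmed_trips, "mode_confirm")
```

Passing the grouping field as a parameter keeps the same metrics code usable for both sensed and user-labeled modes.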

shankari commented 2 years ago

@PatGendre @jf87 any comments or thoughts on the design? I'm going to implement server changes first and the phone changes in the rest of the week.

PatGendre commented 2 years ago

@shankari this will be a great feature.

Actually, in La Rochelle they are developing a separate "coachCO2" app hosted in the Cozy Cloud personal cloud service, and the e-mission "tracemob" app is only a data collection tool (which could possibly be complemented with other data sources, such as train ticket sales). So the dashboard will be in "coachCO2", not in "tracemob"; still, the feature will be interesting for tracemob, as we could have use cases other than La Rochelle.
The coachCO2 app is currently being developed, one iteration every 6 weeks or so. What is intended is that the user can label the mode at the section level (and the app can also take into account labeled trips from e-mission), and the CO2/calorie metrics will be computed in the coachCO2 app. The code is here: https://github.com/cozy/coachCO2/blob/master/src/constants/const.js As the car model has a major influence on CO2 emissions, the intent is to let the user set the car model in the app.
Also, it is intended to take elevation into account if we manage to get the z data.
I write "intended" because the development is iterative and the priorities are not fixed in advance...

As for your questions,

shankari commented 2 years ago

@PatGendre both the features that you have outlined above:

would be very interesting to the core. Is there any hope of contributing them back?

I do raise the custom modes question above. For now, I am going to ignore them. But later, when we design an "other-handling" system, we should have a process for somebody to find the CO2 equivalent for new modes and enter them.

PatGendre commented 2 years ago

@shankari I am not sure it can be contributed back as the app is in React, not Cordova (but I don't know much about mobile dev ;-). The section-level labelling will just add an attribute, so it won't be a true trip editing feature (which could also e.g. merge 2 sections into one, etc.). Anyway, it will be open source and maybe it can nurture interesting discussions about functionalities?

I do raise the custom modes question above. For now, I am going to ignore them.

Ok, this is what I understood, and it seems reasonable! I don't know yet how custom modes will be handled in coachCO2 in future iterations but I can keep you informed of course.

shankari commented 2 years ago

@PatGendre for a cordova app, the UI is in javascript. Any javascript. Most of our UI is currently in Angular/Ionic, but we can interoperate with (basic) React components using ngReact. We experimented with this as part of the Itinerum integration, and it looks like @kafitz got it to work: https://github.com/e-mission/e-mission-docs/issues/643#issuecomment-903133273

So you/they could theoretically integrate the same codebase into the existing phone UI as well...

shankari commented 2 years ago

Server changes done! Remember to configure conf/analysis/debug.conf.json to set the analysis.result.section.key to analysis/confirmed_trip if you want to use this feature!
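A small sketch of that configuration change. It assumes the config file stores the setting as a flat dotted key at the top level of the JSON object; both helper names are hypothetical, so check the layout of your own debug.conf.json before relying on this.

```python
import json

SECTION_KEY_SETTING = "analysis.result.section.key"


def with_confirmed_trips(conf):
    """Return a copy of the config dict that reads confirmed trips."""
    updated = dict(conf)
    updated[SECTION_KEY_SETTING] = "analysis/confirmed_trip"
    return updated


def rewrite_conf_file(path="conf/analysis/debug.conf.json"):
    """Load the JSON config, switch the section key, and write it back."""
    with open(path) as fp:
        conf = json.load(fp)
    with open(path, "w") as fp:
        json.dump(with_confirmed_trips(conf), fp, indent=4)
```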

Phone changes next; ETA early next week.

jf87 commented 2 years ago

Sorry to join this discussion a bit late. I could not completely follow everything you wrote @shankari, but in alignment with @PatGendre I would also prefer a simple solution. I am happy to have a chat if it makes sense, just ping me :-)

shankari commented 2 years ago

Started work on the phone side; found that the median_speed metric doesn't work with confirmed trips. Fixing that first.

shankari commented 2 years ago

We need to generalize the CA 2030 and 2050 goals now that we are no longer focused on California. Ideally we would use a global number, but the NDC and population estimates globally are more complicated. Let's start with the US and move to global soon.

I'm recording the calculations here for the record. For these calculations, we will rely largely on the US NDC instead of following up on primary sources.

Current US goals, per "The Long Term Strategy of the United States" are:

US population estimates and projections are from the International Population Database (https://www.census.gov/programs-surveys/international-programs/about/idb.html). The "Tables" view is particularly useful to obtain the granular estimates that we need.

We estimate the per capita values using:

per capita kg CO2e/wk = (1,000,000,000 * 1000 * GTCO2e) / (population * 52)

| Time | Yearly GT CO2e | Population | Per capita kg CO2e/wk |
|---|---|---|---|
| Baseline (2005) | 2 | 295,516,599 | (1000000000 * 1000 * 2) / (295516599 * 52) = 130 |
| Short-term goal (2030) | 1 | 355,100,730 | (1000000000 * 1000 * 1) / (355100730 * 52) = 54 |
| Long-term goal (2050) | 0.3 | 388,922,201 | (1000000000 * 1000 * 0.3) / (388922201 * 52) = 14 |
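For reproducibility, the same calculation as a short script. The function and dictionary names are illustrative. Note that the recorded values match when the quotient is truncated to a whole number; the 2050 quotient is about 14.8 before truncation.

```python
WEEKS_PER_YEAR = 52
KG_PER_GT = 1_000_000_000 * 1000  # 1 Gt = 1e9 tonnes, 1 tonne = 1000 kg


def per_capita_kg_per_week(yearly_gt_co2e, population):
    """Weekly per-capita emissions in kg CO2e."""
    return (KG_PER_GT * yearly_gt_co2e) / (population * WEEKS_PER_YEAR)


# (yearly Gt CO2e, population) taken from the calculation above
goals = {
    "baseline_2005": (2, 295_516_599),
    "short_term_2030": (1, 355_100_730),
    "long_term_2050": (0.3, 388_922_201),
}

# int() truncates, which reproduces the recorded values
weekly = {
    name: int(per_capita_kg_per_week(gt, population))
    for name, (gt, population) in goals.items()
}
```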
shankari commented 2 years ago

Tried to push this to staging and ran into several problems:

shankari commented 2 years ago

Feedback from Sandee: Change to "US goals"

shankari commented 2 years ago

Change colors: Green + Thumbs up if you meet the goal, Yellow + Sweating if you don't

shankari commented 2 years ago

Picking up where this left off...

shankari commented 2 years ago

The hack to find the median_speed https://github.com/e-mission/e-mission-server/pull/843 makes things very slow, since it has to make multiple database queries for each trip. One potential workaround is to use the mean speed instead, since we have distance and duration for both sections and trips.

Looking back at the calorie calculations https://github.com/e-mission/e-mission-docs/issues/139, we use METs from the Compendium of Physical Activities (https://sites.google.com/site/compendiumofphysicalactivities/home).

The compendium entries for bicycling, for example, have descriptions similar to "bicycling, 10-11.9 mph, leisure, slow, light effort". There is no indication of whether the speed is median or mean. Let's go with mean since that is easier from a code perspective.

For the record, another option is to compute the median_speed and store it as a field in both the section and the trip. However, we would not have that field for older trips and would still have to use the mean as a workaround. Since we don't have a specific need for median v/s mean, let's just go with mean.
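A minimal sketch of the mean-speed workaround, assuming distance in meters and duration in seconds (the helper names are made up for illustration). Since the compendium entries are keyed by mph, a conversion is included.

```python
MPH_PER_MPS = 3600 / 1609.344  # meters/second -> miles/hour


def mean_speed_mps(distance_m, duration_s):
    """Mean speed in m/s from total distance and duration; 0 if no duration."""
    if duration_s <= 0:
        return 0.0
    return distance_m / duration_s


def mean_speed_mph(distance_m, duration_s):
    """Same mean speed, expressed in mph for the MET lookup."""
    return mean_speed_mps(distance_m, duration_s) * MPH_PER_MPS


# A 5 km ride that took 20 minutes: about 4.17 m/s, or about 9.3 mph
example_mph = mean_speed_mph(5000, 20 * 60)
```

This needs no extra database queries, since both inputs are already stored on sections and trips.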

shankari commented 2 years ago

Comparing limits:

If I want to, US DOT has some transportation trends (https://www.transportation.gov/sustainability/climate/transportation-ghg-emissions-and-trends) going back to 2005, but they seem to be consistent with each other. The emissions total, split by type of gas, is 2133 Tg = 2133 MMT (since "one million metric ton is equal to one teragram") = 2.133 Gt. But the emissions per sector are only 1.969 Gt. Adding up the individual entries by mode, we get 0.68 + 0.544 + 0.407 + 0.183 + 0.139 + 0.01 + 0.154 = 2.117 Gt. We should really pull out heavy duty trucks from this mix, since those emissions are incorporated into the products that we consume and not into our travel directly. But I am not sure how to pull that out, since the trends seem to lump heavy duty trucks (which are not relevant for passenger travel) along with buses (which are).
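The unit conversion and the by-mode sum can be checked with a few lines; the variable names are illustrative and the numbers are the ones quoted above.

```python
TG_PER_GT = 1000  # 1 Tg = 1 million metric tons, so 1000 Tg = 1 Gt

total_by_gas_gt = 2133 / TG_PER_GT

# Individual entries by mode, in Gt, as listed above
by_mode_gt = [0.68, 0.544, 0.407, 0.183, 0.139, 0.01, 0.154]
total_by_mode_gt = round(sum(by_mode_gt), 3)

# Difference between the by-gas total and the by-mode sum
gap_gt = round(total_by_gas_gt - total_by_mode_gt, 3)
```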

shankari commented 2 years ago

The color coding for the goals, which took so much time, seems to be still broken for non-default values.

| Works for default values | Incorrectly shows all green if range is large | Incorrectly shows all red if range is small |
|---|---|---|
| Screenshot_1640139296 | Screenshot_1640139368 | Screenshot_1640139413 |
shankari commented 2 years ago

This is because both the userCarbon and the us2050 values are strings.

$scope.carbonData.us2050 = Math.round(14 / 7 * days) + ' kg CO₂';
$scope.carbonData.userCarbon    = FootprintHelper.readableFormat(FootprintHelper.getFootprintForMetrics(userCarbonData));

Let's move the formatting into the HTML throughout the code, as per best practices.
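The phone code is JavaScript, but the failure mode is easy to show in a few lines of Python, since both languages compare strings character by character. The values and the `readable` helper below are hypothetical and only illustrate why comparing formatted strings gives the wrong color.

```python
def readable(kg):
    """Format a numeric footprint for display only."""
    return f"{round(kg)} kg CO₂"


user_carbon_kg = 28
us_2050_kg = 100

# Comparing the formatted strings is lexicographic: "2" sorts after "1",
# so 28 incorrectly looks larger than 100.
string_says_over_goal = readable(user_carbon_kg) > readable(us_2050_kg)

# Comparing the underlying numbers gives the intended answer.
number_says_over_goal = user_carbon_kg > us_2050_kg
```

Keeping the stored values numeric and formatting only at display time avoids this class of bug.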

shankari commented 2 years ago

Let's apply https://github.com/shankari/e-mission-phone/commit/a3518fee4d6f5f7243a604712c549eddfee2050f elsewhere as well and see if we can remove the code simplification that we had postponed in https://github.com/e-mission/e-mission-phone/pull/805/commits/6850bd49402c2a4d3cf506bb35aea95d5b973832

shankari commented 2 years ago

Let's think about all the places where we need formatted data:

  1. Footprint card
  2. Calorie card
  3. Summary card
  4. Charts

The first two are easy, so let's fix them first

Edit: fixed in https://github.com/e-mission/e-mission-phone/pull/805/commits/cc404654bafe5b7ce68ac6c179581c1b43615e3c and https://github.com/e-mission/e-mission-phone/pull/805/commits/44baf2214ed5dd743ae97c435dc47a0a493c7747

shankari commented 2 years ago

Making sure that we don't recompute values over and over again in different functions. Simplifying this should improve the performance.

shankari commented 2 years ago

The issue with the last two is that both need to be formatted, but the summary card wants one value per mode while the charts want multiple values per mode. So the format code needs to work on an array in one case and on individual values in the other.

There are a couple of design fixes for this:

  1. pull out the formatters such that they work on one value at a time, which will also simplify the giant bolus of code that is the formatXXX/getSummaryData. We can then call the formatters as the inner value in the loop, or maybe even call it directly from the HTML
  2. first format (to get the values for the chart) and then summarize. aka, call the method to summarize the data on the formatted values

The second option is also likely to run into issues with the list v/s list of lists, and the first option has the ability to both (a) simplify the code, and (b) support direct invocation from HTML. So let's go with option 1 to begin with.
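A rough sketch of option 1, with a made-up formatter and made-up per-mode data. The point is only that a formatter that works on one value at a time can serve both the summary card and the charts.

```python
def format_distance(meters):
    """Format one numeric value; knows nothing about lists."""
    return f"{meters / 1000:.1f} km"


# Hypothetical per-mode data: a list of daily values for each mode
daily_by_mode = {"bike": [1200.0, 800.0], "car": [5400.0, 0.0]}

# Summary card: one value per mode, so summarize first, then format
summary = {
    mode: format_distance(sum(values))
    for mode, values in daily_by_mode.items()
}

# Charts: multiple values per mode, same formatter called in the inner loop
chart = {
    mode: [format_distance(value) for value in values]
    for mode, values in daily_by_mode.items()
}
```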

shankari commented 2 years ago

Next, we disable the filtering queries because:

shankari commented 2 years ago

Finished re-enabling the "worst" calculation, which was fairly straightforward. Moving on to the "optimal" calculation, which is trickier.

The previous implementation of optimal was:

This is now complicated by the introduction of the pilot e-bikes.

The overall goal is now:

shankari commented 2 years ago

Note that we were already not handling air properly although we claimed in the calculations that we were. It seems like we need a couple of checks:

I can't think of a way to do this generically without getting significantly more information, potentially from the MITIE project. For now, we leave out the optimal calculation, or make it CanBikeCO specific.

Let's leave out the optimal calculation for now.

shankari commented 2 years ago

Although, the optimal calculation would really be useful to show people how low they can go.

Ok, here's an alternate proposal. We allow one "range-limited motorized vehicle", with the understanding that it would be an MITIE-class vehicle. If it exists, it should specify the max range. If it exists, we break at that range, otherwise we don't. This should work transparently for all evaluations of individual modes, but will break if there are multiple range-limited modes. But let's deal with that later.

shankari commented 2 years ago

Actually, this won't even work, and the optimal footprint was wrong all along. Basically, on the client, we have the summary (at a minimum, per day). We don't have individual trip values. So if we had 10 trips of 1km each, we would see car: 10k on the client. So those 1km car trips should really be replaced by walk/bike trips but we won't see it because car has a value > 5k. This was already a bug on the client, and it is only going to be a bigger bug going forward, where we have more divisions.
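To make the problem concrete, here is a small sketch using the 5 km cutoff mentioned above. The replacement rule (car distance at or under the cutoff could be walk/bike) and the helper name are assumptions for illustration.

```python
SHORT_TRIP_LIMIT_KM = 5  # cutoff mentioned above


def replaceable_km(distances_km):
    """Total car distance that falls at or under the short-trip cutoff."""
    return sum(d for d in distances_km if d <= SHORT_TRIP_LIMIT_KM)


car_trips_km = [1] * 10                # individual trips: ten 1 km trips
car_summary_km = [sum(car_trips_km)]   # what the client sees: one 10 km total

per_trip = replaceable_km(car_trips_km)        # every trip could be walk/bike
from_summary = replaceable_km(car_summary_km)  # the 10 km total looks too long
```

With per-trip data all 10 km are replaceable; with only the summary, none are.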

Sadly, abandoning the optimal calculation again because it requires way more refactoring than I want to do.

shankari commented 2 years ago

Another issue with this approach is that the most efficient mode after the e-bike is scootershare. But using the scootershare intensity as the optimal for a range from 30k to 600k is clearly incorrect. If we want to avoid that, we need to actually have additional metadata for each mode (is it long-range or range-limited, etc.). This seems like a task for ... the long-term "other" mode server.

shankari commented 2 years ago

Having abandoned the optimal calculation for now, let's move on to the range calculations. Our goal here is to come up with a range of values for the footprint. The range comes from modes for which we don't have a mapping. The lower end assumes a mapping of 0 for the unknown modes. The higher end assumes the highest footprint (taxi) for the unknown modes.
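A minimal sketch of the range calculation under these rules. The intensity values below are placeholders (not the real footprint table), and the function name is hypothetical.

```python
# kg CO2 per km; placeholder values for illustration only
INTENSITY = {"walk": 0.0, "car": 0.25, "taxi": 0.3}


def footprint_range(km_by_mode):
    """Return (low, high) footprint in kg CO2 for a dict of mode -> km."""
    known = 0.0
    unknown_km = 0.0
    for mode, km in km_by_mode.items():
        if mode in INTENSITY:
            known += km * INTENSITY[mode]
        else:
            unknown_km += km  # no mapping for this mode
    low = known                                           # unknown modes count as 0
    high = known + unknown_km * max(INTENSITY.values())   # unknown modes count as worst case
    return low, high


low, high = footprint_range({"car": 10, "hoverboard": 5})
```

When every mode has a mapping, low and high are equal and the range collapses to a single value.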

shankari commented 2 years ago

Almost done, but need to figure out how the comparison code will work. If we have a range of values, which arrow do we display? How do we format the %?

I'd hoped that the signs on both low and high would be the same, but alas, a fairly early test indicated that they are not always. In this case, the low range was a potential decrease (17 to 17) while the high range was an increase (17 to 28). Might have to redo the entire HTML around the greater/lesser to show both options. There's going to be some super complex ng-if code...
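One way to think about the comparison, sketched with hypothetical helpers: compute the direction and percent change separately for each end of the range, and flag when the two ends disagree. The unrounded inputs below are invented to resemble the 17 / 17 / 28 case above.

```python
def direction(previous, current):
    """Return 'up', 'down', or 'same' for a change from previous to current."""
    if current > previous:
        return "up"
    if current < previous:
        return "down"
    return "same"


def percent_change(previous, current):
    """Signed percent change; None when there is nothing to compare against."""
    if previous == 0:
        return None
    return 100 * (current - previous) / previous


def compare_to_range(previous, low, high):
    """Describe the change against each end of the current range."""
    low_dir = direction(previous, low)
    high_dir = direction(previous, high)
    return {
        "low": (low_dir, percent_change(previous, low)),
        "high": (high_dir, percent_change(previous, high)),
        "needs_both_arrows": low_dir != high_dir,
    }


result = compare_to_range(previous=17.4, low=17.1, high=28.0)
```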

Note also that

<div id="arrow-color" class="icon ion-arrow-up-a"></div>
<div id="arrow-color" class="icon ion-arrow-up-a"></div>

results in arrows one below the other.

[Screenshot: Screen Shot 2021-12-23 at 10 43 56 PM]
shankari commented 2 years ago

The cases we need to handle are:

shankari commented 2 years ago

Implemented this as a separate directive, and it is not that bad, except for the "or" and "last week" bits in the final screen. They are in the next row somehow.

shankari commented 2 years ago

Range changes are complete (calorie was https://github.com/e-mission/e-mission-phone/pull/805/commits/980bbb4aca7ce265e7887f14de5b5a88bdfd88d6)

With this, all UI changes are complete. Pending potential issues:

Moving on to the final server changes (handling the trips with the expectation.to_label == false by including them in the analyses)

shankari commented 2 years ago

We're going to deal with the expectation.to_label issue by coming up with a new field called final_labels. The algorithm to fill in the final_labels is as follows:

We will then change the metrics and the leaderboard to use final_labels instead of user_input, including changing the code for expand_userinputs

shankari commented 2 years ago

Couple of notes:

Let's start with this approach, and then move on to the new field if these don't work for some reason. It will be much harder (wrt backwards compat) to go and rewrite fields for all trips when we do the final design.

shankari commented 2 years ago

For the record, MET source for e-bikes is: https://journals.lww.com/acsm-tj/Fulltext/2021/04150/Metabolic_and_Cardiovascular_Responses_to_a.5.aspx?context=LatestArticles

Alessio, Helaine M., et al. "Metabolic and Cardiovascular Responses to a Simulated Commute on an E-Bike." Translational Journal of the American College of Sports Medicine 6.2 (2021): e000155.

shankari commented 2 years ago

Since we are NREL, we want to also support car vs. e-car. Estimates for the average e-car:

Let's go with 250 WH/mile (0.25 kWH/VMT) since most of the numbers seem to be clustered around there, and 250 is a nice round number 😄

So we will add entries for:

conversions (based on methodology from minipilot paper)

We will continue to use 1166 lb per MWH, consistent with egrid estimate for CO at the time of the CEO mini-pilot https://www.epa.gov/egrid/power-profiler#/RMPA

1,000,000 WH = 1166 lb
250 WH = (250 * 1166) / 1000000 = 0.2915 lb
125 WH = (125 * 1166) / 1000000 = 0.1458 lb

e_car_drove_alone: 0.2915 lb/mile
e_car_shared_ride: 0.1458 lb/mile

converting to SI units to be consistent with other values:

We want the values in kg/PkmT

0.2915 lb = 1 mile
0.1322 kg = 1.609 km
(0.1322 / 1.609) = 0.08216 kg/PkmT (versus 0.00728 for e-bike, so the magnitude seems right)

0.1458 lb = 1 mile
0.0661 kg = 1.609 km
(0.0661 / 1.609) = 0.04108 kg/PkmT (half of drove_alone, so the magnitude seems right)

e_car_drove_alone: 0.08216 kg/km
e_car_shared_ride: 0.04108 kg/km
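The conversion chain above as a short script, so that it can be rerun if the grid intensity or the WH/mile estimate changes. The function name is illustrative; the constants are the figures quoted above plus standard lb-to-kg and mile-to-km factors.

```python
LB_PER_MWH = 1166          # grid intensity figure quoted above
WH_PER_MWH = 1_000_000
KG_PER_LB = 0.45359237
KM_PER_MILE = 1.609344


def kg_per_pkm(wh_per_passenger_mile):
    """Convert WH per passenger mile into kg CO2 per passenger km."""
    lb_per_mile = wh_per_passenger_mile * LB_PER_MWH / WH_PER_MWH
    return lb_per_mile * KG_PER_LB / KM_PER_MILE


e_car_drove_alone = kg_per_pkm(250)   # one occupant: 250 WH/mile
e_car_shared_ride = kg_per_pkm(125)   # two occupants: half the energy each
```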