- I want to see an explanation of the sensed notebooks in the Lao case
I am observing some errors with the sensed notebook for the Lao case.
Changes:
STUDY_CONFIG=`usaid-laos-ev`
Using the `vail` dataset.
I am getting the following errors:
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/usaid-laos-ev.nrel-op.json
Successfully downloaded config with version 1 for USAID-NREL Support for Electric Vehicle Readiness and data collection URL https://USAID-laos-EV-openpath.nrel.gov/api/
Running at 2023-11-21T00:26:03.065634+00:00 with args Namespace(plot_notebook='generic_metrics_sensed.ipynb', program='default', date=None) for range (
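The `SyntaxWarning` in the log above is separate from the notebook failure, but it is easy to address. The sketch below shows the value comparison that the warning suggests. `FakeResponse` and `download_failed` are hypothetical names used only for illustration, and it assumes `r` is an HTTP response object with an integer `status_code`.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the response object `r` in generate_plots.py."""
    status_code: int


def download_failed(r):
    # Compare by value with `!=`. The `is not` operator tests object identity,
    # which only happens to work for small integers that CPython caches.
    return r.status_code != 200


print(download_failed(FakeResponse(200)))  # False
print(download_failed(FakeResponse(404)))  # True
```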
I also tried with the main repo, without my dynamic config changes, and I still get the same error.
As discussed in https://github.com/e-mission/em-public-dashboard/issues/89#issuecomment-1738332926, there are NaN values in the data, which cause this issue.
The results below come from some debugging changes to `scaffolding.py`:
Code changes:

```
participant_ct_df = all_ct[all_ct.user_id.isin(participant_list)]
if participant_ct_df.isna().values.any():
    print("The DataFrame contains missing values (NaN).")
else:
    print("The DataFrame does not contain any missing values.")
print(participant_ct_df.inferred_section_summary[participant_ct_df.inferred_section_summary.isna()])
print("After filtering, found %s participant trips " % len(participant_ct_df))
```

Results:

```
The DataFrame contains missing values (NaN).
0        NaN
1        NaN
2        NaN
3        NaN
4        NaN
        ...
57396    NaN
57397    NaN
57398    NaN
57399    NaN
57400    NaN
Name: inferred_section_summary, Length: 57401, dtype: object
After filtering, found 57407 participant trips
```
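The debugging check above boils down to counting missing values in one column. Here is a minimal, pandas-free sketch of that idea. The `trips` rows and the `is_missing` and `count_missing` helpers are hypothetical stand-ins for `participant_ct_df`, not code from `scaffolding.py`.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing, similar to what pandas' isna() reports."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_missing(rows, column):
    """Return (missing, total) for `column` across a list of dict rows."""
    missing = sum(1 for row in rows if is_missing(row.get(column)))
    return missing, len(rows)


# Hypothetical sample rows; the real trips come from the confirmed-trip dataframe.
trips = [
    {"user_id": "u1", "inferred_section_summary": float("nan")},
    {"user_id": "u1", "inferred_section_summary": None},
    {"user_id": "u2", "inferred_section_summary": {"distance": {"WALKING": 120.0}}},
]

missing, total = count_missing(trips, "inferred_section_summary")
print(f"{missing} of {total} trips have no inferred_section_summary")
```

Read at face value, the output above reports 57401 missing values out of 57407 participant trips, which would leave only 6 trips with an `inferred_section_summary`.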
@iantei looking through my prior comments to make sure that they are all resolved, I ran into this comment.
Similarly, notebook execution for sensed notebook failed since we have some NaN values in the dataset used - as justified above.
As justified where? The sensed notebooks should not be config specific since we don't use user labels in them.
The conversation is documented in the issue discussion: https://github.com/e-mission/em-public-dashboard/issues/89#issuecomment-1738332926
In my case, where I am using the `vail` dataset, I am encountering the issue that I listed above.
Overall the charts looks identical.
Not really. I still see differences in:
I see your comment around
Note: There is no mapping for "pilot_ebike" to "E-bike". Therefore, these mapping has been transformed into "Others" in "Number of Commute Trips" and "Trip count by Purpose" charts above.
But "Trip count by Purpose" does not display modes, so a difference in the e-bike label should not make any difference. And there is no "other" in the purpose charts anyway. It does cause the large "other" in "Number of trips", so I have not flagged that.
Also, "generic_metrics" as an example (note the e.g. before it). I would like to see at least a few metrics from the mode specific and energy/emission notebooks.
I still want to see all the outputs (e.g. generic metrics) one last time to make sure that there are no regressions
I executed the "Mode Specific Metrics" notebook for both STUDY_CONFIG values to compare the default mapping with the dynamic config.
Steps executed:
A. For Default Mapping (STUDY_CONFIG=stage-program):
```
(emission) root@1d89306b8d05:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/update_mappings.py mapping_dictionaries.ipynb
(emission) root@1d89306b8d05:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
label_options is unavailable for the dynamic_config in stage-program
Running at 2023-11-24T19:16:46.712768+00:00 with args Namespace(plot_notebook='mode_specific_metrics.ipynb', program='default', date=None) for range (
```
All the charts for Mode Specific Metrics look identical in both `Default Mapping` and `Dynamic Config`.
I executed the "Energy Calculations" notebook for both STUDY_CONFIG values to compare the default mapping with the dynamic config.
Steps executed:
A. For Default Mapping (STUDY_CONFIG=stage-program):
```
(emission) root@1d89306b8d05:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py energy_calculations.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/stage-program.nrel-op.json
Successfully downloaded config with version 1 for Staging environment for testing programs only and data collection URL https://openpath-stage.nrel.gov/api/
label_options is unavailable for the dynamic_config in stage-program
Running at 2023-11-24T19:24:31.229825+00:00 with args Namespace(plot_notebook='energy_calculations.ipynb', program='default', date=None) for range (
```
The computation for `Default Mapping` is based on `auxiliary_files/energy_intensity`, while the computation for `Dynamic Config` is based on https://github.com/e-mission/nrel-openpath-deploy-configs/blob/main/label_options/example-program-label-options.json.
The difference in these emissions has been documented in the issue discussion: https://github.com/e-mission/em-public-dashboard/issues/89#issuecomment-1750013681
The explanation for this is documented above. https://github.com/e-mission/em-public-dashboard/pull/91#issuecomment-1788482220
> > Overall the charts looks identical.
>
> Not really. I still see differences in:
>
> 1. Trip count by purpose
>
> I see your comment around
>
> > Note: There is no mapping for "pilot_ebike" to "E-bike". Therefore, these mapping has been transformed into "Others" in "Number of Commute Trips" and "Trip count by Purpose" charts above.
>
> But "Trip count by Purpose" does not display modes, so a difference in the e-bike label should not make any difference. And there is no "other" in the purpose charts anyway. It does cause the large "other" in "Number of trips", so I have not flagged that.

For Trip count by Purpose, the following labels and their counts are identical in both Default Mapping and Dynamic Config:

| Purpose | Number |
|--------|--------|
| Home | 4458 |
| Recreation/ Exercise | 2575 |
| Work | 4655 |
| Entertainment/ Social | 1306 |
| Shopping | 2156 |
| Personal/ Medical | 947 |
| Meal | 847 |

All these labels are available in both `Default Mapping` and `Dynamic Config`. In the `Dynamic Config`, a few more labels are introduced: `"At Work"`, `"Other"`, `"Pickup / Drop off item"`, `"Pickup / Drop off Person"`. These are equivalent to the `"Other"` label present in the `Default Mapping`.

For further clarification, here is a comparison of the two charts' labels and values:

| Default Mapping | Dynamic Config |
|--------|--------|
| ![Screenshot 2023-11-28 at 3 15 45 PM](https://github.com/e-mission/em-public-dashboard/assets/26689347/60e8e40f-c148-407d-ac50-8fce30f79116) | ![Screenshot 2023-11-28 at 3 17 07 PM](https://github.com/e-mission/em-public-dashboard/assets/26689347/ffe9de59-da6b-4764-8b79-7a10444f8fc4) |

As mentioned above, "Home", "Recreation/ Exercise", "Work" - "To Work", "Entertainment/Social", "Shopping", "Personal/Medical" and "Meal" have the same corresponding values. Let's look at the 'Other' label in the pie chart.
```
For Default:
Other : 4360(Other) + 431 (not_a_trip) + 388 (School) + Pick-up/Drop off (163) + Religious (154) + Transit Transfer (114) = 5610
For Dynamic Config:
Other: 1649 (Other) + 388 (School) + Access Recreation (211) + Religious (154) + Transit Transfer (114) = 2516.
```

For Dynamic Config, we have distinct new labels represented on the chart: `Pick-up/ Drop off Person` and `Pick-up/ Drop off Item`.

> 2. Trip count under 10 miles is comparing modes to purpose

I understand this comment refers to "Mode choice for Trips under 10 miles". There was a wrong comparison in the charts above, which I have already updated. Now, an explanation for the difference in representation between the charts.

Code snippet to show we're using `Mode_confirm`:

```
labels_d10 = expanded_ct.loc[(expanded_ct['distance_miles'] <= 10)].Mode_confirm.value_counts(dropna=True).keys().tolist()
values_d10 = expanded_ct.loc[(expanded_ct['distance_miles'] <= 10)].Mode_confirm.value_counts(dropna=True).tolist()
```

The difference in representation arises for the following reason. For `Default Mapping` we have the following mapping (as in [mode_labels.csv](https://github.com/e-mission/em-public-dashboard/blob/main/viz_scripts/auxiliary_files/mode_labels.csv)):

```
pilot_ebike,pilot_ebike,E-bike
e-bike,e-bike,E-bike
```

while with the `Dynamic Config`, we just have (as in [example_program_label_option](https://github.com/e-mission/nrel-openpath-deploy-configs/blob/main/label_options/example-program-label-options.json)):

```
{"value":"e-bike", "baseMode":"E_BIKE", "met": {"ALL": {"range": [0, -1], "mets": 4.9}}, "kgCo2PerKm": 0.00728},
...
"translations": {
  "en": {
    "walk": "Walk",
    "e-bike": "E-bike",
    ...
}
```

in the reference example config which I used to test.
[https://github.com/e-mission/nrel-openpath-deploy-configs/blob/main/label_options/example-program-label-options.json](https://github.com/e-mission/nrel-openpath-deploy-configs/blob/main/label_options/example-program-label-options.json)

Comparing labels and their corresponding values:

| Default Mapping | Dynamic Config |
|--------|--------|
| ![Default_under10miles](https://github.com/e-mission/em-public-dashboard/assets/26689347/423aa012-d7df-4e66-89a1-e5991c80ac48) | ![Dynamic_under10miles](https://github.com/e-mission/em-public-dashboard/assets/26689347/b3b3a4bf-f829-4300-a20c-21e52f639900) |

All the labels have the same values except `Other` and `E-bike`. Now, let us evaluate the sum of these labels.

For Default: `3853 (E-bike) + 180 (Other) = 4033`

For Dynamic: `25 (E-bike) + 4008 (Other) = 4033`

The sum of the `Other` and `E-bike` labels in both cases gives the total value of `4033`. This shows that the difference in representation is due to the change in label for `E-bike` in the example config used for testing, [program_label_options](https://github.com/e-mission/nrel-openpath-deploy-configs/blob/main/label_options/example-program-label-options.json). The representation in both charts is correct.

> 3. Trip miles by mode

The addition of the `Air` mode has resulted in this difference in representation. Comparison of Average (miles) for `Mode_confirm` between `Default Mapping` and `Dynamic Config`:

| Default Mapping | Dynamic Config |
|--------|--------|
| ![DefaultMapping_TripMilesByMode](https://github.com/e-mission/em-public-dashboard/assets/26689347/75850cf4-642c-4d27-b301-7a7bc250c67e) | ![DynamicConfig_TripMilesByMode](https://github.com/e-mission/em-public-dashboard/assets/26689347/fc0e9aa3-b1ca-49f0-a8df-8426e5128f5e) |

The Y-axis represents Average (miles), with `Mode_confirm` on the X-axis. As we can see from above, there is a difference in `Other` and `E-Bike`.
We didn't have an `Air` label in `Default Mapping`, which is available in the example `Dynamic Config`. Because there is no mapping translation from `pilot_ebike` to `E-Bike`, the average miles for `E-Bike` has dropped, and those trips have been added into the `Other` label. Therefore, there is a difference in the representation of these values in the chart. Re-checking the total miles in both charts: adding up the `Total (miles)` in each gives the same value, `142600.44744852`.

> 4. Average trip length

The representation includes the `Air` mode, which has a very large trip length. Therefore, the other modes are not represented as they were previously.

> Also, "generic_metrics" as an example (note the e.g. before it). I would like to see at least a few metrics from the mode specific and energy/emission notebooks.
>
> > I still want to see all the outputs (e.g. generic metrics) one last time to make sure that there are no regressions

Updated the comparison charts for the mode specific and energy emissions notebooks above.
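The "Mode choice for Trips under 10 miles" reconciliation in this comment can be re-checked with a short script. This is a minimal sketch, not the dashboard's implementation: `bucket` is a hypothetical helper, the two mappings are excerpts of the snippets quoted in this comment, and the raw counts are inferred from the chart values on the assumption that the missing `pilot_ebike` mapping is the only difference.

```python
from collections import Counter

# Display-label mappings; excerpts only, taken from the two snippets quoted in this comment.
DEFAULT_MAPPING = {"pilot_ebike": "E-bike", "e-bike": "E-bike"}
DYNAMIC_MAPPING = {"e-bike": "E-bike"}  # no entry for "pilot_ebike"


def bucket(raw_counts, mapping, fallback="Other"):
    """Sum raw label counts into display labels; unmapped labels go to `fallback`."""
    out = Counter()
    for raw_label, n in raw_counts.items():
        out[mapping.get(raw_label, fallback)] += n
    return dict(out)


# Raw counts inferred from the chart values (an assumption):
# 3853 - 25 = 3828 trips would have to carry the raw label "pilot_ebike".
# "some_unmapped_mode" is a hypothetical placeholder for the 180 remaining trips.
raw_counts = {"pilot_ebike": 3828, "e-bike": 25, "some_unmapped_mode": 180}

default_view = bucket(raw_counts, DEFAULT_MAPPING)
dynamic_view = bucket(raw_counts, DYNAMIC_MAPPING)
print(default_view)  # {'E-bike': 3853, 'Other': 180}
print(dynamic_view)  # {'Other': 4008, 'E-bike': 25}

# Re-bucketing moves trips between labels but never changes the total.
assert sum(default_view.values()) == sum(dynamic_view.values()) == 4033
```

Under that assumption, both mappings reproduce the chart values above, and the combined `E-bike` plus `Other` total stays at 4033 either way.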
I am not going to hold up the merge for this, but in the future, please test on a dataset with > 1 user and > 25 confirmed e-bike trips.
https://github.com/e-mission/em-public-dashboard/pull/91#issuecomment-1826048250 and https://github.com/e-mission/em-public-dashboard/pull/91#issuecomment-1826055439
I am not sure we would have caught all the errors earlier with such a small dataset. Please update the outputs into this PR after the merge with results from the larger dataset.
Please also note that I am squash-merging here, so please account for that while pulling post-merge.
…de_confirm mapping with dynamic labels.