e-mission / em-public-dashboard

A simple and stupid public dashboard prototype.
BSD 3-Clause "New" or "Revised" License

Integrate common base mode colors into public dashboard #143

Open louisg1337 opened 1 month ago

louisg1337 commented 1 month ago

Added support for e-mission-common in em-public-dashboard, and then used e-mission-common's base mode colors in the dashboard to unify colors across platforms.

louisg1337 commented 1 month ago

Update

After a bit of time trying to find my way around the public dashboard, I managed to:

1) Get e-mission-common installed for use by the visualization scripts in the dev environment.
2) Modify the mapping_color_labels function to map combined_mode_values and sensed_values to the colors in e-mission-common.

The way I mapped the colors was a bit weird as I ran into the following issue. In mapping_color_labels, here is an example of the type of modes we are expected to map to colors:

combined_mode_values:
['Gas Car, drove alone', 'E-car, drove alone', 'Bus', 'Train', 'Free Shuttle', 'Taxi/Uber/Lyft', 'Gas Car, with others', 'E-car, with others', 'Bikeshare', 'Scooter share', 'E-bike', 'Walk', 'Skate board', 'Regular Bike', 'Not a Trip', 'No Travel', 'Same Mode', 'Other']

sensed_values:
['WALKING', 'BICYCLING', 'IN_VEHICLE', 'AIR_OR_HSR', 'UNKNOWN', 'OTHER', 'Other']

Here are the BASE_MODES found in e-mission-common that we will be using for the mapping:

```py
BASE_MODES = {
    # BEGIN MotionTypes
    "IN_VEHICLE": {
        "icon": 'speedometer',
        "color": mode_colors['red'],
        "met": NON_ACTIVE_METS,
        # footprint not known; left undefined. later filled in by an average of:
        # CAR, BUS, LIGHT_RAIL, TRAIN, TRAM, SUBWAY
    },
    "BICYCLING": {
        "icon": 'bike',
        "color": mode_colors['green'],
        "met": BIKING_METS,
        "footprint": {},
    },
    "ON_FOOT": {
        "icon": 'walk',
        "color": mode_colors['blue'],
        "met": WALKING_METS,
        "footprint": {},
    },
    "UNKNOWN": {
        "icon": 'help',
        "color": mode_colors['grey'],
        # met and footprint not known; left undefined
    },
    "WALKING": {
        "icon": 'walk',
        "color": mode_colors['blue'],
        "met": WALKING_METS,
        "footprint": {},
    },
    "AIR_OR_HSR": {
        "icon": 'airplane',
        "color": mode_colors['orange'],
        "met": NON_ACTIVE_METS,
        "footprint": AIR_FOOTPRINT,
    },
    # END MotionTypes
    "CAR": {
        "icon": 'car',
        "color": mode_colors['red'],
        "met": NON_ACTIVE_METS,
        "footprint": CAR_FOOTPRINT,
    },
    "E_CAR": {
        "icon": 'car-electric',
        "color": mode_colors['pink'],
        "met": NON_ACTIVE_METS,
        "footprint": E_CAR_FOOTPRINT,
    },
    "PHEV_CAR": {
        "icon": 'car-electric',
        "color": mode_colors['pink'],
        "met": NON_ACTIVE_METS,
        "footprint": PHEV_CAR_FOOTPRINT,
    },
    "E_BIKE": {
        "icon": 'bicycle-electric',
        "color": mode_colors['green'],
        "met": E_BIKING_METS,
        "footprint": E_BIKE_FOOTPRINT,
    },
    "E_SCOOTER": {
        "icon": 'scooter-electric',
        "color": mode_colors['periwinkle'],
        "met": NON_ACTIVE_METS,
        "footprint": E_SCOOTER_FOOTPRINT,
    },
    "MOPED": {
        "icon": 'moped',
        "color": mode_colors['green'],
        "met": NON_ACTIVE_METS,
        "footprint": MOPED_FOOTPRINT,
    },
    "TAXI": {
        "icon": 'taxi',
        "color": mode_colors['red'],
        "met": NON_ACTIVE_METS,
        "footprint": TAXI_FOOTPRINT,
    },
    "BUS": {
        "icon": 'bus-side',
        "color": mode_colors['magenta'],
        "met": NON_ACTIVE_METS,
        "footprint": { "transit": ["MB", "RB", "CB"] },  # fixed-route bus, bus rapid transit, commuter bus
    },
    "AIR": {
        "icon": 'airplane',
        "color": mode_colors['orange'],
        "met": NON_ACTIVE_METS,
        "footprint": AIR_FOOTPRINT,
    },
    "LIGHT_RAIL": {
        "icon": 'train-car-passenger',
        "color": mode_colors['periwinkle'],
        "met": NON_ACTIVE_METS,
        "footprint": { "transit": ["LR"] }  # light rail
    },
    "TRAIN": {
        "icon": 'train-car-passenger',
        "color": mode_colors['periwinkle'],
        "met": NON_ACTIVE_METS,
        "footprint": { "transit": ["LR", "HR", "YR", "CR"] }  # light rail, heavy rail, hybrid rail, commuter rail
    },
    "TRAM": {
        "icon": 'tram',
        "color": mode_colors['periwinkle'],
        "met": NON_ACTIVE_METS,
        "footprint": { "transit": ["SR"] }  # streetcar
    },
    "SUBWAY": {
        "icon": 'subway-variant',
        "color": mode_colors['periwinkle'],
        "met": NON_ACTIVE_METS,
        "footprint": { "transit": ["HR"] }  # heavy rail
    },
    "FERRY": {
        "icon": 'ferry',
        "color": mode_colors['taupe'],
        "met": NON_ACTIVE_METS,
        "footprint": { "transit": ["FB"] }  # ferry boat
    },
    "TROLLEYBUS": {
        "icon": 'bus-side',
        "color": mode_colors['taupe'],
        "met": NON_ACTIVE_METS,
        "footprint": { "transit": ["TB", "SR"] }  # trolleybus, streetcar
    },
    "UNPROCESSED": {
        "icon": 'help',
        "color": mode_colors['grey'],
        # met not known; left undefined
        # footprint not known; left undefined
    },
    "OTHER": {
        "icon": 'pencil-circle',
        "color": mode_colors['taupe'],
        # met not known; left undefined
        # footprint not known; left undefined
    },
};
```

While there is a clear mapping between values in sensed_values and BASE_MODES, not all values in combined_mode_values map cleanly to BASE_MODES (e.g. 'Gas Car, drove alone' vs. CAR, 'E-car, drove alone' vs. E_CAR). My solution was to trim each mode in combined_mode_values to the string left of the comma ('Gas Car, drove alone' --> 'Gas Car'), which got us closer to the value we care about. I then used Python's difflib package and called get_close_matches, which scans BASE_MODES for the closest match, giving us the color we want. This is unfortunately a bit long-winded, but I wasn't sure how else to do it without hard-coding cases.
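Roughly, the matching looks like this (a simplified sketch, not the exact code I committed):

```py
import difflib

# Simplified sketch of the fuzzy matching described above; the helper name is illustrative.
def color_for_label(label, base_modes):
    trimmed = label.split(',')[0]                              # 'Gas Car, drove alone' -> 'Gas Car'
    key = trimmed.upper().replace(' ', '_').replace('-', '_')  # -> 'GAS_CAR'
    match = difflib.get_close_matches(key, list(base_modes), n=1, cutoff=0.4)
    return base_modes[match[0] if match else 'OTHER']['color']
```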

Results

Below is a plot generated using the new color mappings.

mode colors

Note: As of right now, I have only tested the new colors with labeled data, since that is what the "loading test data" tutorial provides. I tried downloading my own JSON dumps for some sensed data, but we never fixed that JSON dump issue, so I wasn't able to. That could be a good thing to work on next after this.

Tasks Remaining

1) Using my current mapping implementation, Regular Bike maps to OTHER instead of BICYCLING. I need to figure out how to fix that; I may have to hardcode it, since even Bike --> BICYCLING is a stretch.
2) Figure out how to install e-mission-common in the production version, as I am not sure the changes I made will be reflected in the production environment. I am a bit confused about the difference between the production and dev environments as a whole, especially since the production docker compose file doesn't specify the need to run the Dockerfile (viz_scripts/docker/Dockerfile) that has the pip install of e-mission-common in it, unlike the dev docker compose.

Further Public Dashboard Questions

I managed to get this fix in, but I still have some questions wrt how the dashboard works.

1) How do I generate all the plots easily? Right now I've been running each notebook individually, but I keep running into issues where example_function() doesn't exist, so I have to find example_function(), run that notebook first, and then run what I want. I feel like there is an easier option that I am missing.
2) This is more out of curiosity, but how does all of this get set up for production? Is all of the software launched using the production docker-compose.yml, like it is for development, and then hosted on a server somewhere? This whole setup is a super interesting way to tie multiple parts of e-mission together in an elegant way.

shankari commented 1 month ago

@Abby-Wheelis for visibility and review

Abby-Wheelis commented 1 month ago

How do I generate all the plots easily?

If you are using the dev dockerfile, you need to run the notebooks to generate the plots (always run the mapping notebook first). You can also use the non-dev dockerfile and then run the notebooks from the command line: there are some instructions here, or I'm happy to show you. You'll still need to run the mapping notebook first.

Figure out how to install e-mission-common into the production version

I think we already use e-mission-common in the survey notebook, would you be able to use it here in a similar way?

For the matching process: some deployments have custom labels, so can you check and make sure that your process will work with those labels too?

louisg1337 commented 3 weeks ago

I met up with @Abby-Wheelis yesterday and she helped clarify a lot of the confusion I had with the public dashboard, along with giving me some great ideas on how to handle custom labels.

Custom Labels

For the custom labels, instead of the hacky pattern matching I previously used, her idea was to use the baseMode field in the unprocessed dynamic_labels variable for the mapping. This was a great idea, as we can now be sure that no matter what custom label there is, the mapping will be correct. That change is in my latest pushed commit.
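Conceptually, the mapping now looks something like this (simplified sketch; the exact variable names in the commit may differ):

```py
# Each entry in dynamic_labels["MODE"] carries a baseMode, which indexes into
# e-mission-common's BASE_MODES to pick the color.
value_to_basemode = {mode["value"]: mode.get("baseMode", "UNKNOWN")
                     for mode in dynamic_labels["MODE"]}
colors_mode = {value: BASE_MODES[value_to_basemode.get(value, "UNKNOWN")]["color"]
               for value in combined_mode_values}
```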

I have only been able to test my code by hardcoding the dynamic_labels variable using values from nrel-openpath-deploy-configs/label_options and then manually checking that the colors were matched correctly. Once the permissions issue with the mongo dumps is cleared up, I can try it with actual data and generate some plots to see how it looks.

Default Labels

I am still not quite happy with how I map the default labels to colors, but I don't know if there is a better way to do it. If there are no custom labels, we default to the labels found in viz_scripts/auxillary_files/mode_labels.csv, which doesn't include a baseMode column. This is why I used my hacky pattern matching solution: there was no clear mapping between the labels provided and the baseModes we have. The only alternative I can think of is to add a baseMode column to mode_labels.csv, but that seems even worse. I'll keep it as is for the time being since it works, but it could be worth refactoring if we figure out a better way to do it.

Dependency Issue

In terms of utilizing e-mission-common in the dashboard, Abby filled me in on the fact that e-mission-server already installs it as a dependency, so I got rid of my code that installs it directly in the dashboard. I did notice, however, that e-mission-server is using e-mission-common@0.5.3, which does not include the base mode features, as those were released in 0.5.4. So before these changes go out, we need to bump that version in e-mission-server. I can make a PR for that, but I haven't worked on the server before, so I am not sure how to test it to make sure the newer release doesn't break anything.

louisg1337 commented 3 weeks ago

I was finally able to test the new colors using some better mongo dumps. Below is how all of the plots in generic_metrics.ipynb look using datasets from the Laos and Ride 2 Own studies. I chose those two datasets so we can check that both the dynamic and default labels appear correctly. Let me know what you think about these, @Abby-Wheelis, and whether you'd like to see any others!

Laos Data (Dynamic Labels)


distribution_of_modes

distribution_modes_commute

mode_choice_trips_under_80_pct

total_trip_length_by_each_mode

total_trip_length_each_land_mode

Ride 2 Own (Default Labels)


r2o_1

r2o_2

r20_3

r2o_4

r2o_5

Abby-Wheelis commented 3 weeks ago

@louisg1337 The labels look like they're working, so that's good!

Is there anything we can do to make the variations on the same base mode different from one another? In the in-app dashboard I think something like "gas car with others" would show up as a lighter/darker shade of the same red as "gas car drove alone"

iantei commented 3 weeks ago

@louisg1337

The only alternative I can think of is to add a baseMode column to mode_labels.csv, but that seems even worse. I'll keep it as is for the time being since it works, but it could be worth refactoring if we figure out a better way to do it.

I had a discussion with @JGreenlee yesterday regarding the default labels not having label_options and their dependency on mode_labels.csv. Would it be a good idea to host a default label_options in the e-mission-common repo itself? With that, we could make use of richModes/baseModes without having to rely on either transforming mode_labels.csv or on the current approach, which you've suggested might need some refactoring.

I did notice, however, that e-mission-server is using e-mission-common@0.5.3, which does not include the base mode features, as those were released in 0.5.4.

We could make the following change locally, in the environment36.dashboard.additions.yml file in the Public Dashboard, to make use of the latest changes in the e-mission-common repo.

```yaml
dependencies:
- pip:
  ...
  - git+https://github.com/JGreenlee/e-mission-common@master
```

Then rebuild the Docker image with `docker-compose -f docker-compose.dev.yml build`.

JGreenlee commented 3 weeks ago

I just added https://github.com/JGreenlee/e-mission-common/blob/master/src/emcommon/resources/label-options.default.json to e-mission-common, which should help with this

The .csv files will be phased out.

JGreenlee commented 3 weeks ago

Is there anything we can do to make the variations on the same base mode different from one another? In the in-app dashboard I think something like "gas car with others" would show up as a lighter/darker shade of the same red as "gas car drove alone"

Yes, we made dedupColors for this purpose on e-mission-phone https://github.com/e-mission/e-mission-phone/blob/90595e9d2a5cdb4acbd47e8e4f224610f56c832d/www/js/components/charting.ts#L165

Ideally we want this to be moved to e-mission-common so both dashboards use the same dedup method. But the current dedupColors depends on the JS color library; we need a Python equivalent

JGreenlee commented 3 weeks ago

It seems like Python's built-in colorsys will be good enough for this (https://stackoverflow.com/a/60562502/5110347), so we won't need an extra dependency.
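e.g. something like this (just a sketch) covers the lighten/darken we need:

```py
import colorsys

# lighten an (r, g, b) color with components in [0, 1] by scaling its HLS lightness
h, l, s = colorsys.rgb_to_hls(0.86, 0.08, 0.24)             # some red
lighter_red = colorsys.hls_to_rgb(h, min(1.0, l * 1.3), s)  # a lighter shade of the same hue
```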

louisg1337 commented 3 weeks ago

I had a discussion with @JGreenlee yesterday regarding the default labels not having label_options and their dependency on mode_labels.csv. Would it be a good idea to host a default label_options in the e-mission-common repo itself? With that, we could make use of richModes/baseModes without having to rely on either transforming mode_labels.csv or on the current approach, which you've suggested might need some refactoring.

Yes, this helps tremendously, thanks for suggesting this change! It definitely streamlines the code and makes it way more consistent. Thank you also @JGreenlee for implementing it; that'll be the next thing I try to work in.

We could make the following change locally, in the environment36.dashboard.additions.yml file in the Public Dashboard, to make use of the latest changes in the e-mission-common repo.

Perfect, that sounds like a great way to deal with that issue!

Yes, we made dedupColors for this purpose on e-mission-phone https://github.com/e-mission/e-mission-phone/blob/90595e9d2a5cdb4acbd47e8e4f224610f56c832d/www/js/components/charting.ts#L165

I was actually working on that this morning; I just pushed a working version that lightens each repeated occurrence of the same color. Here is one plot that shows how it looks.

new_colors

I definitely agree that this should be included in e-mission-common; maybe we can use the code I just pushed as a baseline to work off of, since I also used the colorsys package. My implementation will need to be refactored slightly, though: dedupColors supports an unlimited number of occurrences of the same color, while mine only supports 5 shades lighter.

JGreenlee commented 3 weeks ago

Looks great! We can unify the dedupe methods at a later time

louisg1337 commented 3 weeks ago

My implementation will need to be refactored slightly, though: dedupColors supports an unlimited number of occurrences of the same color, while mine only supports 5 shades lighter.

Until we unify this, I figured I'd try to fix the above issue. I took a slightly different approach from dedupColors: instead of evenly spacing out an adjustment per color, I based the adjustment on an inverse exponential decay function (roughly as sketched below). I decided to do it this way for a few reasons.

1) It was just an easier change to make than refactoring everything, which I would have had to do if I took the same approach as dedupColors.
2) The function can handle an unlimited number of occurrences, as it always stays within the maximum adjustment value set.

As stated above, this will be refactored, so none of this is set in stone; it's just a placeholder for the time being.
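For reference, the adjustment is roughly this shape (illustrative sketch, not the exact committed code; it assumes (r, g, b) colors in [0, 1]):

```py
import math
import colorsys

MAX_ADJUSTMENT = 2.0  # assumed cap on the lightness multiplier

def lighten_occurrence(rgb, k):
    """Lighten rgb more for each repeated occurrence k (k = 0 leaves it unchanged)."""
    # inverse exponential decay: approaches, but never exceeds, MAX_ADJUSTMENT
    factor = 1 + (MAX_ADJUSTMENT - 1) * (1 - math.exp(-0.6 * k))
    h, l, s = colorsys.rgb_to_hls(*rgb)
    return colorsys.hls_to_rgb(h, min(1.0, l * factor), s)
```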


We could make the following change locally, in the environment36.dashboard.additions.yml file in the Public Dashboard, to make use of the latest changes in the e-mission-common repo.

I just implemented this in the latest push, and I've run into an issue. When I try to run the Python notebooks, I get this error:

---> 73 async def get_egrid_region(coords: list[float, float], year: int) -> str | None:
     74     """
     75     Get the eGRID region at the given coordinates in the year.
     76     """
     77     if year < 2018:

TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'

The error is apparently because the str | None syntax is only available in Python 3.10+ (StackOverflow). Unfortunately, we are using Python 3.9 in this Docker environment. I'm not sure if the solution is to replace the syntax in e-mission-common or to bump up the version of Python. If it is the latter, I am also not quite sure how to do that. My impression is that Python 3.9 gets installed via the e-mission-server Docker image, and if that is the case, that's a whole other can of worms to deal with when trying to bump up the version there. Would love to hear what you all think!

JGreenlee commented 3 weeks ago

I think it just needs from __future__ import annotations

I should configure the test environment in e-mission-common to always use Python 3.9 so we can catch these kinds of issues sooner

JGreenlee commented 3 weeks ago

if from __future__ import annotations doesn't work, feel free to remove the type hints – they are nice to have but definitely not critical
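For reference, a Python 3.9-compatible spelling of that hint, if we wanted to keep it (just a sketch; signature adapted from the traceback above):

```py
from typing import Optional

async def get_egrid_region(coords: list, year: int) -> Optional[str]:
    """Get the eGRID region at the given coordinates in the year."""
    ...
```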

louisg1337 commented 2 weeks ago

Thanks for the help with figuring that all out, @JGreenlee! It turns out from __future__ import annotations doesn't work, which is unfortunate. I created a PR here that removes the str | None type hints from e-mission-common, which ended up fixing the issue.

Here are the plots generated with the new coloring logic. Let me know what you think and whether the differences between duplicate colors are noticeable enough. If not, I can refactor it a bit to get more contrast between shades.


ntrips_commute_mode_confirm_default

ntrips_total_default

ntrips_under80_default

total_trip_length_default

total_trip_length_land_default

JGreenlee commented 2 weeks ago

In this new version, they look a bit too similar to me. I vaguely remember going through this same process on the phone, which is what led me to design dedupColors the way I did

dedupColors makes the colors as far apart as it can with respect to the number of duplicates there are. With 2-3 duplicates the colors are easily distinguishable; with 4-5+ duplicates they become harder to distinguish (but still distinct)

If you implement dedupe_colors in Python you should be able to use it on the dashboard. Something like:

colors_mode = dedupe_colors([
    [mode, translations_to_basemodes.get(mode, "UNKNOWN")['color']]
    for mode in combined_mode_values
])
louisg1337 commented 2 weeks ago

That's a great point, @JGreenlee. I definitely agree the colors are a bit too similar, so I took your advice and implemented dedupe_colors in e-mission-common; the PR for that can be found here. I'm not sure how we are going to convert the Python code to JavaScript, since we are using the colorsys Python library, so I kept it as a draft until we figure that out. I attached some plots below to showcase how the new colors look; let me know what you think. If need be, I can always adjust the lightness variance so we get more drastic differences.

Normal Plot

download

Three Reds + Two Blues At Once

three_red

Four Greens At Once

four_greens

Four Pinks At Once

four_pinks

iantei commented 2 weeks ago

I just added https://github.com/JGreenlee/e-mission-common/blob/master/src/emcommon/resources/label-options.default.json to e-mission-common, which should help with this

The .csv files will be phased out.

The above label-options.default.json doesn't have a "translations" object. We need the translations to map from MODE.value to mode_clean.

For example: MODE.value = e_car_drove_alone, which has baseMode = "E_CAR", maps to mode_clean = "E-car, drove alone".

Also, I think it would be a good idea to match all the values in mode_labels.csv, like train_and pilot e-bike, which translates to "Train".

JGreenlee commented 2 weeks ago

On the phone, when translations are not included in the label options, they are taken from the language JSON file: https://github.com/e-mission/e-mission-phone/blob/adb7cf235ded412119581a1ef8bf656d0ec93855/www/i18n/en.json#L160 This is the text that gets shown to the user.

I don't think this lines up with mode_clean, though

JGreenlee commented 2 weeks ago

I don't understand where the mode_clean values come from. A mode like train_and pilot e-bike appears to have been a custom label. So did someone manually map it to "Train" at some point?

I had thought that the public dashboard just lumped custom modes into "Other"

Abby-Wheelis commented 2 weeks ago

I realize I'm jumping in mid-process here, so please feel free to correct me if I've misinterpreted!

So did someone manually map it to "Train" at some point?

Yes, I think that is the case. I remember these being useful when I was processing the CanBikeCO data, so I'm guessing they date from the early stages of the dashboard. All other custom labels get lumped into "Other", though, so I think we're safe to do that with all user-provided labels.

As for where they come from - that depends if we're talking about "default" or "custom" labels.

Default labels currently get read from csvs in mapping_dictionaries.ipynb, which are then used as parameters to the functions in scaffolding.py that process the data (i.e. expanded_ct['Mode_confirm'] = expanded_ct['mode_confirm'].map(dic_re)). It sounds like this is what we're looking to phase out in favor of a single source of default labels/label translations, which makes a lot of sense to me.
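For concreteness, dic_re is shaped roughly like this (illustrative entries drawn from this thread, not the full mode_labels.csv):

```py
import pandas as pd

dic_re = {
    "drove_alone": "Gas Car, drove alone",
    "e_car_drove_alone": "E-car, drove alone",
}
# tiny example frame; in the notebooks, expanded_ct comes from the scaffolding pipeline
expanded_ct = pd.DataFrame({"mode_confirm": ["drove_alone", "e_car_drove_alone"]})
expanded_ct['Mode_confirm'] = expanded_ct['mode_confirm'].map(dic_re)
```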

If there are custom labels for the deployment, we pass those in as dynamic_labels and use them instead of the mapping dictionaries. If we wanted to centralize the labels and get rid of our csvs, I think that would not be too hard to do; we could refactor mapping_dictionaries.py.

The above label_options.default.json doesn't have "translations" object. We need the translations for mapping from MODE.value to mode_clean.

I see what you mean: for dynamic labels we expect the translation objects, and for the default labels they're hardcoded into the csvs.

On the phone, when translations are not included in the label options, they are taken from the language JSON file

Can you remind me where this comes from? If it comes from e-mission-translate, then that's not something we currently use on the dashboard, but we could probably build it in; we just don't offer any other parts of the dashboard in more than one language. Or maybe we could use the default files from nrel-openpath-deploy-configs when a custom file is not specified? Or is the default going to be stored in em-common and not nrel-openpath-deploy-configs?

JGreenlee commented 2 weeks ago

Can you remind me where this comes from?

For English, it comes from e-mission-phone, www/i18n/en.json, "multilabel". https://github.com/e-mission/e-mission-phone/blob/adb7cf235ded412119581a1ef8bf656d0ec93855/www/i18n/en.json#L160

The other languages are in e-mission-translate.

I think we have a few options:

  1. Have em-public-dashboard retrieve en.json from e-mission-phone
  2. Extract the English "multilabel" translations to a new file (multilabel_en.json?) and put it in e-mission-common. The other languages would stay in e-mission-translate
  3. Instead of having "multilabel" in en.json, es.json etc, always include translations in the label-options json
  4. Move en.json to e-mission-translate and have the public dashboard retrieve en.json from there
JGreenlee commented 2 weeks ago

As for where they come from - that depends if we're talking about "default" or "custom" labels.

Can we get rid of the concept of "default" labels? I think there should be 2 kinds of labels: 1) labels specified by the deployment, and 2) labels given by the user

Abby-Wheelis commented 2 weeks ago

Can we get rid of the concept of "default" labels? I think there should be 2 kinds of labels: 1) labels specified by the deployment, and 2) labels given by the user

I agree that would make the most sense, but unless we update all of the configs, we're left with 1) deployments with their own file in nrel-openpath-deploy-configs/label_options and 2) those who specify Multilabel but not their own labels (so we use our defaults).

Instead of having "multilabel" in en.json, es.json etc, always include translations in the label-options json

Could we use example-study-label-options.json and example-program-label-options.json as the files if the deployer does not specify their own? This would definitely streamline the process on the public dashboard end. These also currently include translations; does that mean we have translations in multiple places for the same set of labels?

louisg1337 commented 2 weeks ago

Can we get rid of the concept of "default" labels? I think there should be 2 kinds of labels: 1) labels specified by the deployment, and 2) labels given by the user

This would definitely be ideal and is probably how we should do it, but Abby is right that we may still need "default" labels for any past study that never specified any.

Could we use example-study-label-options.json and example-program-label-options.json as the files if the deployer does not specify their own?

I think this would be a great idea, as having everything together in one file makes things 1) a lot easier to work with and 2) a lot easier for any future developer to understand, instead of having to track down multiple files. It will also follow the same format that we use for specifying dynamic labels, which would keep things consistent across the board.

JGreenlee commented 2 weeks ago

These also currently include translations, does that mean we have translations in multiple places for the same set of labels?

Yes, and avoiding this repetition was the rationale for supporting translations from en.json, es.json, etc., rather than always putting all translations in the label-options JSON. We wanted to centralize all the translations as much as possible; this way, future deployers only have to provide translations for modes that we have not already supported in the past.

Could we use example-study-label-options.json and example-program-label-options.json as the files if the deployer does not specify their own?

Yes, this seems functionally the same as adding the translations to label-options.default.json. If we do this, there is no reason to still have them in en.json, es.json, etc.

I think the 'example' label options are intended as more of a resource for deployers to reference when creating their own label options, rather than to be used as an actual default.

But maybe we should just decide between "default" and "example" and use one for both.

JGreenlee commented 2 weeks ago

How does this approach sound?

iantei commented 2 weeks ago
  • Add "translations" to label-options.default.json in e-mission-common, taking them from en.json, es.json, and lo.json

Is there any reason why we would want to host label-options.default.json in e-mission-common instead of nrel-openpath-deploy-configs's label_options? We already have other label_options files in nrel-openpath-deploy-configs. It would likely be a good idea to store all label_options, even the default one, in a single place.

louisg1337 commented 2 weeks ago

In the meantime, I've created a PR for e-mission-common that adds the translations to the label options. I just wanted to get this in so I can start working on the implementation in case this is the approach we go for.

JGreenlee commented 2 weeks ago

Is there any reason why we would want to host the label-options.default.json in e-mission-common instead of nrel-openpath-deploy-configs's label_options?

The only reason I can think of is that resources from nrel-openpath-deploy-configs get downloaded at runtime, while resources from e-mission-common can be included as a static dependency. Keeping it in e-mission-common avoids an extra network call.

iantei commented 2 weeks ago

The only reason I can think of is that resources from nrel-openpath-deploy-configs get downloaded at runtime, while resources from e-mission-common can be included as a static dependency. Keeping it in e-mission-common avoids an extra network call.

I see. We already make a network call to download label_options whenever a program has custom labels. So it's a tradeoff between keeping all the label_options in a single place vs. an extra network call.

louisg1337 commented 2 weeks ago

Default label options issues

I've integrated label-options.default.json into the code, and unfortunately I am now running into an issue. When I try to generate the new plots, I get this error:

error

I found out this is because some of the translation values in label-options.default.json don't line up with the values used to store the data, so when matplotlib tries to map keys to colors, it throws errors. The differences are subtle too; here are some of them.

| emcommon labels | current labels |
| ---- | ---- |
| Gas Car Shared Ride | Gas Car, with others |
| Gas Car Drove Alone | Gas Car, drove alone |
| Taxi / Uber / Lyft | Taxi/Uber/Lyft |
| ... | ... |

There are a few more instances of this, especially with the E-car labels. Would it be alright if we changed the translation values in label-options.default.json to make them consistent with what was there before, @JGreenlee?


Dedupe colors issues

With the new dedupe_colors() in e-mission-common, the colors are appearing too dark. Here is what one of the plots looks like.

color_error

Gas Car, drove alone and E-car, drove alone are both way too dark. I think that max_adjustment needs to be tweaked a bit so that the range goes from [0.4, 1.6] to something like [0.8, 2]. Going below 1 makes the colors dark pretty fast, so I don't know if we should venture that far below 1. And 2 was the max I used in my previous implementation, where it seemed to be a sweet spot: the color was light, but not so light that it turned white. Let me know what you think about both of these, @JGreenlee.

shankari commented 2 weeks ago

I found out this is because some of the translation values in label-options.default.json don't line up with the values used to store the data, so when matplotlib tries to map keys to colors, it throws errors. The differences are subtle too; here are some of them.

The fundamental error here is that the colors are mapped to the display modes. They should be mapped from the internal mode (e.g. drove_alone, not Gas Car, drove alone). The base mode is the internal representation and is the one that should be used for all calculations. It was a bad design to do the mapping from the display mode in the first place.

What if we want to support multiple languages in the dashboard as well?

JGreenlee commented 2 weeks ago

Both of these issues are just growing pains from trying to unify code between phone & dashboard.

The label options need to be what they are in label-options.default.json. These translations will be used on the phone for the text that gets displayed to the user, so we can't just change them to match the previous implementation of the public dashboard.

The dedupe_colors issues are because the charts on the phone don't have any text on the bars, while the dashboard does have (black) text on the bars. So the dashboard cannot afford to darken colors without compromising legibility, but the phone can.

I think we can get around this pretty easily by having max_adjustment be a parameter to dedupe_colors. This way the phone can keep using [0.4, 1.6], while the public dashboard can use something like [1, 2] or whatever you determine to be a good tradeoff of distinguishability between colors and legibility of text. You can make it an optional parameter with a default value of [1, 2] if you'd like the default behavior to be only "lightening" and not "darkening"
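Roughly like this (just a sketch assuming (r, g, b) colors in [0, 1]; the real mode_colors are hex strings, so a conversion step would be needed, and the final implementation may differ):

```py
import colorsys

def dedupe_colors(label_color_pairs, adjustment_range=(1.0, 2.0)):
    """Spread duplicates of the same base color across the given lightness range."""
    lo, hi = adjustment_range
    counts = {}
    for _, color in label_color_pairs:                 # count duplicates of each base color
        counts[color] = counts.get(color, 0) + 1
    seen, result = {}, {}
    for label, color in label_color_pairs:
        i = seen.get(color, 0)
        seen[color] = i + 1
        n = counts[color]
        # spread the n duplicates evenly across [lo, hi]; a unique color stays unchanged
        factor = lo + (hi - lo) * i / (n - 1) if n > 1 else 1.0
        h, l, s = colorsys.rgb_to_hls(*color)
        result[label] = colorsys.hls_to_rgb(h, min(1.0, l * factor), s)
    return result
```

Then the phone could keep calling it with adjustment_range=(0.4, 1.6) while the dashboard uses (1, 2).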

louisg1337 commented 2 weeks ago

The fundamental error here is that the colors are mapped to the display modes. They should be mapped from the internal mode (e.g. drove_alone, not Gas Car, drove alone). The base mode is the internal representation and is the one that should be used for all calculations. It was a bad design to do the mapping from the display mode in the first place.

This makes sense and that's a great point: mapping to the internal representation would be better, especially for supporting future translations. I guess where I am still confused is that if the plots show display modes as the labels, don't we need a display mode to color mapping for the colors to get applied correctly? The error I have been getting is that certain keys don't exist in the color mapping because they are slightly different in label-options.default.json, so I assumed that was how it needed to be done.

Is there a different way to map the colors to the labels in matplotlib that I've been overlooking? Maybe something like: the label values are stored as (drove_alone, "Gas car, drove alone"), and we map colors to the internal representation while the display mode still gets displayed? I'm just a bit confused as to what the process for that would look like, so any help would be appreciated, @shankari.

I think we can get around this pretty easily by having max_adjustment be a parameter to dedupe_colors. This way the phone can keep using [0.4, 1.6], while the public dashboard can use something like [1, 2] or whatever you determine to be a good tradeoff of distinguishability between colors and legibility of text.

That's a great idea; I like being able to customize it as needed. I just put out a PR with those changes. I added an additional parameter called min_adjustment, because using only max_adjustment made the colors either way too dark if the number was too low or way too white if it was too high. Since there was no happy medium, I added that so we can keep the min and max independent. I also made it so that the functionality remains the same if nothing is passed in. Let me know if that was what you had in mind, @JGreenlee.

JGreenlee commented 2 weeks ago

Looks good; I tweaked it a little bit, updated the tests to reflect the new parameter, and merged just now

louisg1337 commented 2 weeks ago

For the time being, I've pushed out a working version of the color mapping that does not use label-options.default.json. It seems like if we want to use it, it will require a lot more work refactoring the way we store and pass around the data displayed on the plots. That effort will probably deserve its own branch, as it doesn't seem like a quick fix. I'll continue exploring ways to move away from the display representation and towards the internal representation, but for the time being we have a working version using the old mode_labels.csv file.

shankari commented 2 weeks ago

I guess where I am still confused is that if the plots show display modes as the labels, don't we need a display mode to color mapping for the colors to get applied correctly? The error I have been getting is that certain keys don't exist in the color mapping because they are slightly different in label-options.default.json, so I assumed that was how it needed to be done.

Here's what I would suggest at a high level. To get deeper into this, I'd have to go and dig into the actual code.

Right now, my recollection is that the codebase works like this:

iantei commented 2 weeks ago
In scaffolding.py, we have mapping_labels(), which gives a mapping from base mode to display mode, i.e.:

| base mode | display mode |
| ---- | ---- |
| drove_alone | Gas Car, drove alone |
| e_car_drove_alone | E-car, drove alone |

Currently, we are using Dynamic labels:

mode_values = list(mapping_labels(dynamic_labels, "MODE").values()) if "MODE" in dynamic_labels else []
combined_mode_values = mode_values + replaced_mode_values + ['other']

Default mapping:

combined_mode_values = (list(OrderedDict.fromkeys(dic_re.values())) + ['Other'])
colors_mode = dict(zip(combined_mode_values, plt.cm.tab20.colors[:len(combined_mode_values)]))

If we use mapping_labels(dynamic_labels, "MODE").keys() or dic_re.keys() instead of .values(), this would help us map between the base mode and the color palette.

louisg1337 commented 2 weeks ago

Thank you @shankari and @iantei for the help, I appreciate it a lot! I managed to refactor plot_and_text_stacked_bar_chart so that it now generates plots using the internal representations, and then we remap the labels to show the display representations. I've changed all of the plots in generic_metrics.ipynb to work with the refactored function. The changes are simply swapping "Mode_confirm" to mode_confirm and then passing values_to_translations as a parameter to plot_and_text_stacked_bar_chart when necessary. Here is what the plots look like:

new

new2

Further Tasks

Now that the function has been refactored and we are using label-options.default.json, all that's left is:

  1. Testing to make sure this new implementation works with dynamic_labels.
  2. Testing other plot generating notebooks to see what needs to be refactored so this new color implementation works. The changes I made in generic_metrics.ipynb are a good example of what needs to be done to any notebook that has errors.
shankari commented 2 weeks ago

wrt https://github.com/e-mission/em-public-dashboard/pull/143#issuecomment-2322009427, as we have discussed earlier, the mapping (drove_alone -> Gas car, drove alone) shouldn't have two copies, one in the public dashboard and one in the UI. There should be one standardized mapping.

I agree with Jack that there should really be: (i) platform specified modes and (ii) user specified modes. I am willing to have: (i) platform specified default modes, (ii) platform specified custom modes, and (iii) user specified modes.

But I don't think we should have two copies of the platform specified default modes.

Abby-Wheelis commented 2 weeks ago

wrt https://github.com/e-mission/em-public-dashboard/pull/143#issuecomment-2322009427, as we have discussed earlier, the mapping (drove_alone -> Gas car, drove alone) shouldn't have two copies, one in the public dashboard and one in the UI. There should be one standardized mapping.

I agree, and I think it would be fairly easy to refactor the way mapping_dictionaries works to pull labels from elsewhere and remove the csvs. Does that change need to be a part of this PR? If it makes sense to do it separately I would be happy to take that on as a priority next week.

shankari commented 2 weeks ago

Happy to defer it until later. Ideally, at the end of this quarter, we should not have any csvs in this repo.

iantei commented 1 week ago

I cloned the above repo for further testing, and found the below cases where code changes need to be made:

Changes required:

Testing other plot generating notebooks to see what needs to be refactored so this new color implementation works. The changes I made in generic_metrics.ipynb are a good example of what needs to be done to any notebook that has errors.

  1. Changes need to be incorporated into the mode_specific_metrics notebook:

     colors_mode, colors_purpose, colors_sensed = scaffolding.mapping_color_labels(dynamic_labels, dic_re, dic_pur)

     TypeError: cannot unpack non-iterable coroutine object

  2. While testing with the `cortezebikes` dataset, I encountered an issue while generating the purpose-related trip charts in the `generic_metrics` notebook:

     image
iantei commented 1 week ago

I tested the above scenario 2 with the changes for integrating Inferred Labels in my other PR, using the same dataset.

image

The generation of the Number of trips for each purpose chart seems fine. I need to review the code changes that impacted this.

iantei commented 1 week ago

The issue happens when it tries to add the Recreation/Exercise bar to the stacked bar chart. It's able to plot Other and Shopping, but throws the error when it tries to plot Recreation/Exercise.

image
iantei commented 1 week ago

Ah, the issue is happening because we use the display label Recreation/Exercise in expanded_ct.Trip_purpose, while it is saved as exercise: Recreation / Exercise in label-options.default.json.

iantei commented 1 week ago

Testing with dynamic labels:

I used cortez-ebikes as the dataset, while setting STUDY_CONFIG=usaid-laos-ev and passing dynamic_labels from usaid-laos-ev-label-options.

Got the following error:

KeyError                                  Traceback (most recent call last)
Cell In[4], line 1
----> 1 colors_mode, colors_purpose, colors_sensed, values_to_translations = await scaffolding.mapping_color_labels(dynamic_labels, dic_re, dic_pur)

File /usr/src/app/saved-notebooks/scaffolding.py:244, in mapping_color_labels(dynamic_labels, dic_re, dic_pur, language)
    241 values_to_translations = {mode["value"]: labels["translations"][language][mode["value"]] for mode in labels["MODE"]}
    243 # Assign colors to mode, purpose, and sensed values
--> 244 colors_mode = dedupe_colors([
    245     [mode, BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    246     for mode in combined_mode_values
    247 ], adjustment_range=[1,1.8])
    248 colors_purpose = dict(zip(purpose_values, plt.cm.tab20.colors[:len(purpose_values)]))
    249 colors_sensed = dict(zip(sensed_values, [BASE_MODES[x.upper()]['color'] for x in sensed_values]))

File /usr/src/app/saved-notebooks/scaffolding.py:245, in <listcomp>(.0)
    241 values_to_translations = {mode["value"]: labels["translations"][language][mode["value"]] for mode in labels["MODE"]}
    243 # Assign colors to mode, purpose, and sensed values
    244 colors_mode = dedupe_colors([
--> 245     [mode, BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    246     for mode in combined_mode_values
    247 ], adjustment_range=[1,1.8])
    248 colors_purpose = dict(zip(purpose_values, plt.cm.tab20.colors[:len(purpose_values)]))
    249 colors_sensed = dict(zip(sensed_values, [BASE_MODES[x.upper()]['color'] for x in sensed_values]))

KeyError: 'baseMode'
iantei commented 1 week ago

Issue: value_to_basemode = {mode["value"]: mode.get("base_mode", "baseMode") for mode in labels["MODE"]}

Whenever a mode in labels["MODE"] doesn't have base_mode as a key, this falls back to the literal string "baseMode" instead of looking up the value of the baseMode key.

Solution: value_to_basemode = {mode["value"]: mode.get("base_mode", mode.get("baseMode", "baseMode")) for mode in labels["MODE"]}

This solves the issue with dynamic_labels.