louisg1337 opened 1 month ago
After a bit of time trying to find my way around the public dashboard, I managed to...
1) Get `e-mission-common` installed for use by the visualization scripts on the dev environment.
2) Modify the `mapping_color_labels` function to map `combined_mode_values` and `sensed_values` to the colors in `e-mission-common`.
The way I mapped the colors was a bit weird, as I ran into the following issue. In `mapping_color_labels`, here is an example of the type of modes we are expected to map to colors:
```
combined_mode_values:
['Gas Car, drove alone', 'E-car, drove alone', 'Bus', 'Train', 'Free Shuttle', 'Taxi/Uber/Lyft', 'Gas Car, with others', 'E-car, with others', 'Bikeshare', 'Scooter share', 'E-bike', 'Walk', 'Skate board', 'Regular Bike', 'Not a Trip', 'No Travel', 'Same Mode', 'Other']

sensed_values:
['WALKING', 'BICYCLING', 'IN_VEHICLE', 'AIR_OR_HSR', 'UNKNOWN', 'OTHER', 'Other']
```
Here are the `BASE_MODES` found in `e-mission-common` that we will be using for the mapping:
While there is a clear mapping between values in `sensed_values` and `BASE_MODES`, not all values in `combined_mode_values` have a clear mapping to `BASE_MODES` (e.g. `Gas Car, drove alone` vs `CAR`, `E-car, drove alone` vs `E_CAR`). My solution was to trim the modes in `combined_mode_values` to keep only the string on the left side of the comma (`Gas Car, drove alone` --> `Gas Car`), which got us closer to the value we cared about. I then used the `difflib` package from Python and called `get_close_matches`, which scans `BASE_MODES` to find the closest match, giving us the color we want. This was unfortunately a bit of a long-winded way to do it, but I wasn't too sure how else I could do it without hard-coding cases.
Below is a plot generated using the new color mappings.
Note: As of right now I have only tested the new colors with labeled data, as that is what the "loading test data" tutorial provides. I tried downloading my own JSON dumps for some sensed data, but we never fixed that JSON dump issue, so I wasn't able to. That could be a good thing to work on next after this.
1) Using my current mapping implementation, `Regular Bike` maps to `OTHER` instead of `BICYCLING`. I'm going to need to figure out how to fix that; maybe I have to hardcode it, because even `Bike` --> `BICYCLING` is a stretch.
2) Figure out how to install `e-mission-common` into the production version, as I am not sure the changes I made will be reflected in the production environment. I am a bit confused about the difference between the production and dev environments as a whole, especially since the production docker compose file doesn't specify the need to run the `Dockerfile` (viz_scripts/docker/Dockerfile) that has the `pip install e-mission-common` in it, unlike the dev docker compose.
I managed to get this fix in, but I still have some questions wrt how the dashboard works.
1) How do I generate all the plots easily? Right now I've been running each notebook individually, but I keep running into issues where `example_function()` doesn't exist, so I have to find `example_function()`, run that notebook first, and then run what I want. I just feel like there is an easier option that I am missing.
2) This is more out of curiosity, but how does all of this get set up for production? Is all of the software launched using the production `docker-compose.yml`, like it is for development, and then hosted on a server somewhere? This whole setup is a super interesting way to tie multiple parts of e-mission together in an elegant way.
@Abby-Wheelis for visibility and review
How do I generate all the plots easily?
If you are using the dev dockerfile, you need to run the notebooks in the notebook server to generate the plots (always run the mapping notebook first). You can also use the non-dev dockerfile and then run the notebooks from the command line: there are some instructions here, or I'm happy to show you. You'll still need to run the mapping notebook first.
Figure out how to install e-mission-common into the production version
I think we already use `e-mission-common` in the survey notebook; would you be able to use it here in a similar way?
For the matching process, some deployments have custom labels; can you check and make sure that your process will work with those labels too?
I met up with @Abby-Wheelis yesterday and she helped clarify a lot of the confusion I had with the public dashboard, along with giving me some great ideas on how to handle custom labels.
For the custom labels, instead of the hacky pattern-matching approach I previously used, her idea was to use the `baseMode` field in the unprocessed `dynamic_labels` variable for the mapping. This was a great idea, as we can now be sure that no matter what custom label there is, the mapping will be correct. That change can be found in my latest pushed commit.
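A minimal sketch of what that looks like, assuming each entry in `dynamic_labels["MODE"]` carries a `baseMode` key (the data and variable names here are illustrative, not the exact scaffolding code):

```python
# Made-up stand-ins for the real e-mission-common / config data
BASE_MODES = {"CAR": {"color": "#d32f2f"}, "E_CAR": {"color": "#f57c00"},
              "UNKNOWN": {"color": "#9e9e9e"}}
dynamic_labels = {"MODE": [
    {"value": "drove_alone", "baseMode": "CAR"},
    {"value": "e_car_drove_alone", "baseMode": "E_CAR"},
]}

# Each custom label points directly at its base mode, so no fuzzy matching
value_to_basemode = {opt["value"]: opt["baseMode"] for opt in dynamic_labels["MODE"]}
colors_mode = {value: BASE_MODES[value_to_basemode.get(value, "UNKNOWN")]["color"]
               for value in value_to_basemode}
print(colors_mode)
```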
I have only been able to test my code by hardcoding the `dynamic_labels` variable using values from nrel-openpath-deploy-configs/label_options, and then manually checking the mapping to ensure the colors were matched correctly. Once we get the permissions issue cleared up with the mongo dumps, I can try it with actual data and generate some plots to see how it looks.
I am still not quite happy with how I handle mapping the default labels to colors, but I don't know if there is a better way to do it. If there are no custom labels, we default to the labels found in viz_scripts/auxillary_files/mode_labels.csv, which doesn't include a `baseMode` column. This is why I did my hacky pattern-matching solution, as there was no clear mapping between the labels provided and the `baseModes` we have. The only alternative I can think of is to add a `baseModes` column to `mode_labels.csv`, but that seems even worse. I'll keep it how it is for the time being as it works, but it could be worth refactoring if we figure out a better way to do it.
In terms of utilizing `e-mission-common` in the dashboard, Abby filled me in on the fact that `e-mission-server` actually installs it as a dependency, so I got rid of my code that installs it directly on the dashboard. I did notice, however, that `e-mission-server` is using `e-mission-common@0.5.3`, which does not include the base mode features, as those were released in `0.5.4`. So before these changes go out we need to bump that up in `e-mission-server`. I can make a PR for that, but I haven't worked on the server before, so I am not sure how to test that the newer release doesn't break it.
I was finally able to test out the new colors using some better mongo dumps. Below is how all of the plots in `generic_metrics.ipynb` look using datasets from the Laos and Ride 2 Own studies. I chose those two datasets so we can check that both the dynamic and default labels appear correctly. Let me know what you think about these @Abby-Wheelis and if you'd like to see any others!
@louisg1337 The labels look like they're working, so that's good!
Is there anything we can do to make the variations on the same base mode different from one another? In the in-app dashboard I think something like "gas car with others" would show up as a lighter/darker shade of the same red as "gas car drove alone"
@louisg1337

The only alternative I can think of is to add a `baseModes` column into `mode_labels.csv`, but that seems even worse. I'll keep it how it is for the time being as it works, but it could be worth refactoring if we figure out a better way to do it.
I had a discussion with @JGreenlee yesterday regarding `Default Labels` not having `label_options`, and their dependency on `mode_labels.csv`. Would it be a good idea to host a default `label_options` in the `e-mission-common` repo itself? With that we can make use of `richModes`/`baseModes` without having to rely on either transforming `mode_labels.csv` or the current approach, which you've suggested might need some refactoring.
I did notice, however, that e-mission-server is using e-mission-common@0.5.3, which does not include the base mode features as that was released in 0.5.4

We could make the following changes in the local system, in the `environment36.dashboard.additions.yml` file in Public Dashboard, to make use of the latest changes in the `e-mission-common` repo:

```yaml
dependencies:
  - pip:
    ...
    - git+https://github.com/JGreenlee/e-mission-common@master
```

Then re-build the docker image with `docker-compose -f docker-compose.dev.yml build`
I just added https://github.com/JGreenlee/e-mission-common/blob/master/src/emcommon/resources/label-options.default.json to e-mission-common, which should help with this
The .csv files will be phased out
Is there anything we can do to make the variations on the same base mode different from one another? In the in-app dashboard I think something like "gas car with others" would show up as a lighter/darker shade of the same red as "gas car drove alone"
Yes, we made `dedupColors` for this purpose on e-mission-phone: https://github.com/e-mission/e-mission-phone/blob/90595e9d2a5cdb4acbd47e8e4f224610f56c832d/www/js/components/charting.ts#L165
Ideally we want this to be moved to `e-mission-common` so both dashboards use the same dedup method.
But the current `dedupColors` depends on the JS `color` library; we need a Python equivalent.
It seems like Python's built-in `colorsys` will be good enough for this (https://stackoverflow.com/a/60562502/5110347), so we won't need an extra dependency.
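For illustration, the pattern from that Stack Overflow answer looks roughly like this (a sketch, not the eventual `e-mission-common` implementation):

```python
import colorsys

def scale_lightness(rgb, scale):
    # Convert RGB -> HLS, scale the lightness channel, convert back
    h, l, s = colorsys.rgb_to_hls(*rgb)
    return colorsys.hls_to_rgb(h, min(1.0, l * scale), s)

# Lighten a matplotlib-style RGB tuple by 40%
print(scale_lightness((0.8, 0.1, 0.1), 1.4))
```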
I had a discussion with @JGreenlee yesterday regarding Default Labels not having label_options, and their dependency on mode_labels.csv. Would it be a good idea to host a default label_options in the e-mission-common repo itself? With that we can make use of richModes/baseModes without having to rely on either transforming mode_labels.csv or the current approach, which you've suggested might need some refactoring.
Yes, this helps tremendously, thanks for suggesting this change! This definitely helps streamline the code and makes it way more consistent. Thanks also to @JGreenlee for implementing it; that'll be the next thing I try to work in.
We could make the following changes in the local system, in the environment36.dashboard.additions.yml file in Public Dashboard to make use of the latest changes on the e-mission-common repo.
Perfect, that sounds like a great way to deal with that issue!
Yes, we made dedupColors for this purpose on e-mission-phone https://github.com/e-mission/e-mission-phone/blob/90595e9d2a5cdb4acbd47e8e4f224610f56c832d/www/js/components/charting.ts#L165
So I was actually working on that this morning; I just pushed out a working version that allows us to lighten each occurrence of the same color. Here is one plot that showcases how it looks.
I definitely agree that this should be included in `e-mission-common`; maybe we can use the code I just pushed as a baseline to work off of, as I also utilized the `colorsys` package. Also, I think my implementation will need to be refactored slightly: I saw that `dedupColors` supports infinite occurrences of the same color, while mine only supports 5 shades lighter.
Looks great! We can unify the dedupe methods at a later time
Also, I do think my implementation will need to be refactored slightly, as I saw in dedupColors it supports infinite occurrences of the same color, while mine only supports 5 shades lighter.
Until we unify this, I figured I'd try to fix the above issue. I took a slightly different approach from `dedupColors`: instead of evenly spacing out an adjustment per color, I based the adjustment on an inverse exponential decay function. I decided to do it this way for a few reasons:
1) It was just an easier change to make than refactoring everything, which I would've had to do if I took the same approach found in `dedupColors`.
2) The function can handle an infinite number of occurrences, as it will always stay within the max adjustment value set.
As stated above, this will be refactored, so none of this is set in stone; it's just a placeholder for the time being.
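To make the idea concrete, here is a hypothetical sketch of such an adjustment curve (`rate` and `max_adjustment` are illustrative values, not the actual ones used):

```python
import math

def lightness_adjustment(occurrence: int, max_adjustment: float = 2.0, rate: float = 0.5) -> float:
    # Each repeat of a color gets a lightness multiplier that approaches
    # max_adjustment asymptotically, so any number of occurrences stays in range.
    # occurrence 0 -> 1.0 (unchanged); later occurrences approach max_adjustment.
    return 1.0 + (max_adjustment - 1.0) * (1.0 - math.exp(-rate * occurrence))

for i in range(6):
    print(i, round(lightness_adjustment(i), 3))
```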
We could make the following changes in the local system, in the environment36.dashboard.additions.yml file in Public Dashboard to make use of the latest changes on the e-mission-common repo.
I just implemented this in the latest push, and I've run into an issue. When I try to run the python notebooks, I get this error:
```
---> 73 async def get_egrid_region(coords: list[float, float], year: int) -> str | None:
     74     """
     75     Get the eGRID region at the given coordinates in the year.
     76     """
     77     if year < 2018:

TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
```
The error is apparently because the `str | None` syntax is only available in Python 3.10+ (StackOverflow). Unfortunately, we are using Python 3.9 in this docker environment. I'm not sure if the solution is to replace the syntax in `e-mission-common` or to bump up the version of Python. If it is the latter, I'm also not quite sure how to do that. My impression is that Python 3.9 gets installed via the `e-mission-server` docker image, and if that is the case, that's a whole other can of worms to deal with when trying to bump the version there. Would love to hear some feedback on what you guys think!
I think it just needs `from __future__ import annotations`
I should configure the test environment in e-mission-common to always use Python 3.9 so we can catch these kinds of issues sooner
If `from __future__ import annotations` doesn't work, feel free to remove the type hints – they are nice to have but definitely not critical
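For context, the two 3.9-compatible options being discussed look like this sketch (hypothetical function names; as it turns out below, the `__future__` route didn't work in this case and the type hints were removed instead):

```python
# Option 1: defer annotation evaluation, so `str | None` is never executed
from __future__ import annotations

def region_a(year: int) -> str | None:
    return "RMPA" if year >= 2018 else None

# Option 2: spell it with typing.Optional, valid on any supported version
from typing import Optional

def region_b(year: int) -> Optional[str]:
    return "RMPA" if year >= 2018 else None
```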
Thanks for the help figuring that all out @JGreenlee! It turns out the `from __future__ import annotations` doesn't work, which is unfortunate. I created a PR here that removes the `str | None` type hints from `e-mission-common`, which ended up fixing the issue.
Here are also the plots generated with the new coloring logic I used. Let me know what you guys think about it and whether or not the differences between duplicate colors are noticeable enough. If it isn't enough, I can try to refactor it a bit to get more of a difference between colors.
In this new version, they look a bit too similar to me. I vaguely remember going through this same process on the phone, which is what led me to design `dedupColors` the way I did.
`dedupColors` makes the colors as far apart as it can with respect to the number of duplicates there are. With 2-3 duplicates the colors are easily distinguishable; with 4-5+ duplicates they become harder to distinguish (but still distinct).
If you implement `dedupe_colors` in Python you should be able to use it on the dashboard. Something like:

```python
colors_mode = dedupe_colors([
    [mode, translations_to_basemodes.get(mode, "UNKNOWN")['color']]
    for mode in combined_mode_values
])
```
That's a great point @JGreenlee. I definitely agree the colors are a bit too similar, so I took your advice and implemented `dedupe_colors` in `e-mission-common`. The PR for that can be found here. I'm not sure how we are going to convert the Python code to JavaScript, as we are using the `colorsys` Python library, so I kept it as a draft until we figure that out. I attached some plots below to showcase how the new colors look; let me know what you guys think. If need be, I can always adjust the variance in lightness so we can get more drastic differences.
I just added https://github.com/JGreenlee/e-mission-common/blob/master/src/emcommon/resources/label-options.default.json to e-mission-common, which should help with this
The .csv files will be phased out

The above label-options.default.json doesn't have a "translations" object. We need the translations for mapping from `MODE.value` to `mode_clean`.
For example: `MODE.value = e_car_drove_alone`, which has `baseMode = "E_CAR"`, maps to `mode_clean = "E-car, drove alone"`.
Also, I think it would be a good idea to match all the values in mode_labels.csv, like `train_and pilot e-bike`, which translates to `"Train"`.
On the phone, when translations are not included in the label options, they are taken from the language JSON file https://github.com/e-mission/e-mission-phone/blob/adb7cf235ded412119581a1ef8bf656d0ec93855/www/i18n/en.json#L160 These are the texts that get shown to the user.
I don't think this lines up with `mode_clean`, though.
I don't understand where the `mode_clean` values come from. A mode like `train_and pilot e-bike` appears to have been a custom label. So did someone manually map it to "Train" at some point?
I had thought that the public dashboard just lumped custom modes into "Other".
I realize I'm jumping in mid-process here, so please feel free to correct me if I've misinterpreted!
So did someone manually map it to "Train" at some point?
Yes, I think that is the case. I remember these being useful when I was processing the CanBike CO data, so I'm guessing they date from the early stages of the dashboard. All other custom labels get lumped into "Other", so I think we're safe to do that with all user-provided labels.
As for where they come from - that depends on whether we're talking about "default" or "custom" labels.
Default labels currently get read from csvs in mapping_dictionaries.ipynb, which are then used as parameters to the functions in `scaffolding.py` that process the data (i.e. `expanded_ct['Mode_confirm'] = expanded_ct['mode_confirm'].map(dic_re)`). It sounds like this is what we're looking to phase out in favor of a single source of default labels/label translations, which makes a lot of sense to me.
If there are custom labels for the deployment, we pass those in as `dynamic_labels` and use them instead of the mapping dictionaries. If we wanted to centralize the labels and get rid of our csvs, I think that would not be too hard to do; we could refactor `mapping_dictionaries.py`.
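As a tiny illustration of that `.map()` step (made-up labels, not the real csv contents):

```python
import pandas as pd

# dic_re maps internal labels -> display labels, applied via .map()
dic_re = {"drove_alone": "Gas Car, drove alone", "bike": "Regular Bike"}
expanded_ct = pd.DataFrame({"mode_confirm": ["drove_alone", "bike", "drove_alone"]})
expanded_ct["Mode_confirm"] = expanded_ct["mode_confirm"].map(dic_re)
print(expanded_ct)
```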
The above label-options.default.json doesn't have a "translations" object. We need the translations for mapping from MODE.value to mode_clean.
I see what you mean: for dynamic labels we expect the translation objects, and for the default labels they're hardcoded into the csvs.
On the phone, when translations are not included in the label options, they are taken from the language JSON file
Can you remind me where this comes from? If it comes from `e-mission-translate`, then that's not something we currently use on the dashboard, but we could probably build it in; we just don't offer any other parts of the dashboard in >1 language. Or maybe we could use the default files from `nrel-openpath-deploy-configs` when a custom file is not specified? Or is the default going to be stored in `em-common` and not `nrel-openpath-deploy-configs`?
Can you remind me where this comes from?
For English, it comes from `e-mission-phone`, `www/i18n/en.json`, under `"multilabel"`: https://github.com/e-mission/e-mission-phone/blob/adb7cf235ded412119581a1ef8bf656d0ec93855/www/i18n/en.json#L160
The other languages are in `e-mission-translate`.
I think we have a few options:
1. Have `em-public-dashboard` retrieve `en.json` from `e-mission-phone`
2. Move the `"multilabel"` translations to a new file (`multilabel_en.json`?) and put it in `e-mission-common`. The other languages would stay in `e-mission-translate`
3. Instead of having `"multilabel"` in `en.json`, `es.json` etc, always include translations in the label-options json
4. Move `en.json` to `e-mission-translate` and have the public dashboard retrieve `en.json` from there

As for where they come from - that depends if we're talking about "default" or "custom" labels.
Can we get rid of the concept of "default" labels? I think there should be 2 kinds of labels: 1) labels specified by the deployment, and 2) labels given by the user
Can we get rid of the concept of "default" labels? I think there should be 2 kinds of labels: 1) labels specified by the deployment, and 2) labels given by the user
I agree that would make the most sense, but unless we update all of the configs, we're left with 1) deployments with their own file in `nrel-openpath-deploy-configs/label_options`, and 2) those who specify `Multilabel` but not their own labels (so we use our defaults)
Instead of having "multilabel" in en.json, es.json etc, always include translations in the label-options json
Could we use `example-study-label-options.json` and `example-program-label-options.json` as the files if the deployer does not specify their own? This would definitely streamline the process on the public dashboard end. These also currently include translations; does that mean we have translations in multiple places for the same set of labels?
Can we get rid of the concept of "default" labels? I think there should be 2 kinds of labels: 1) labels specified by the deployment, and 2) labels given by the user
This would definitely be ideal and is probably how we should do it, but Abby is right that we may still need "default" labels for any past study that never specified any.
Could we use example-study-label-options.json and example-program-label-options.json as the files if the deployer does not specify their own?
I think this would be a great idea, as having everything together in one file makes things 1) a lot easier to work with and 2) a lot easier for any future developer to understand, instead of having to track down multiple files. It will also follow the same format that we have for specifying dynamic labels, so it would be good to be consistent across the board.
These also currently include translations, does that mean we have translations in multiple places for the same set of labels?
Yes, and avoiding this repetition was the rationale for supporting translations from en.json, es.json, etc., rather than always putting all translations in the label-options JSON. We wanted to centralize the translations as much as possible, so that future deployers would only have to provide translations for modes that we have not already supported in the past.
Could we use example-study-label-options.json and example-program-label-options.json as the files if the deployer does not specify their own?
Yes, this seems functionally the same as adding the translations to label-options.default.json. If we do this, there is no reason to still have them in en.json, es.json, etc.
I think the 'example' label options are intended as more of a resource for deployers to reference when creating their own label options, rather than to be used as an actual default.
But maybe we should just decide between "default" and "example" and use one for both.
How does this approach sound?
Add "translations" to label-options.default.json in e-mission-common, taking them from en.json, es.json, and lo.json
Remove example-study-label-options.json and example-program-label-options.json (they are already out of date and we don't want to maintain multiple things). In the README of nrel-openpath-deploy-configs, include a link to label-options.default.json, describing how deployers can use it as an example
When reading a deployer-provided label options on the phone or public dashboard, check for translations 1) in that same file's "translations"; 2) in label-options.default.json "translations"
- Add "translations" to label-options.default.json in e-mission-common, taking them from en.json, es.json, and lo.json
Is there any reason why we would want to host label-options.default.json in e-mission-common instead of nrel-openpath-deploy-configs's label_options? We already have other label_options files in nrel-openpath-deploy-configs. It would likely be a good idea to store all label_options, even the default one, in a single place.
In the meantime, I've created a PR for `e-mission-common` which adds the translations to the label options. I just wanted to get this in so I can start working on implementing it, in case this is the approach we go for.
Is there any reason why we would want to host the label-options.default.json in e-mission-common instead of nrel-openpath-deploy-configs's label_options?
The only reason I can think of is that resources from `nrel-openpath-deploy-configs` get downloaded at runtime, while resources from `e-mission-common` can be included as a static dependency.
Keeping it in `e-mission-common` avoids an extra network call.
The only reason I can think of is that resources from nrel-openpath-deploy-configs get downloaded at runtime, while resources from e-mission-common can be included as a static dependency. Keeping it in e-mission-common avoids an extra network call.
I see. We already make a network call to download label_options whenever we have custom labels for a program. So it's a tradeoff between keeping all the label_options in a single place vs an extra network call.
I've integrated the use of label_options.default.json into the code, and I am now running into an issue, unfortunately. When I try to generate the new plots, I get this error:
I found out this is because some of the translation values in label_options.default.json don't line up with the values being used to store the data. Therefore, when `matplotlib` tries to map keys to colors, it throws errors. The differences between the data are subtle too; here are some of them.
There are a few more instances of this, especially with the `E-car` labels. Would it be alright if we changed the translation values in label_options.default.json to make them consistent with what was there before, @JGreenlee?
With the new `dedupe_colors()` in `e-mission-common`, the colors are appearing too dark. Here is what one of the plots looks like.
`Gas Car, drove alone` and `E-car, drove alone` are both way too dark. I think that `max_adjustment` needs to be tweaked a bit so that the range goes from [0.4, 1.6] to something like [0.8, 2]. Going below 1 makes the colors dark pretty fast, so I don't know if we should venture that far below 1. And 2 was the max I used in my previous implementation; it seemed to be a sweet spot where the color was light, but not so light that it was white. Let me know what you think about both of these @JGreenlee.
I found out this is because some of the translation values in label_options.default.json don't line up with the values being used to store the data. Therefore, when matplotlib tries to map keys to colors, it throws errors. The differences between the data are subtle too, here are some of them.
The fundamental error here is that the colors are mapped to the display modes. They should be mapped from the internal mode (e.g. `drove_alone`, not `Gas Car, drove alone`). The base mode is the internal representation and is the one that should be used for all calculations. It was a bad design to do the mapping from the display mode in the first place.
What if we want to support multiple languages in the dashboard as well?
Both of these issues are just growing pains from trying to unify code between phone & dashboard.
The label options need to be what they are in `label-options.default.json`. These translations will be used on the phone for the text that gets displayed to the user, so we can't just change them to match the previous implementation of the public dashboard.
The `dedupe_colors` issues are because the charts on the phone don't have any text on the bars, while the dashboard does have (black) text on the bars. So the dashboard cannot afford to darken colors without compromising legibility, but the phone can.
I think we can get around this pretty easily by having `max_adjustment` be a parameter to `dedupe_colors`. This way the phone can keep using `[0.4, 1.6]`, while the public dashboard can use something like `[1, 2]` or whatever you determine to be a good tradeoff between distinguishability of colors and legibility of text.
You can make it an optional parameter with a default value of [1, 2] if you'd like the default behavior to be only "lightening" and not "darkening"
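For reference, a parameterized dedupe might look roughly like the sketch below. This is a hypothetical reimplementation of the idea, not the actual `e-mission-common` code, and it assumes colors arrive as RGB tuples:

```python
import colorsys
from collections import Counter

def dedupe_colors(entries, adjustment_range=(1.0, 2.0)):
    # entries: [[label, (r, g, b)], ...]; duplicates of a color get
    # lightness multipliers spread evenly across adjustment_range
    totals = Counter(color for _, color in entries)
    seen = Counter()
    lo, hi = adjustment_range
    result = {}
    for label, color in entries:
        n, i = totals[color], seen[color]
        seen[color] += 1
        scale = lo if n == 1 else lo + (hi - lo) * i / (n - 1)
        h, l, s = colorsys.rgb_to_hls(*color)
        result[label] = colorsys.hls_to_rgb(h, min(1.0, l * scale), s)
    return result

# The phone could keep (0.4, 1.6); the dashboard might use (1.0, 2.0)
print(dedupe_colors([["Gas Car, drove alone", (0.8, 0.1, 0.1)],
                     ["Gas Car, with others", (0.8, 0.1, 0.1)]],
                    adjustment_range=(1.0, 2.0)))
```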
The fundamental error here is that the colors are mapped to the display modes. They should be mapped from the internal mode (e.g. drove_alone, not Gas Car, drove alone). The base mode is the internal representation and is the one that should be used for all calculations. It was a bad design to do the mapping from the display mode in the first place.
This makes sense, and that's a great point: mapping to the internal representation would be better, especially for supporting future translations. I guess where I am still confused is: if the plots showcase display modes as the labels, don't we need a display-mode-to-color mapping for the colors to get applied correctly? The error I have been getting is that certain keys don't exist in the color mapping because they are slightly different in `label-options.default.json`, so I assumed that was how it needed to be done.
Is there a different way to map the colors to the labels in matplotlib that I've been overlooking? Maybe something like the label values being stored as `(drove_alone, "Gas car, drove alone")`, so we can map colors to the internal representation while the display mode still gets displayed? I'm just a bit confused as to what that process would look like, so any help would be appreciated @shankari.
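Something like the following (hypothetical data and variable names) is one way matplotlib can key colors on the internal modes while still displaying translated labels:

```python
import matplotlib.pyplot as plt

# Colors keyed on internal values; translations applied only when drawing
colors_mode = {"drove_alone": "tab:red", "e_car_drove_alone": "tab:orange"}
values_to_translations = {"drove_alone": "Gas Car, drove alone",
                          "e_car_drove_alone": "E-car, drove alone"}
counts = {"drove_alone": 42, "e_car_drove_alone": 17}

fig, ax = plt.subplots()
for mode, count in counts.items():
    # Look up the color by internal mode, but label the bar with the translation
    ax.bar(values_to_translations[mode], count, color=colors_mode[mode])
plt.show()
```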
I think we can get around this pretty easily by having max_adjustment be a parameter to dedupe_colors. This way the phone can keep using [0.4, 1.6], while the public dashboard can use something like [1, 2] or whatever you determine to be a good tradeoff of distinguishability between colors and legibility of text.
That's a great idea; I like being able to customize it as needed. I just put out a PR with those changes. I added an additional parameter called `min_adjustment`, because using only `max_adjustment` made the colors either way too dark if the number was too low, or way too white if it was too high. As there was no happy medium, I added that so we can keep the min and max independent. I think I also made it so that the functionality remains the same without passing anything in. Let me know if that was what you had in mind @JGreenlee.
Looks good; I tweaked it a little bit, updated the tests to reflect the new parameter, and merged just now
For the time being, I've pushed out a working version of the color mapping that does not utilize `label-options.default.json`. It seems like if we want to use it, it will require a lot more work refactoring the way we store and pass around the data to be displayed on the plots. That effort will probably also deserve its own branch, as it doesn't seem like a quick fix. I'll continue exploring ways we can move away from the display representation and towards the internal representation, but for the time being we have a working version using the old `mode_labels.csv` file.
I guess where I am still confused is if the plots showcase display modes as the labels, don't we need a display mode to color mapping for the colors to get applied correctly? The error I have been getting is that certain keys don't exist in the color mapping because they are slightly different in label-options.default.json, so I assumed that was how it needed to be done.
Here's what I would suggest at a high-level. To get deeper into this, I'd have to go and dig into the actual code.
Right now, my recollection is that the codebase works like this:
- Trips are read into a dataframe with a `mode_confirm` column. The values in `mode_confirm` are the internal labels (`drove_alone`)
- `mode_confirm` is mapped to a display mode column (`Mode_confirm`), which has values like `Gas car, drove alone`
- All subsequent processing and plotting is done on `Mode_confirm`
- The suggestion is to do the processing on `mode_confirm` instead

In scaffolding.py, we have mapping_labels(), which gives a mapping from base mode to display mode, i.e.

| base mode | display mode |
|---|---|
| drove_alone | Gas Car, drove alone |
| e_car_drove_alone | E-car, drove alone |
Currently, we are using:

Dynamic labels:

```python
mode_values = list(mapping_labels(dynamic_labels, "MODE").values()) if "MODE" in dynamic_labels else []
combined_mode_values = mode_values + replaced_mode_values + ['other']
```

Default mapping:

```python
combined_mode_values = (list(OrderedDict.fromkeys(dic_re.values())) + ['Other'])
colors_mode = dict(zip(combined_mode_values, plt.cm.tab20.colors[:len(combined_mode_values)]))
```

If we use `mapping_labels(dynamic_labels, "MODE").keys()` or `dic_re.keys()` instead of `.values()`, we would be mapping between base mode and color palette.
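Concretely, that `.keys()` suggestion would look something like this sketch (with a made-up `dic_re` standing in for the real mapping dictionary):

```python
import matplotlib.pyplot as plt

# Made-up internal-value -> display-string mapping
dic_re = {"drove_alone": "Gas Car, drove alone",
          "e_car_drove_alone": "E-car, drove alone"}

# Key the palette on internal modes (.keys()) rather than display strings
combined_mode_values = list(dic_re.keys()) + ["other"]
colors_mode = dict(zip(combined_mode_values,
                       plt.cm.tab20.colors[:len(combined_mode_values)]))
print(colors_mode)
```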
Thank you @shankari and @iantei for the help; I appreciate it a lot! I managed to refactor `plot_and_text_stacked_bar_chart` so that it now generates plots using the internal representations, and then we remap the labels to showcase the display representations. I've changed all of the plots in `generic_metrics.ipynb` to work with the refactored function. The changes simply swap `"Mode_confirm"` to `mode_confirm`, and then tag along `values_to_translations` as a parameter to `plot_and_text_stacked_bar_chart` when necessary. Here is what the plots look like:
Now that the function has been refactored and we are now using `label-options.default.json`, all that's left is...
1) Testing with `dynamic_labels`.
2) Testing other plot generating notebooks to see what needs to be refactored so this new color implementation works. The changes I made in `generic_metrics.ipynb` are a good example of what needs to be done to any notebook that has errors.

wrt https://github.com/e-mission/em-public-dashboard/pull/143#issuecomment-2322009427, as we have discussed earlier, the mapping (`drove_alone` -> `Gas car, drove alone`) shouldn't have two copies, one in the public dashboard and one in the UI. There should be one standardized mapping.
I agree with Jack that there should really be: (i) platform specified modes and (ii) user specified modes. I am willing to have: (i) platform specified default modes, (ii) platform specified custom modes, and (iii) user specified modes.
But I don't think we should have two copies of the platform specified default modes.
wrt https://github.com/e-mission/em-public-dashboard/pull/143#issuecomment-2322009427, as we have discussed earlier, the mapping (drove_alone -> Gas car, drove alone) shouldn't have two copies, one in the public dashboard and one in the UI. There should be one standardized mapping.
I agree, and I think it would be fairly easy to refactor the way `mapping_dictionaries` works to pull labels from elsewhere and remove the csvs. Does that change need to be a part of this PR? If it makes sense to do it separately, I would be happy to take it on as a priority next week.
Happy to defer it until later. Ideally, at the end of this quarter, we should not have any csvs in this repo.
I cloned the above repo for further testing, and found the below cases where code changes need to be made:
Changes required:
Testing other plot generating notebooks to see what needs to be refactored so this new color implementation works. The changes I made in generic_metrics.ipynb are a good example of what needs to be done to any notebook that has errors.
1. In the `mode_specific_metrics` notebook (see the sketch after this list):

```python
colors_mode, colors_purpose, colors_sensed = scaffolding.mapping_color_labels(dynamic_labels, dic_re, dic_pur)
```
```
TypeError: cannot unpack non-iterable coroutine object
```
2. While testing with `cortezebikes` dataset, I encountered an issue while generating Purpose related trip charts in `generic_metrics` notebook
<img width="973" alt="image" src="https://github.com/user-attachments/assets/e747b505-627f-4d1d-aa22-af358fb6fae6">
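A minimal sketch of the likely fix for item 1, assuming `mapping_color_labels` is now an async coroutine: it has to be awaited and its extra return value unpacked, matching the call that appears in the traceback further below.

```python
colors_mode, colors_purpose, colors_sensed, values_to_translations = \
    await scaffolding.mapping_color_labels(dynamic_labels, dic_re, dic_pur)
```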
Tested the above scenario 2, with the changes of integrated Inferred Labels in my other PR, with the same dataset.
The generation of Number of trips for each purpose seems fine. Need to review the code changes which impacted this.
The issue is happening when it's trying to add the `Recreation/Exercise` bar to the stacked bar chart.
It's able to plot `Other` and `Shopping`, but throws the error when it tries to plot `Recreation/Exercise`.
Ah, the issue is happening because we are using display labels in `expanded_ct.Trip_purpose`, like `Recreation/Exercise`, while it is saved as `exercise: Recreation / Exercise` in `label-options.default.json`.
Testing with dynamic labels:
Used `cortez-ebikes` as the dataset, while changing `STUDY_CONFIG=usaid-laos-ev` and passing dynamic_labels from `usaid-laos-ev-label-options`.
Got the following error:
```
KeyError                                  Traceback (most recent call last)
Cell In[4], line 1
----> 1 colors_mode, colors_purpose, colors_sensed, values_to_translations = await scaffolding.mapping_color_labels(dynamic_labels, dic_re, dic_pur)

File /usr/src/app/saved-notebooks/scaffolding.py:244, in mapping_color_labels(dynamic_labels, dic_re, dic_pur, language)
    241 values_to_translations = {mode["value"]: labels["translations"][language][mode["value"]] for mode in labels["MODE"]}
    243 # Assign colors to mode, purpose, and sensed values
--> 244 colors_mode = dedupe_colors([
    245     [mode, BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    246     for mode in combined_mode_values
    247 ], adjustment_range=[1,1.8])
    248 colors_purpose = dict(zip(purpose_values, plt.cm.tab20.colors[:len(purpose_values)]))
    249 colors_sensed = dict(zip(sensed_values, [BASE_MODES[x.upper()]['color'] for x in sensed_values]))

File /usr/src/app/saved-notebooks/scaffolding.py:245, in <listcomp>(.0)
    241 values_to_translations = {mode["value"]: labels["translations"][language][mode["value"]] for mode in labels["MODE"]}
    243 # Assign colors to mode, purpose, and sensed values
    244 colors_mode = dedupe_colors([
--> 245     [mode, BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    246     for mode in combined_mode_values
    247 ], adjustment_range=[1,1.8])
    248 colors_purpose = dict(zip(purpose_values, plt.cm.tab20.colors[:len(purpose_values)]))
    249 colors_sensed = dict(zip(sensed_values, [BASE_MODES[x.upper()]['color'] for x in sensed_values]))

KeyError: 'baseMode'
```
Issue:

```python
value_to_basemode = {mode["value"]: mode.get("base_mode", "baseMode") for mode in labels["MODE"]}
```

Whenever we don't have `base_mode` as a key in `labels["MODE"]`, this falls back to the literal string `"baseMode"` instead of looking up the value stored under the `baseMode` key.

Solution:

```python
value_to_basemode = {mode["value"]: mode.get("base_mode", mode.get("baseMode", "baseMode")) for mode in labels["MODE"]}
```

This solves the issue with `dynamic_labels`.
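A quick illustration of the difference (with a made-up mode entry):

```python
# A label option that uses the camelCase "baseMode" key
mode = {"value": "e_car_drove_alone", "baseMode": "E_CAR"}

# Original: falls back to the literal string "baseMode"
print(mode.get("base_mode", "baseMode"))                        # -> 'baseMode'

# Fixed: falls back to the value stored under the "baseMode" key
print(mode.get("base_mode", mode.get("baseMode", "baseMode")))  # -> 'E_CAR'
```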
Added in support for `e-mission-common` into `em-public-dashboard`, and then utilized `e-mission-common`'s base mode colors in the dashboard to unify colors across platforms.