Abby-Wheelis opened this issue 9 months ago
should we limit what is shown on the dashboard to only include questions with at least k responses? The data is not spatial, but there could be cases in which the surveys collect detailed information...
deferring this concern since we will be reviewing survey questions before they are merged - process TBD
we can't know what questions are in the survey entries
pandas JSON normalize (what we use in the admin dashboard) can help us!
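For anyone following along, here is a rough illustration of what `pandas.json_normalize` does with nested entries (the field names below are made up for illustration, not the real schema):

```python
import pandas as pd

# hypothetical nested survey entries; json_normalize flattens them into one column per leaf field
entries = [
    {"data": {"name": "TripConfirmSurvey", "jsonDocResponse": {"travel_mode": "bike"}}},
    {"data": {"name": "TripConfirmSurvey", "jsonDocResponse": {"travel_mode": "walk"}}},
]

df = pd.json_normalize(entries)
print(df.columns.tolist())
# ['data.name', 'data.jsonDocResponse.travel_mode']
```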
Starting work on this!
I will integrate with the conversion to bar charts when that finishes the review cycle, but for now am relying on pie chart code. Progress so far:
Example from the `washingtoncommons` study:
for "select multiple" questions, this is more difficult, as individual values are made up of multiple options, still need to address this still need to pull in file from config and github, right now I have the file locally
I have resolved both of these issues now: I am looping over all of the labels that will go in the chart to translate them, accounting for multiples (long lists are awkward for display), and I am pulling the files off of GitHub instead of using the local file, based on the survey section of the config.
Still need to see if this will work with multiple trip-level surveys. It does currently handle surveys with "missing" responses fine, so I think this covers the case of some questions not always being required.
Still need to figure out how this will work from the frontend - if we don't have a defined list of chart names to pull from, how can we display them? Here's what I'm thinking right now:
will also need "quality text" here, just remembered that part!!
Based on 50 trips with survey responses from 5 users, of 450 trips from 25 users
-- this could be tricky, since we probably want to count "num of trips with responses" and "total trips" as the number of trips where this question was answered out of the trips where this question was prompted ... may start with "Based on 50 responses from 5 users" and build up from there
Two things, for which we have examples in various other pieces of code:
survey_id = df.user_input.apply(lambda sr: sr.substring...)  # pseudocode: pull the survey id out of each user_input entry
df.groupby(survey_id)
So this will allow you to have a chart of the challenges using Fermata equipment by pulling in only the survey responses about EV parking at a Fermata location, because that question won't even exist in the other surveys.
Asmita does something similar with the demographic surveys in the admin dashboard because we have had multiple iterations of the survey so far, at least on staging.
'public_dashboard_chart': "...fermata_response_col_name"
We also do this in the admin dashboard to filter out excluded columns
So we can start with just displaying all columns, but if there are some columns/questions that we don't want to make public, we can filter them this way.
Let's say that we have 100 trips: 40 are gas cars, 30 are EVs not at Fermata, and 30 are EVs at Fermata, and they are all labeled. Without grouping, the "fermata challenges" chart will have 70% N/A. With grouping, you will only consider the 30% of the trips that have a relevant response and then display the breakdown.
Ok, I think I might still be confused. I have introduced a `survey_name` column to my dataframe of trips, so I would be able to count how many responses we have for each survey type, but I still don't know how many trips were presented with that survey. So I have enough information to say "based on 30 responses from 10 users" but not the "out of 50 total trips from 12 users" part. Am I missing something?
This is the code I added:
survey_trips = survey_trips.reset_index()
survey_trips['survey_name'] = survey_trips.user_input.apply(lambda sr: sr['trip_user_input']['data']['name'])
I could now easily break the dataframe into one for each `survey_name`, but I still wouldn't know how many trips would have that survey assigned to them, just how many responses there were.
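For reference, once that `survey_name` column exists, counting responses per survey is a one-liner (building on the snippet above):

```python
# number of survey responses per survey type, keyed by the survey_name column added above
responses_per_survey = survey_trips.groupby('survey_name').size()
```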
that is an excellent point. I guess we will need to determine how many trips would have displayed each type of survey. If the `eval` syntax that we use in the config is sufficiently general, we can use it as an input to the `df.query` method and find the number of trips that the survey was applicable to. If it is not general enough, maybe we have two strings (one for javascript and one for python) or we have one string that we can convert using the tools in #1058
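A rough sketch of what that could look like, assuming the survey condition can be written as a pandas query string (the column names and values below are placeholders, not the real config):

```python
import pandas as pd

# placeholder trips dataframe; the real one would come from scaffolding
trips = pd.DataFrame({
    "primary_mode": ["CAR", "E_CAR", "E_CAR", "WALK"],
    "distance_miles": [3.2, 10.5, 0.8, 1.1],
})

# if the survey condition from the config can be expressed as a query string,
# the denominator is just the number of rows that satisfy it
condition = "primary_mode == 'E_CAR' and distance_miles > 1"
applicable_trips = trips.query(condition)
print(len(applicable_trips))  # -> 1
```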
I think I will stick with some placeholder text for now then, and figure out a more thorough solution as we know more about the syntax that will end up in the config.
Starting a draft PR so code is public, I think my next goal is connecting the notebook to the frontend - then it will be a roughly working prototype and can be refined and tested with more survey configurations from there.
We're going to need plugins in order to read the spreadsheet files. At least for the notebook to translate the data, but if we name the files and create headers based on the survey questions in the frontend we will need to do something similar there.
I was testing with `openpyxl` in the notebook, this worked well but I had to install it with conda so we'll need to add it to the production environment as well. Based on the release notes, `openpyxl` has been around since 2013 and was last updated in March of 2023. The test coverage icon on the docs indicates 95% test coverage.
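A minimal sketch of reading the questions with `openpyxl` (the file name and sheet/column layout here are assumptions, following the XLSForm convention of a "survey" sheet):

```python
from openpyxl import load_workbook

wb = load_workbook("survey.xlsx", read_only=True)
ws = wb["survey"]  # assumed sheet name; XLSForm files keep the questions on a "survey" sheet
for row in ws.iter_rows(min_row=2, values_only=True):  # skip the header row
    q_type, name, label = row[0], row[1], row[2]  # assumed column order: type, name, label
    print(q_type, name, label)
```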
For the frontend, `xlsx` was the first thing I found, but it sounds like it has migrated... so I'm still looking for a good option for reading the file on the frontend. Once we can read the file we can extract the survey questions in both `raw_data_format` and the Human Readable Format.
Why do we have to read the spreadsheet? The spreadsheet is the input from our partners, but then we convert it to xml and to json. The spec version that we add in the dynamic config is the json (or maybe it can now be xml per @JGreenlee). json and/or xml should both be much better supported on both python and javascript than xls.
We won't necessarily have json anymore but we will have xml. I did not realize that xml would be easier to read than the spreadsheet, I will look into using that to get the raw/readable pairs instead!
I have now been able to read the questions in from the xml version of the survey in the notebook - still working on the responses.
Encountering a few things to keep in mind:
- some questions come and go from the surveys; the only questions we can display "translated" (rather than with raw questions and responses) are those still in the surveys, so I think we should only display those that are still in the survey
- for the likert scale questions, the "-" space-saving display means that these would probably show up on the chart as, for example, "12% -". I can come up with a workaround for this, but am trying to keep the code general, so still thinking of the best solution
On the 5-point scale Likert questions, the labels will be "Disagree", "-", "Neutral", "-", "Agree", but the underlying values corresponding to those options will simply be 1, 2, 3, 4, 5.
Good to know! I'm trying to keep this handling as generic as possible, so my default would be to display the labels for every answer, to avoid displaying things like 'pick_up_drop_off_accompany_someone', which is what would happen if I showed the values for every answer, but maybe I can detect when a label is "nonsensical" and show the value instead?
Maybe use labels most of the time, but have an exception for questions where appearance="likert"
https://github.com/JGreenlee/nrel-openpath-deploy-configs/blob/a5eaa17a649c4ecfc2ce336c3c1543643c6435da/survey_resources/dfc-fermata/dfc-ev-return-trip-v0.xml#L58
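A rough sketch of how the notebook could pick those questions out of the XML (the tag/attribute layout is assumed here; the real XForm may namespace these elements differently):

```python
import xml.etree.ElementTree as ET

# find every question whose appearance is "likert" so its values can be shown instead of labels
tree = ET.parse("dfc-ev-return-trip-v0.xml")
likert_refs = [el.get("ref") for el in tree.iter() if el.get("appearance") == "likert"]
print(likert_refs)
```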
I have now connected the notebook to the frontend, so the survey question charts can now be displayed when the page loads 🎉
There is still lots of cleanup and polishing to go before this is ready, however:
- test with `washingtoncommons` data to ensure generality
- handle "Other please specify" questions ... omit them? Get them paired to their parent question?
Currently, these questions will be shown right after their predecessor in the list, which semi-implies their definition. However, it could be clearer - I could check for the name "Other - please specify" and then prepend the previous question to the name, or I could just check for "Other" to be more general - but that seems less foolproof. Maybe we check for the whole string and let survey designers know about this feature?
I now have the "missing plots" and alt text working for each of the charts. A slight pivot I needed to make this morning was switching from generating a chart for each column in the dataset to generating a chart for each question in the survey file -- this means older survey questions will not be addressed, but also that all questions sought by the frontend (the ones in the file) will have a chart or debug df and alt text. Examples of washingtoncommons
and dfc-fermata
as they stand now:
Note that some of the longer questions and labels are causing squashing - I'm not too worried about that right now because we are going to be moving to the stacked bar charts when that PR gets merged.
I have made some progress on the "quality text" for these charts thanks to some conversations in our meeting yesterday:
@JGreenlee pointed out that I need the `composite` and not just the `confirmed` trip in order to get the sections, so I added a snippet to `scaffolding` to pull that trip type. I also leveraged the `eval` that @shankari suggested yesterday, which allowed me to define the variables accessed in that scope. The eval strings that I used for this test are different than the ones in the config in two ways:

- I had to replace `dictionary.access` notation with `dictionary['access']` notation to work well with Python's syntax
- I replaced `&&` with `and` and the operator `!` with `not`, but left `!=` alone

I think it's possible to convert the dictionary into a format compatible with dot notation, but before continuing down that road I think we should decide if we want to continue with retrofitting the strings written for javascript, or just write one for python. Dot notation and replacing `&&` should be fairly foolproof, but replacing `!` and leaving `!=` might be a little trickier.
Remaining tasks:
- `survey_info` parameter to method that calls notebooks so it is passed in
- `dfc-fermata`
- `washingtoncommons`
I think we should decide if we want to continue with retrofitting the strings written for javascript, or just write one for python.
What about supporting both? The Python scripts could check for a `showsIfPy` condition first, then fall back to `showsIf` if `showsIfPy` doesn't exist.
That would give us flexibility to do it with one condition when possible, but define 2 conditions if we have to.
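Something like this on the Python side, perhaps (the key names come from the suggestion above; the shape of the survey config entry is assumed):

```python
def get_condition(survey_config):
    # prefer a Python-specific condition if the config defines one,
    # otherwise fall back to the javascript "showsIf" string
    return survey_config.get("showsIfPy", survey_config.get("showsIf"))
```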
I had to replace `dictionary.access` notation with `dictionary['access']` notation to work well with Python's syntax
We can update the config to use bracket notation since it's the common denominator between JS objects and Python dicts. Let me create a PR for that real quick.
replacing `!` and leaving `!=` might be a little trickier.
It should just be a matter of finding the right regex expression
I think this would do it:
import re
expression = "sections[0]['sensed_mode_str'] != 'CAR' && !pointIsWithinBounds(end_loc['coordinates'], [[-105.153, 39.745], [-105.150, 39.743]])"
expression = expression.replace('&&', 'and')
expression = expression.replace('||', 'or')
expression = re.sub(r"!(?!=)", "not ", expression)
results in:
sections[0]['sensed_mode_str'] != 'CAR' and not pointIsWithinBounds(end_loc['coordinates'], [[-105.153, 39.745], [-105.150, 39.743]])
the regex expression `!(?!=)` matches any `!` that isn't followed by `=`
I think this would do it:
That worked perfectly! Thank you!
I now have more accurate `quality_text` for each of the charts. For the current conditions with the data snapshot I have, it is showing 0 trips for ev return, 73 for ev roaming, and 169 for gas car - this seems about right for the snapshot. So the denominator of the quality text is now accurate on a per-survey level, but won't be 100% accurate to a per-question level. This is because some questions could show conditionally, so there could be cases where there were 15 instances that met the survey conditions, but only 5 that met the conditions for a specific question based on previous answers.
Since the conditions will be in the xml, it would be possible to determine the denominator dataframe for quality text on a per-question basis. For example, if the condition is that question 3 had to be answered "yes", I could pull all rows from the trip dataframe where the answer to question 3 is "yes". The technique for this would probably be similar to what I'm doing now for the other eval strings, but I'd have to piece these together dynamically based on the xml. I'm going to set this aside and make sure the base functionality is polished and working with both datasets, then I can come back to ironing out more details in the quality text.
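For the question-3 example, the per-question denominator would be something like this (a sketch; `survey_trips` and the column name are placeholders for however the responses end up flattened):

```python
# trips where question 3 was answered "yes" form the denominator for any question
# that is only shown when question 3 is "yes"
denominator_df = survey_trips[survey_trips["question_3"] == "yes"]
quality_denominator = len(denominator_df)
```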
So the denominator of the quality text is now accurate on a per-survey level, but won't be 100% accurate to a per-question level.
I have made a first attempt at this, and was somewhat successful in using the conditions scraped from the xml. However, when I talked with @JGreenlee this morning to ask about some of the other xml conditions, he pointed out that since all questions are usually required - if a conditionally shown question was shown - it was answered. I mapped this out a little bit and it would mean that the "denominator" of the quality text would work something like:
While it is true that a required conditional question would be answered 100% of the times that it is shown, I'm not convinced that this is useful information. If we were in a place to predict survey responses, then maybe we'd know how many trips most likely should have seen this survey question, but that's not something we can do right now.
So, question for the group, is it better to have:
A. "too broad" of a denominator, just showing how many trips the question could have been relevant to, knowing that some of those trips may not have seen the question even if they responded to the survey given the conditional nature
--OR--
B. "narrow" denominator, where conditional questions seem to be answered 100% of the time, but only because they are only shown once someone is already answering the survey and will be required to if it is shown
@iantei @shankari - what do you think?
I just talked to @iantei and we're both leaning to option A from above, since B is not really useful information. We also discussed that finding a way to note that a question is conditional (or the precise condition) in the quality text would be useful. If this can be done easily, I could do it now, else it might be a good candidate for future work.
Did a bit of initial testing with the prod .yml configuration today, and things look promising! Sample output below:
Washington Commons default charts - only one survey, so consistent denominator:
DFC-Fermata default charts - multiple surveys, so multiple denominators:
Both datasets are also working with all of the "sensed charts", but most of these are subject to change once I merge in the stacked bar chart changes. There was one error that I encountered with the "sensed trips under 80%" chart, since there was no data (all data is test data, and `included_test_users` is configured to `False`) - I will keep an eye out for this error once the stacked bar chart changes are merged.
@Abby-Wheelis I agree that we should go with A. We can clarify this in the text to indicate that it is the number of surveys that were displayed instead of the number of questions. Effectively, it is not that conditional questions go unanswered; it is just that they are not answered because they are not displayed.
Today I merged from upstream, picking up the unified color scheme changes! This is going to apply a little differently to the survey situation, though. We know we don't want the same color to repeat within a single chart, that there are probably too many different possible responses to just give every single one a different color, and it would be nice if the questions that DO have the same responses - ie likert-scale questions - had the same color mapping.
To achieve this:
My next step is to try merging all of the answers into a single list and mapping colors based on that, to achieve "same answers have the same color" and prevent repeated colors in one chart!
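A small sketch of that idea (the question/answer lists and the tab20 colormap below are placeholders, not the actual survey content):

```python
import matplotlib.pyplot as plt

# hypothetical question -> answer-label mapping, for illustration only
questions = {
    "How satisfied were you?": ["Disagree", "Neutral", "Agree"],
    "Would you travel this way again?": ["Disagree", "Neutral", "Agree"],
    "Trip purpose": ["Work", "Shopping", "Recreation"],
}

# one de-duplicated list of every possible answer across all questions
all_answers = []
for labels in questions.values():
    for label in labels:
        if label not in all_answers:
            all_answers.append(label)

# assign each distinct answer a stable color, so the same answer is the same color
# in every chart and no single chart repeats a color
cmap = plt.get_cmap("tab20")
color_map = {answer: cmap(i % cmap.N) for i, answer in enumerate(all_answers)}
```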
Merging all of the answers into a single list seems to have worked!
This does not work for questions where users were able to select multiple responses; we'll need a workaround for that next.
Initial solution for select multiples - averaging the colors together. I feel this is likely to result in a brown every time, so I think further experimentation is needed to see if this works all the time.
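The averaging itself is straightforward (a sketch; real inputs would be the RGBA tuples from the color map above):

```python
# average each RGBA channel across the selected options' colors
def average_color(colors):
    return tuple(sum(channel) / len(colors) for channel in zip(*colors))

# e.g. pure red averaged with pure blue comes out purple
print(average_color([(1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 1.0, 1.0)]))  # (0.5, 0.0, 0.5, 1.0)
```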
Adjusted some colors today and tested with the datasets. When running the notebooks at the command line, I encountered an error that I have not been able to resolve:
I have not been able to replicate it when running the notebooks interactively, so I'm not sure what's going on. Wrapping up work for the day, but documenting this here for when I start back tomorrow.
Picking the error back up this morning, I was able to replicate it in the notebook:
Documenting current to-dos from previous PR's and things we've noticed on staging, to be addressed week of May 27th:
Couple of others:
for likert questions, use the related text ("neutral" or "strongly agree") instead of "3" or "5"
In order to do this, I will have to find a workaround for "2" and "4": if I just switched to using the text, then it would be "Strongly disagree", "-", "Neutral", "-", "Strongly agree" and we can't tell "agree" from "disagree". I had circumvented this issue by using the value instead of the label, but I can definitely explore other options (see the sketch after this list).
if there are no labeled trips, but there are sensed trips, we get a blank chart because the exception is generated while pre-processing for the first bar (e.g. DDOT before we got the 4 labeled trips)
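The likert fallback mentioned above could be as simple as this (a sketch, using the 5-point example from earlier in the thread):

```python
# show the text label when one exists, and fall back to the numeric value for the
# space-saving "-" placeholders, so "2" and "4" stay distinguishable
def display_label(value, label):
    return label if label != "-" else value

labels = ["Strongly disagree", "-", "Neutral", "-", "Strongly agree"]
values = ["1", "2", "3", "4", "5"]
print([display_label(v, l) for v, l in zip(values, labels)])
# ['Strongly disagree', '2', 'Neutral', '4', 'Strongly agree']
```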
Mocked this up today by setting `expanded_ct` to blank, then added the lambda function for aggregation, etc. to the list of parameters, to be called on the df in the plot function; this seems to work well.
Next up is carrying this change through the rest of the code!
In the fleet version of OpenPATH, the metrics we would like to display on the dashboard are different than what is displayed when we have simple mode and purpose labels:
To access survey data in the public dashboard we will need to:
- `scaffolding.py`?

Challenges:
In my mind, the trick here is going to be a data-driven approach where we are able to show data based solely on the data we have (and probably the survey part of the config) and allow for as many different survey questions as we can. Can we set it up so that all of the survey questions that are multiple choice in nature are able to be displayed on the dashboard?
I see the fleet adaptation taking two main steps: