Closed iantei closed 6 months ago
Unable to generate
Bar chart of Number of trips for each purpose (selected by users).
Reason: Number of trips is 0. Participant_with_at_least_one_labeled_trip is 0. Participants_with_at_least_one_trip is 0. Registered_participants is 15. Trips_with_at_least_one_label is 0. Trips_with_mode_confirm_label is 0. Trips_with_trip_purpose_label is 0. month is 11. year is 2020.
After
<html>
<body>
<h2>Unable to generate
Bar chart of Number of trips for each purpose (selected by users). Reason:</h2>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>value</th>
</tr>
</thead>
<tbody>
<tr>
<th>Number of trips</th>
<td>0</td>
</tr>
<tr>
<th>Participant_with_at_least_one_labeled_trip</th>
<td>0</td>
</tr>
<tr>
<th>Participants_with_at_least_one_trip</th>
<td>0</td>
</tr>
<tr>
<th>Registered_participants</th>
<td>15</td>
</tr>
...
Screenshot:
To lay the legend items out side by side, use the legend's ncol argument: https://stackoverflow.com/a/54870776
@Abby-Wheelis I'm thinking that we might want to keep the legend for the regular bar charts in the same place, but just increase the number of columns. For surveys, we may want to put the legend below the bar (to accommodate longer options) but still have only one column. We could specify that as an option to the plot function.
Thoughts?
Here's the option with 'lower right'
Here's the option with 'lower left' and the anchor at (1,0) which is not too bad
Here's the option with the ncols
Since this will likely change as we include survey results, and then the inferred results, I am going to punt on this. I will leave in the n_cols calculation, but switch to lower left and (1,0) to unblock us now.
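For reference, here is a minimal sketch of the two legend layouts discussed above. The helper name `legend_kwargs`, the `max_rows` threshold, and the `below_bar` flag are all hypothetical and not taken from the dashboard code; only `loc`, `bbox_to_anchor`, and `ncol` are real matplotlib legend arguments.

```python
import math


def legend_kwargs(n_labels, max_rows=5, below_bar=False):
    """Hypothetical helper: build keyword arguments for Axes.legend().

    max_rows and below_bar are illustrative assumptions, not dashboard options.
    """
    if below_bar:
        # Survey-style layout: a single column placed under the axes,
        # leaving room for long option text.
        return {"loc": "upper left", "bbox_to_anchor": (0, -0.15), "ncol": 1}
    # Regular bar charts: anchor the legend's lower-left corner at the
    # lower-right corner of the axes, adding columns as labels increase.
    n_cols = max(1, math.ceil(n_labels / max_rows))
    return {"loc": "lower left", "bbox_to_anchor": (1, 0), "ncol": n_cols}


def draw_demo(labels, values, path="legend_demo.png"):
    """Draw one stacked horizontal bar and attach the legend (needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so the helper above stays dependency-free

    fig, ax = plt.subplots(figsize=(10, 2))
    left = 0
    for label, value in zip(labels, values):
        ax.barh(["Labeled"], [value], left=left, label=label)
        left += value
    ax.legend(**legend_kwargs(len(labels)))
    fig.savefig(path, bbox_inches="tight")
```

Passing the layout as an option to the plot function, as suggested above, would then amount to forwarding something like `below_bar=True` for survey charts.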
We can't use pandas' barh because stacked doesn't seem to work easily
>>> expanded_ct.groupby("Mode_confirm").agg({distance_col: 'count'}).reset_index().set_axis(["label", "value"], axis="columns").plot.barh(x="label", y="value", stacked=True)
generates
In lieu of investigating this further, let's just try to fix the gray value. This is displayed to avoid RuntimeError: Unknown return type
I tried to set the xlim to 100, because we know that it will never be more than that. However, 100, 100.01 and 100.1 and 101 all returned the same error. Leaving this for the pandas investigation...
One final fix, because the quality text annoys me and is confusing IMHO. Saying "For Labeled & Sensed: Based on X confirmed trips from Y users of X' trips from Y' users" is technically correct but requires cognitive load to determine which set of numbers goes with which bar, and what is labeled and what is confirmed. Going to try to fix that before declaring that this is done.
The quality text is updated in the inferred changes! That puts it on each bar, where it says something like "Confirmed trips, 100 trips from 10 users 15%" or similar, like this: comment. I agree that this quality text is confusing, and I like what ended up on the inferred trips more. I think we had decided to hold off on that change in this PR to limit the changes here, but that was a while ago, when the timeline wasn't as crunched.
I don't want to change the base quality_text methods since they are used in other plots as well. Let's hack together a method to pull out only the fields that we want for now. The correct fix would be to return the quality text as a tuple and format it in the notebook, which is also why one should not put presentation layer logic in the libraries.
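A rough sketch of what such a hack could look like, assuming the quality text follows the "Based on N trips ... from M users" shape quoted elsewhere in this thread. The names `QualityCounts` and `parse_quality_text` are hypothetical, and the actual regular expression used in the PR may differ.

```python
import re
from typing import NamedTuple, Optional


class QualityCounts(NamedTuple):
    """Structured form of the quality text (hypothetical type)."""
    trips: int
    users: int


# Assumed text shape: "Based on <N> [confirmed ]trips[ (<cutoff>)] from <M> users"
_QUALITY_RE = re.compile(r"Based on (\d+) (?:confirmed )?trips.*? from (\d+) users")


def parse_quality_text(text: str) -> Optional[QualityCounts]:
    """Pull the first trip and user counts out of a quality-text string."""
    match = _QUALITY_RE.search(text)
    if match is None:
        return None
    return QualityCounts(trips=int(match.group(1)), users=int(match.group(2)))


counts = parse_quality_text("Based on 2728 trips from 13 users")
# The notebook can then format the counts however each plot needs.
bar_label = f"{counts.trips} trips from {counts.users} users"
```

Returning a tuple like this from the library in the first place would remove the need for the parsing step entirely.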
Used some regular expression magic hacking to pull these out. I now have
While testing this with include_test_users=True, I also found a regression in the scaffolding code.
def get_quality_text_sensed(df, cutoff_text="", include_test_users=False):
now has a cutoff_text argument. However, the invocation from scaffolding was still
quality_text = get_quality_text_sensed(expanded_ct, include_test_users)
So the boolean True was interpreted as the cutoff_text, and the text that was generated was "Based on 2728 trips (True) from 13 users".
After fixing that, this option seems to work.
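The failure mode is easy to reproduce in isolation. The sketch below uses a simplified stand-in (`quality_text_demo`, hypothetical) that takes counts instead of a dataframe, so it is not the real `get_quality_text_sensed`; it only mirrors the parameter order. It also shows one possible guard, keyword-only parameters, which is a suggestion and not something the PR does.

```python
def quality_text_demo(n_trips, n_users, cutoff_text="", include_test_users=False):
    """Simplified stand-in: same optional-parameter order as the real function.

    include_test_users is accepted but unused here; the real function uses it
    to decide which users to count.
    """
    cutoff = f" ({cutoff_text})" if cutoff_text else ""
    return f"Based on {n_trips} trips{cutoff} from {n_users} users"


# Old-style positional call: True lands in cutoff_text.
buggy = quality_text_demo(2728, 13, True)
# buggy == "Based on 2728 trips (True) from 13 users"

# Passing the flag by name avoids the mix-up.
fixed = quality_text_demo(2728, 13, include_test_users=True)
# fixed == "Based on 2728 trips from 13 users"


def quality_text_strict(n_trips, n_users, *, cutoff_text="", include_test_users=False):
    """Variant where the optional arguments are keyword-only (a suggestion)."""
    return quality_text_demo(n_trips, n_users, cutoff_text, include_test_users)
```

With the keyword-only variant, a stale call such as `quality_text_strict(2728, 13, True)` raises a `TypeError` instead of silently producing the wrong text.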
Made the appropriate changes to all the plots, although I kept some single-bar plots unchanged
Note that I had to remove the y-axis label "Trip Types" because: a) it didn't add anything beyond what the axis label already had; b) it was overlapping with some of the label text, particularly when including testers
Also does not crash for missing data
I am going to declare that this PR is done, at least to the extent that I am able to spend time on it at this time. Unless @iantei or @Abby-Wheelis has any objections, I plan to merge and push a release to staging
https://github.com/e-mission/em-public-dashboard/pull/128#issuecomment-2043336596 does look cool! We should revisit the quality text construction code when we implement inferred modes. I am not super happy with that code.
I have now tested with three different configurations:
All run without errors. @iantei has already tested the core code extensively before. Since I only restructured the code, I have tested that my changes generate the same results as his. I think this is ready to merge.
I have been debating a bit on whether to squash merge this or just merge it with a merge commit. Squash merging will have a cleaner merge history, but will also result in a significantly more complex giant PR and will also lose some important back and forth around the discussions in the structure.
Let's go ahead and non-squash merge (we didn't squash merge for the UI rewrite either). But we should really be more careful about the plethora of commits in the future.