Abby-Wheelis opened this issue 2 months ago
what is "large enough proportion"?
My initial thought of what would be a "large enough" proportion of user-entered labels might be "if they're big enough to not get grouped into 'Other' anyway", which I suppose could result in a lot of bars that are just barely bigger than our cutoff (2%), so we would probably want to do some experimenting.
We could also use the cutoff for the numbers to be printed in the bar, which is currently 4%.
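To make the experimenting concrete, here is a minimal sketch of the grouping step, assuming label counts live in a pandas Series; the `group_small_labels` helper and the example labels are hypothetical, and the 2% cutoff is the one mentioned above:

```python
import pandas as pd

# Hypothetical helper: fold user-entered labels whose share falls below
# the grouping cutoff (2% here) into "Other", mirroring the existing
# chart behavior discussed above.
def group_small_labels(label_counts: pd.Series, cutoff: float = 0.02) -> pd.Series:
    proportions = label_counts / label_counts.sum()
    small = proportions < cutoff
    grouped = label_counts[~small].copy()
    grouped["Other"] = grouped.get("Other", 0) + label_counts[small].sum()
    return grouped

counts = pd.Series({"walk": 50, "bike": 30, "e-scooter": 1, "roller skate": 1})
print(group_small_labels(counts))  # e-scooter and roller skate fold into "Other"
```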
what if the text entered by users is subtly different?
We can easily take care of things like capitalization and spaces, but other factors like spelling or subtly different wording (roller skate vs. roller blade) might be difficult to control for.
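For the capitalization/whitespace part plus a conservative fuzzy match, something like the sketch below could work; note that the diff later in this thread shows scaffolding.py already imports difflib. The `normalize_label` helper and the 0.85 similarity cutoff are illustrative, not existing code:

```python
import difflib

# Illustrative helper: normalize casing/whitespace, then fold a label into
# a known label only when the similarity is high. "roller blade" vs
# "roller skate" stays distinct at this cutoff (ratio 0.75), which is
# exactly the hard case described above.
def normalize_label(label, known_labels, cutoff=0.85):
    cleaned = " ".join(label.lower().split())
    matches = difflib.get_close_matches(cleaned, known_labels, n=1, cutoff=cutoff)
    return matches[0] if matches else cleaned

known = ["roller skate"]
print(normalize_label("Roller  Skate", known))  # -> "roller skate"
print(normalize_label("roller blade", known))   # -> "roller blade" (kept distinct)
```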
Additional cleanup required: the patches to prevent both "Other" and "OTHER" are messy.
We do not have INVALID in e-mission-common base_modes.py's list of BASE_MODES. We will need to add INVALID to that list as we move to using BASE_MODES for mapping labels to the color palette.
The patch added INVALID as a sensed mode, not as a base mode, so I'm not sure we need to add it to the base modes, unless we discover a need for an INVALID base mode as well as a sensed mode.
```python
sensed_values = ["WALKING", "BICYCLING", "IN_VEHICLE", "AIR_OR_HSR", "UNKNOWN", "OTHER", "INVALID"]
```
Yes, INVALID is added as a sensed mode. However, we are either mapping internal labels to the base mode, or using the base mode as-is with sensed_values. Therefore, we would need to add INVALID as a base mode.
```python
colors_sensed = dict(zip(sensed_values, [BASE_MODES[x.upper()]['color'] for x in sensed_values]))
```
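Until INVALID is added to BASE_MODES in e-mission-common, a defensive version of this mapping could avoid the KeyError. This is a sketch only, assuming BASE_MODES remains a dict of mode name to a dict with a 'color' key (as in the line above); the UNKNOWN fallback is an assumption:

```python
import emcommon.diary.base_modes as emcdb

sensed_values = ["WALKING", "BICYCLING", "IN_VEHICLE", "AIR_OR_HSR", "UNKNOWN", "OTHER", "INVALID"]

# Fall back to UNKNOWN's color for any sensed value (e.g. INVALID) that is
# not yet in BASE_MODES, instead of raising a KeyError.
colors_sensed = {
    mode: emcdb.BASE_MODES.get(mode.upper(), emcdb.BASE_MODES["UNKNOWN"])["color"]
    for mode in sensed_values
}
```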
Additional cleanup coming out of #148 related to fleet and survey deployments:
- survey_metrics and generic_metrics_sensed - there does not seem to be a reason to replace the sensed mode with the BLE sensed mode in two separate notebooks. I think we can add the new charts from survey_metrics to generic_metrics_sensed, since they do display sensed modes, and then not show them in the dashboard for label-based deployments, since those charts really just replace those in generic-metrics (see #148 discussion_r1766009942 and #148 issuecomment-2360045768)

More cleanup from #145:
- NaN inferred values here

I am merging this for now, but note that there is a similar function in the server when we handle metrics to include trips with confidence > 90% as "confirmed". We do not and should not need to reinvent code.
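For reference, a minimal sketch of that "confidence > 90% counts as confirmed" filter, assuming each trip row has a user_input dict and an inferred_labels list of {"labels": ..., "p": ...} candidates; the column names follow this thread, but the helpers themselves are hypothetical, not the server functions:

```python
import pandas as pd

CONFIDENCE_THRESHOLD = 0.9  # "confidence > 90% counts as confirmed"

def max_inferred_p(inferred_labels):
    # Highest probability among a trip's inferred label candidates
    if not isinstance(inferred_labels, list) or len(inferred_labels) == 0:
        return 0.0
    return max(candidate["p"] for candidate in inferred_labels)

def filter_confirmed(trips_df: pd.DataFrame) -> pd.DataFrame:
    # A trip counts as confirmed if the user labeled it...
    has_user_input = trips_df["user_input"].apply(
        lambda ui: isinstance(ui, dict) and len(ui) > 0)
    # ...or if some inferred candidate clears the confidence threshold
    confident = trips_df["inferred_labels"].apply(max_inferred_p) >= CONFIDENCE_THRESHOLD
    return trips_df[has_user_input | confident]
```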
Made some code changes:
```
ashrest2-41625s:em-public-dashboard ashrest2$ git diff viz_scripts/scaffolding.py
diff --git a/viz_scripts/scaffolding.py b/viz_scripts/scaffolding.py
index 880c819..a20b13c 100644
--- a/viz_scripts/scaffolding.py
+++ b/viz_scripts/scaffolding.py
@@ -7,6 +7,7 @@ from collections import OrderedDict
 import difflib
 import emission.storage.timeseries.abstract_timeseries as esta
+import emission.storage.decorations.trip_queries as esdt
 import emission.storage.timeseries.tcquery as esttc
 import emission.core.wrapper.localdate as ecwl
 import emcommon.diary.base_modes as emcdb
@@ -239,8 +240,8 @@ async def load_viz_notebook_inferred_data(year, month, program, study_type, dyna
     # Access database
     tq = get_time_query(year, month)
     participant_ct_df = load_all_participant_trips(program, tq, include_test_users)
-    inferred_ct = filter_inferred_trips(participant_ct_df)
-    expanded_it = expand_inferredlabels(inferred_ct)
+    inferred_ct = esdt.has_final_labels_df(participant_ct_df)
+    expanded_it = esdt.expand_finallabels(inferred_ct)
     expanded_it = await map_trip_data(expanded_it, study_type, dynamic_labels, dic_re, dic_pur)
```
Result: [side-by-side screenshots comparing the old implementation and the server-code implementation]
There is a difference between the implementation of my inferred-label trip bars and the one the server generates.

Firstly, the filtering logic:

Secondly, the expand_finalLabels logic:
@shankari Do you recommend following the server approach for the Labeled and Inferred bars in the Stacked Bar chart? I have used confidence_threshold to filter out trips after confirming there is no user_input, which is different from the approach of using the expectation column for the trip.
While executing the server approach, I encountered the below issue for cortezebikes; I will look more into it. There is no ['analysis/confirmed_trip'], therefore there is no expectation column.
```
Running at 2024-09-25T04:56:02.857039+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-25T04:56:10.199455+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Traceback (most recent call last):
File "/usr/src/app/saved-notebooks/bin/generate_plots.py", line 111, in
Well, upon further investigation, I found the following:
```python
# This is a placeholder. TODO: implement the real algorithm
def _get_expectation_for_trip(trip):
    raw_expectation = eace.get_expectation(trip)
    # For now, expect always labeling unless the config file specifies no labeling at all
    processed_expectation = not raw_expectation["type"] == "none"
    return {"to_label": processed_expectation}
```
Seemingly, the algorithm for expectation is not implemented yet, so it makes sense why it was giving the same result for both the "Labeled" and "Labeled and Inferred Labels" bars.
@iantei

> Seemingly, the algorithm for expectation is not implemented yet,

This is not correct. We do return a non-zero value, it is just not as sophisticated as we would like.
> While executing the server approach, I encountered the below issue for cortezebikes, will look more into it:

This is probably a data load error, since if we have cleaned trips, we have confirmed trips.
> I have used confidence_threshold to filter out after knowing there's no user_input, that's different than the approach of using expectation column for the trip.

The functionality in the server code and your code is similar but not identical.
As I said earlier:

> but note that there is a similar function in the server when we handle metrics to include trips with confidence > 90% as "confirmed"

The way in which the data is handled is similar but the filtering is different. You need to try handling the data in the same way but changing the filter at the end.
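One way to read that suggestion: keep the server's data handling (the esdt functions from the diff above are real) and make only the final filter pluggable. The `load_inferred_trips` wrapper and its `final_filter` parameter here are a hypothetical illustration, not existing code:

```python
import emission.storage.decorations.trip_queries as esdt

def load_inferred_trips(participant_ct_df, final_filter=None):
    # Same data handling as the server code
    inferred_ct = esdt.has_final_labels_df(participant_ct_df)
    expanded_it = esdt.expand_finallabels(inferred_ct)
    # Only the filtering at the end differs between the two approaches
    if final_filter is not None:
        expanded_it = expanded_it[expanded_it.apply(final_filter, axis=1)]
    return expanded_it
```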
> This is probably a data load error since if we have cleaned trips, we have confirmed trips

Yes, that's a data load error. I used the same has_final_labels_df(df) with an additional check:

```python
def has_final_labels_df(df):
    # Guard against an empty dataframe (e.g. when no confirmed trips were
    # loaded for the requested month) before applying the label filters
    if len(df) == 0:
        return df
```

It worked fine.
@shankari

`dropna()`

I have submitted PR #158 for review. Please let me know if you want anything done differently.
@iantei please discuss with @Abby-Wheelis when she returns on Monday. The 90% threshold is the filter. Changing the public dashboard to use the 90% threshold will not move the needle on inferred trips.
Observed in the dataset that all trips have expectation: {to_label: True}. This aligns with the comment in the server code, # For now, expect always labeling unless the config file specifies no labeling at all, under which all trips will be expected to be labeled.
Observed in testExpandFinalLabelsPandasFunctionsNestedPostFilter that the sample data does not expect labels if the trip is already labeled by the user or if the p value of a set of inferred labels is >= 0.9. However, we have not found where in the code the expectation to label is set to false if the trip is labeled or a p value greater than 0.9 is present. Should we update the server code to change to_label to false once user input or reasonably confident inferred labels are present? Or did we overlook the code where this already exists?
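If we did update the server, a hypothetical version of the placeholder quoted above might look like the following. The trip field access paths (user_input, inferred_labels) are assumptions based on this thread and are not verified against the server code:

```python
def _get_expectation_for_trip(trip):
    raw_expectation = eace.get_expectation(trip)
    # Config explicitly says no labeling at all
    if raw_expectation["type"] == "none":
        return {"to_label": False}
    # Already labeled by the user: no further labeling expected
    # (field path is an assumption, not verified against the server)
    if trip["data"].get("user_input"):
        return {"to_label": False}
    # A sufficiently confident inference (p >= 0.9) also counts
    inferred = trip["data"].get("inferred_labels", [])
    if inferred and max(candidate["p"] for candidate in inferred) >= 0.9:
        return {"to_label": False}
    return {"to_label": True}
```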
Found the relevant PR for expanding the labels here: #846. The commit which added the expectation to label was from 3 years ago, here.
It seems like using the confidence_threshold, as the phone does, and using the approach from the server should be treated as separate approaches. Ideally we would use the server approach, but the hangup with expectation.to_label makes that complicated.
Centralizing some cleanups that need to happen pending the unification of base mode colors:

- baseMode for "land trips" (see this comment)
- user mode display (comment)