Closed shankari closed 3 years ago
To give some additional context: e-mission is currently written in Angular 1, usually called AngularJS. It is also a reactive framework: you make changes to $scope variables in a controller, and those changes are automatically reflected in the HTML.
So concretely, if I want to display a username, I can have <h3>{{username}}</h3> in the HTML, set $scope.username = "Jenna"; in the JavaScript controller, and it will show up. It also has some additional tags to make UI coding easier. For example (syntax may be a bit off; high-level workflow only):
<ion-list>
  <ion-item ng-repeat="trip in tripList">{{trip.start_fmt_time}} -> {{trip.end_fmt_time}}</ion-item>
</ion-list>
Then, if the JavaScript has $scope.tripList = [{start_fmt_time: "...", end_fmt_time: "..."}, {...}, {...}, ...], the list will show all the trips.
If you have built the native version of the app (the second set of instructions on the README), you should be able to open this app in your IDE, and on your phone if you like.
If you have built the UI-only change version of the app (first set of instructions in the README), you need to install the e-mission-devapp in an emulator and then connect to the live-reloading dev server. Please submit a PR to clarify this in the README.
The previous attempts at creating a graph of common trips are generally in https://github.com/e-mission/e-mission-server/tree/master/emission/analysis/modelling/tour_model
The original goal was to create a graph of people's regular patterns, similar to
We called it a tour model because that is apparently what such models are called in the travel behavior literature.
The data representation is (I think) at https://github.com/e-mission/e-mission-server/blob/master/emission/analysis/modelling/tour_model/tour_model.py, which is a classic graph data structure. I am not sure that it is the correct data structure to use now.
This has some wrapper objects: https://github.com/e-mission/e-mission-server/blob/master/emission/analysis/modelling/tour_model/tour_model_matrix.py
And this is how the matrix was created: https://github.com/e-mission/e-mission-server/blob/master/emission/analysis/modelling/tour_model/create_tour_model_matrix.py
Note that this is not incremental, and we will almost certainly need incremental creation in order to avoid overwhelming the server over a two year data collection period.
On the phone, this data was retrieved and displayed in the UI shown above.
The related UI code is at: https://github.com/e-mission/e-mission-phone/tree/master/www/js/common
We just read it from the local cache; it was saved to the cache in the OUTPUT_GEN step of the pipeline: https://github.com/e-mission/e-mission-server/blob/master/emission/net/usercache/builtin_usercache_handler.py#L204
Ideas:
group people by how regular their travel patterns are (what % of trips are novel)
walking the dog
going to friends' house
other examples
Critical from an energy standpoint is mode, replaced_mode or distance.
Prompt every day up to once a week
Hypothetical: use same route while walking dog
Minimum user interaction would just set the labels for the trips on the server as part of the algorithm
More user interaction would involve confirmation from the user (taking a sample of trips that were labeled by algorithm, display to user for one-item feedback). This can be another way of quickly getting user confirmation for groups of trips without having to label each one, but without relying wholly on inference.
If the algorithm was wrong and the user re-labels, the next round will re-cluster. No need to ask users for additional input using a micro-survey etc.
As few opportunities for open response as possible; reduce the other category as much as possible.
Walking the dog: walk, recreation/exercise, no travel
drove_alone, shared_ride
Problem: cannot distinguish between them. Only factor is through user feedback.
Suggestion: both drove_alone and shared_ride in cluster. Look at the proportion of labels and assign to new trips accordingly. So over the course of a month or so, the energy impact will be accurate.
Need to balance user input and accuracy.
But would we then show these to the user for confirmation as well? Show intermittently: "if people used to carpool but got tired of their travel companions and stopped, need to know"; show this maybe once a month. Tricky because assigning from a distribution means that individual trips won't be accurate, but asking "how many times do you think you carpooled in the past month?" is also subject to recall bias.
Ask for a week's worth of data every few months. This might be good not just for these mixed clusters but for all trips in general. Go back to primary data collection mode every few weeks.
When we are in "secondary data collection mode" will we still ask for confirmation of automatically assigned labels? Yes, to some extent until we gain confidence in the algorithm assessment.
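A rough sketch of the proportional-assignment suggestion for mixed drove_alone/shared_ride clusters above. The function names and the example counts are hypothetical, and this only illustrates the idea that individual trips may be wrong while the aggregate over a month comes out about right.

```python
import random
from collections import Counter


def label_proportions(cluster_labels):
    """Return the fraction of confirmed trips in a cluster carrying each label."""
    counts = Counter(cluster_labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def assign_label(cluster_labels, rng=random):
    """Draw one label for a new trip, weighted by the cluster's label proportions."""
    proportions = label_proportions(cluster_labels)
    labels = list(proportions)
    weights = [proportions[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


# Hypothetical mixed cluster: 3 confirmed drove_alone trips and 7 shared_ride trips.
mixed_cluster = ["drove_alone"] * 3 + ["shared_ride"] * 7
rng = random.Random(0)  # seeded only so that the sketch is repeatable
new_trip_labels = [assign_label(mixed_cluster, rng) for _ in range(1000)]
```

Over many new trips the share of shared_ride labels should approach the share seen in the confirmed trips, which is what matters for the energy estimate.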
For analysis alone, we want to assign labels based on the clusters that @corinne-hcr has built so far and evaluate accuracy. This does not appear to be a very heavy lift. As part of system integration, however, we need to use those clusters as a model. In other words, we need to:
We need to come up with a data structure to store the clusters. This might be as simple as a list of lists e.g.
{cluster1: [tripid1, tripid2, tripid3,...],
cluster2: [tripid4, tripid5, tripid10,...],
}
When should we update this model? We will create it after an initial intensive "primary data collection period" but then we will go to "secondary data collection" for a few months in which we ask for confirmation for trips but not full trip labeling. Do we assume that people will confirm and correct the majority of their trips during the secondary data collection period?
This might be a question to ask @andyduvall to weigh in on.
If we rebuild the model every week, we may want to consider incremental updates in which we only add new trips to existing clusters, but that might be hard to do. We may want to wait on that and treat it as a future performance enhancement.
As we get new unlabeled trips, we need to access the stored cluster model, and assign labels according to the existing clusters. This will likely involve a more complicated data structure because just storing the trip ids may not be enough to figure out which cluster a new trip should fall into. We may need to store some cluster level attributes like distance/duration cutoffs so we can efficiently determine how to match the trip.
This does need to be incremental, in that we cannot recluster every time we label a new trip. We must use stored clusters. This is because we get ~5-10 trips a day per person and the clustering algorithm can be space and time intensive. I turned off the tour model earlier because it was too slow (need to get some numbers). Note also that @corinne-hcr cannot run two versions of the pipeline side by side while restructuring on her laptop. And rebuilding the model every trip will result in rebuilding multiple times a day for each user.
Note also that there is a latency issue with taking too long - if the user looks at the diary and we haven't run the inference yet, it will look like there were no labels and the algorithm is "not working".
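To make the "stored clusters with cluster-level attributes" idea above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the Cluster fields, the distance/duration cutoffs, and the function names are hypothetical, and a real matcher would presumably also compare start and end locations.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One stored cluster: member trips plus summary attributes used for matching."""
    trip_ids: list
    min_distance_m: float
    max_distance_m: float
    min_duration_s: float
    max_duration_s: float
    labels: dict = field(default_factory=dict)  # e.g. {"mode_confirm": "walk"}


def match_cluster(model, distance_m, duration_s):
    """Return the id of the first cluster whose cutoffs contain the trip, else None."""
    for cluster_id, cluster in model.items():
        if (cluster.min_distance_m <= distance_m <= cluster.max_distance_m
                and cluster.min_duration_s <= duration_s <= cluster.max_duration_s):
            return cluster_id
    return None


def labels_for_new_trip(model, distance_m, duration_s):
    """Look up labels for a new trip from the stored model without re-clustering."""
    cluster_id = match_cluster(model, distance_m, duration_s)
    return model[cluster_id].labels if cluster_id is not None else {}


# Hypothetical stored model; ids, cutoffs, and labels are made up for illustration.
model = {
    "cluster1": Cluster(["tripid1", "tripid2", "tripid3"], 800, 1500, 600, 1200,
                        {"mode_confirm": "walk", "purpose_confirm": "exercise"}),
    "cluster2": Cluster(["tripid4", "tripid5", "tripid10"], 9000, 12000, 900, 1800,
                        {"mode_confirm": "drove_alone", "purpose_confirm": "work"}),
}
```

Matching a new trip is then a scan over the stored summaries, which is cheap enough to run per trip; only the model rebuild needs the expensive clustering step.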
System design of labeling trips:
[...] (INFER_LABELS) that works with the current set of trips and fills in the labels
[...] infer_labels module + method which will read in the stored model and match the incoming trip to it and return the labels
[...] infer_labels.infer_labels [...] infer_labels.infer_label(trip) [...], e.g. https://github.com/e-mission/e-mission-server/blob/gis-based-mode-detection/emission/analysis/classification/inference/mode/rule_engine.py#L109
[...] confirmed_trip object (in emission/core/wrapper) to support this.
Note that we may want to use different clusterings from different folds, generate labels for each clustering and determine the confidence of non-mixed clusters (not mixed shared_ride and carpool) depending on whether all the models return the same labels (e.g. even for the "work" part of the example we have been using all along).
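One possible reading of the "different clusterings from different folds" note above, as a sketch only: run each fold's model on the trip, then treat the fraction of models that agree as the confidence. The function name and the input format (one label dict per model, or None when a model finds no match) are assumptions, not an existing e-mission API.

```python
from collections import Counter


def combine_fold_predictions(predictions):
    """Combine label predictions from several fold-specific models.

    `predictions` holds one label dict (or None for "no match") per model.
    Returns (labels, confidence), where confidence is the fraction of models
    that agree on the most common label set.
    """
    if not predictions:
        return None, 0.0
    # Dicts are unhashable, so compare predictions as sorted tuples of items.
    keys = [tuple(sorted(p.items())) if p else None for p in predictions]
    top_key, votes = Counter(keys).most_common(1)[0]
    if top_key is None:
        return None, 0.0
    return dict(top_key), votes / len(predictions)
```

If every model returns the same labels the confidence is 1.0; disagreement or missing matches lower it, which could feed the same kind of threshold that decides whether a label is shown as inferred.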
Current "label screen" javascript is here https://github.com/e-mission/e-mission-phone/blob/master/www/js/diary/infinite_scroll_list.js
In particular, note that Timeline.readAllConfirmedTrips(currEnd, ONE_WEEK).then((ctList) should return all confirmed trips for the last week, and if you print them out, or use a debugger to view ctList, you should be able to see your new fields.
Here's where we embed that data structure in the UI: https://github.com/e-mission/e-mission-phone/blob/master/www/templates/diary/infinite_scroll_list.html
Notably, <div ng-repeat="input in userInputDetails" class={{input.width}} style="text-align: center;" ng-attr-id="{{ 'userinput' + input.name }}"> iterates through all the user inputs and displays them in {{trip.userInput[input.name].text}}
@GabrielKS, can you put in your high level design into this issue? @PatGendre, who is deploying the platform in La Rochelle is interested. @asiripanich may be too. As deployers, as opposed to end-users, they are more likely to give feedback on the configuration options 😄
@PatGendre @asiripanich @shankari I've attached the second draft of my trip label inference system UI proposal in PDF and Word form; I've pasted the executive summary below.
UI Draft 2.pdf UI Draft 2.docx
Behind the scenes, a server-side trip inference algorithm will produce a data structure that comprises, for each trip, a list of label sets with probabilities. This allows a client-side algorithm to refine the inference as the user manually confirms or corrects labels; it will also aid in analysis when the user has not confirmed or corrected labels. Below is an example of what this data structure might look like.
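As a hedged illustration of such a structure (the field names "labels" and "p", the label values, and both helper functions are assumptions for this sketch, not the format from the proposal), each trip could carry a list of label sets with probabilities, and the client could narrow that list as the user confirms individual labels:

```python
# Hypothetical inference result for one trip: each entry pairs a label set
# with the probability the server assigns to it. Values are made up.
inferred_labels = [
    {"labels": {"mode_confirm": "drove_alone", "purpose_confirm": "work"}, "p": 0.3},
    {"labels": {"mode_confirm": "shared_ride", "purpose_confirm": "work"}, "p": 0.5},
    {"labels": {"mode_confirm": "walk", "purpose_confirm": "exercise"}, "p": 0.2},
]


def refine(label_sets, user_input):
    """Keep only label sets consistent with what the user has confirmed so far."""
    return [
        entry for entry in label_sets
        if all(entry["labels"].get(k) == v for k, v in user_input.items())
    ]


def most_probable(label_sets, label_type):
    """Sum probabilities per value of one label type and return the best (value, p)."""
    totals = {}
    for entry in label_sets:
        value = entry["labels"].get(label_type)
        if value is not None:
            totals[value] = totals.get(value, 0.0) + entry["p"]
    if not totals:
        return None, 0.0
    best = max(totals, key=totals.get)
    return best, totals[best]
```

For example, once the user confirms purpose_confirm as "work", refine drops the walk/exercise entry, and most_probable on the remaining entries picks shared_ride as the mode.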
To integrate the inference system into the phone app user interface, we will make three main changes. First, we display unfilled labels in red, inferred labels in yellow, and manually filled-out or verified labels in green. Second, we add a To Label view on the Label screen that displays only trips that users are expected to manually provide input on, according to the criteria below. Third, we change when we notify users as detailed below. We will also make some smaller UI changes to improve usability, including adding a confirm button, a feature to label many of the same type of trip at once, and a map of trips. Below is a screenshot of progress so far, showing the red/yellow/green color scheme and the confirm button.
Expectations and notification behavior will be highly configurable to support many use cases. The basic idea of this configuration is that for each label category, comprising red labels and yellow labels with varying degrees of certainty, study administrators will be able to select an expectation setting and a notification setting. The user could be expected to label every trip in a given category, none of them, or some random sample in between. The user could be notified after every trip in a given category, at the end of the day, or less frequently.
Study administrators will also be able to configure a “primary mode,” in which user input expectations are high, and a “secondary mode,” in which less is demanded of the user; the app can automatically cycle between primary and secondary modes according to a configurable schedule.
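To give a feel for how such a configuration might hang together, here is a small sketch. Every key, value, and function name in it is hypothetical and chosen only for illustration; it is not the format used in the full proposal.

```python
# Hypothetical configuration; every key and value below is an assumption
# made for illustration, not the format used in the full proposal.
config = {
    "schedule": {"primary_days": 7, "secondary_days": 21},
    "modes": {
        "primary": {
            "red": {"expect_fraction": 1.0, "notify": "after_trip"},
            "yellow": {"expect_fraction": 1.0, "notify": "end_of_day"},
        },
        "secondary": {
            "red": {"expect_fraction": 1.0, "notify": "end_of_day"},
            "yellow": {"expect_fraction": 0.1, "notify": "weekly"},
        },
    },
}


def collection_mode(day_index, schedule):
    """Return "primary" or "secondary" for a 0-based day since enrollment."""
    cycle = schedule["primary_days"] + schedule["secondary_days"]
    return "primary" if day_index % cycle < schedule["primary_days"] else "secondary"


def settings_for(day_index, category, cfg=config):
    """Look up the expectation and notification settings for a label category."""
    return cfg["modes"][collection_mode(day_index, cfg["schedule"])][category]
```

With this schedule, days 0 through 6 are in primary mode, days 7 through 27 are in secondary mode, and the cycle repeats from day 28.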
Please see the full proposal for a complete configuration example and many more details. Please post or send me any feedback you may have!
@jf87 not sure if you are planning to use labels
@GabrielKS thanks a lot for the explanations :-)
Here are a few questions and thoughts:
@PatGendre Thanks for the thoughts; I'll do my best to respond given my limited knowledge of the e-mission system outside of the areas I'm directly working with:
[...] mode_confirm=unknown, you would just have entries for all the modes with equal probabilities, so I suppose it is a special case in a sense, and that's what I'd expect @corinne-hcr's analysis algorithm to do. I'll note that it's rare for us to truly have no idea what the mode is if we look at all the data: if the user is moving quickly, we can at least assign walking a lower probability.
[...] drove_alone and 70% are shared_ride). This is one area where @andyduvall's primary/secondary data collection mode idea could be useful: we ask the user to label all their trips during the primary mode to get a sense of the frequencies and then allow those labels to remain yellow during the secondary mode.
@PatGendre to summarize: the current feature works on trip-level labels. We currently have trip-level user labels, but even after we switch to section-level user input for the mode, we will still want to have trip-level purpose labels. So this is not dependent on merging the A-Mission code.
If/when we do have section-level user inputs, we can expand this functionality to the section level as well.
@shankari @GabrielKS thanks for your detailed answers, it is pretty clear now :-) and definitely a very promising feature!
Tracking initial staging deployment at: https://github.com/corinne-hcr/e-mission-server/pull/2
@PatGendre @jf87 @asiripanich Initial deployment complete, in beta testing.
Picking tuning parameters is being tracked at: https://github.com/e-mission/e-mission-docs/issues/656
Most recent set of UI changes, which can give you a sense of how the feature works, is being tracked at: https://github.com/e-mission/e-mission-phone/pull/772
@GabrielKS I am writing down all the potential issues for next week here so (a) I don't forget them and (b) we can prioritize them.
Please let me know if there's anything else in your notes that I am missing.
UI + expectations:
Trip matching and modeling:
High-level end to end feature:
A staging user claimed that the All Unlabeled tab was not scrolled to the bottom when they switched to it. I have been unable to reproduce this issue, but it might bear some more investigation.
I wonder if this is the same as https://github.com/e-mission/e-mission-docs/issues/658. That user says that it happens after they finish a trip; the trips for that day usually show up in the middle.
Another usability question: What happens if you click the green checkmark and not all of them are auto-labeled? Some of them are red. I don't want to click it and mess something up.
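The confirm button iterates through all label types and, for each, populates the user input with the client-computed inference iff an inference exists, the user has not already provided an input for that label type, and the inferred value is not "other":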
for (const inputType of ConfirmHelper.INPUTS) {
    const inferred = trip.finalInference[inputType];
    // TODO: figure out what to do with "other". For now, do not verify.
    if (inferred && !trip.userInput[inputType] && inferred != "other") $scope.store(inputType, inferred, false);
}
So the explanation I've been giving to users, "the confirm button turns yellow labels green" is exactly correct; red labels are simply ignored.
However, I've just realized that the confidence threshold the client has been using is not the one in the config file — the config file is on the server side and I never wrote the code to send it over — it's been using a placeholder value of 0.5. This explains why we've been seeing so many trips with mixed yellow and red labels! I will fix that ASAP. (This affects both what is displayed and what is confirmable as the logic only happens once, so at least it's consistent.)
// Display a label as red if its most probable inferred value has a probability of less than or equal to confidenceThreshold
// TODO: make this configurable
const confidenceThreshold = 0.5;
// [...]
// Apply threshold
if (max.p <= confidenceThreshold) max.labelValue = undefined;
I know, and I told the user verbally. But wanted to record their initial response for you to think about UX improvements.
Ah, okay. Didn't get that that was the user talking.
@GabrielKS Another issue that came up with at least two users today: they expected to see the auto-labeling in the diary and when it didn't show up, they thought that the auto-labeling didn't work. One of the users went to the diary after they saw that there were no trips in "To Label", even at the end of the week 😦
This sounds deceptively easy, but is actually going to require a fair amount of rewriting, because the diary retrieves data using a completely different API call than the label screen. I had thought about merging the diary and label screens, but was hoping to not tackle that just yet.
Maybe we can do a minimal rewrite in which we retrieve the confirmed trips in addition to the user labels as currently retrieved. Then we can read the inferred labels from there.
@GabrielKS Jeanne also brought up displaying only recent trips to users. Unfortunately, instead of only showing trips from the last n days, she wants to start with a clean slate on 16th Aug and then display all trips. Given that requirement, I think that maybe the easiest option is to just hardcode that into the client instead of putting it into the server since it doesn't seem super general. We would not cherry-pick that change into master, obviously.
Filed https://github.com/e-mission/e-mission-docs/issues/662 to track the UI issues that came up today.
Filed https://github.com/e-mission/e-mission-docs/issues/663 to track the "confidence too high" issue from https://github.com/e-mission/e-mission-eval-private-data/pull/28#issuecomment-894704661
Closing this for now. We can track any pending problems in separate issues.
This tracks the tasks required to actually close the loop on user interaction with labels.