e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

Common trip system building #647

Closed shankari closed 3 years ago

shankari commented 3 years ago

This tracks the tasks required to actually close the loop on user interaction with labels.

shankari commented 3 years ago

To give some additional context: e-mission is currently written in Angular 1, which is called AngularJS on the internet. It is a reactive framework in which changes you make to $scope variables in a controller are automatically reflected in the HTML.

So concretely, if I want to display a username, I can have <h3>{{username}}</h3> in the HTML, set $scope.username = "Jenna"; in the javascript controller and it will show up. It also has some additional tags to make the UI coding easier, so for example, (syntax may be a bit off, high level workflow only)

<ion-list>
    <ion-item ng-repeat="trip in tripList">{{trip.start_fmt_time}} -> {{trip.end_fmt_time}}</ion-item>
</ion-list>

then if the javascript has $scope.tripList = [{start_fmt_time: "...", end_fmt_time: "..."}, {...}, {...}, ...] then the list will show all the trips.
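A minimal sketch of the controller side of this example. The controller name TripListController and the timestamps are made up for illustration. In the app, AngularJS injects $scope; here a plain object stands in for it so the snippet runs on its own.

```javascript
// Hypothetical controller; in the app, AngularJS would inject $scope.
function TripListController($scope) {
  $scope.username = "Jenna";
  // Each entry supplies the fields that the template reads (made-up values).
  $scope.tripList = [
    { start_fmt_time: "2021-06-01T08:05", end_fmt_time: "2021-06-01T08:35" },
    { start_fmt_time: "2021-06-01T17:10", end_fmt_time: "2021-06-01T17:45" },
  ];
}

// Stand-in for the scope object, so this runs without AngularJS.
const scope = {};
TripListController(scope);

// Roughly what the ng-repeat template renders: one line per trip.
const rendered = scope.tripList.map(
  (trip) => `${trip.start_fmt_time} -> ${trip.end_fmt_time}`
);
console.log(rendered.join("\n"));
```

Adding or removing an entry in $scope.tripList is all the controller has to do; the framework takes care of updating the list in the HTML.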

shankari commented 3 years ago

If you have built the native version of the app (the second set of instructions on the README) you should be able to open this app in your IDE.

https://github.com/e-mission/e-mission-docs/blob/4fad0c4dd82893529bdc808fc5a8329f98ea7367/docs/dev/front/how_to_test_changes%20_to_a_plugin.md#2-then-open-the-project-in-your-ide

and on your phone if you like

https://codewithchris.com/deploy-your-app-on-an-iphone/

shankari commented 3 years ago

If you have built the UI-only change version of the app (first set of instructions in the README), you need to install the e-mission-devapp in an emulator and then connect to the live-reloading dev server. Please submit a PR to clarify this in the README.

shankari commented 3 years ago

The previous attempts at creating a graph of common trips are generally in https://github.com/e-mission/e-mission-server/tree/master/emission/analysis/modelling/tour_model

The original goal was to create a graph of people's regular patterns, similar to tour_model

We called it a tour model because that is apparently what such models are called in the travel behavior literature.

shankari commented 3 years ago

The data representation (I think) https://github.com/e-mission/e-mission-server/blob/master/emission/analysis/modelling/tour_model/tour_model.py which is a classic graph data structure. Not sure if that is the correct data structure we need to use now.

This has some wrapper objects: https://github.com/e-mission/e-mission-server/blob/master/emission/analysis/modelling/tour_model/tour_model_matrix.py

And this is how the matrix was created: https://github.com/e-mission/e-mission-server/blob/master/emission/analysis/modelling/tour_model/create_tour_model_matrix.py

Note that this is not incremental, and we will almost certainly need incremental creation in order to avoid overwhelming the server over a two year data collection period.

shankari commented 3 years ago

On the phone, this data was retrieved and displayed in the UI shown above.

The related UI code is at: https://github.com/e-mission/e-mission-phone/tree/master/www/js/common

We just read it from the local cache; it was saved to the cache in the OUTPUT_GEN step of the pipeline: https://github.com/e-mission/e-mission-server/blob/master/emission/net/usercache/builtin_usercache_handler.py#L204

shankari commented 3 years ago

Ideas:

Critical from an energy standpoint is mode, replaced_mode or distance. Prompt every day, up to once a week.

As few opportunities for open response as possible; reduce the other category as much as possible. walking the dog: walk, recreation/exercise, no travel

suggestion: both drove_alone and shared_ride in cluster. Look at the proportion of labels and assign to new trips accordingly. So over the course of a month or so, the energy impact will be accurate. Need to balance user input and accuracy.

But would we then show these to the user for confirmation as well? Show intermittently: "if people used to carpool but got tired of their travel companions and stopped, need to know" show this maybe once a month. Tricky because assigning from a distribution means that individual trips won't be accurate, but also asking how many times do you think you carpooled in the past month is subject to recall bias.

Ask for a week's worth of data every few months. This might be good not just for these mixed clusters but for all trips in general. Go back to primary data collection mode every few weeks.

When we are in "secondary data collection mode" will we still ask for confirmation of automatically assigned labels? Yes, to some extent until we gain confidence in the algorithm assessment.
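A minimal sketch of the "assign according to the proportion of labels" suggestion above. The function names are hypothetical and the example cluster is made up; this is only meant to show the idea, not how the server does it.

```javascript
// Fraction of labeled trips in one cluster that carry each label.
function labelProportions(labels) {
  const counts = {};
  for (const label of labels) counts[label] = (counts[label] || 0) + 1;
  const proportions = {};
  for (const [label, n] of Object.entries(counts)) {
    proportions[label] = n / labels.length;
  }
  return proportions;
}

// Draw one label with probability equal to its proportion.
// `rand` is injectable so the draw can be made deterministic.
function sampleLabel(proportions, rand = Math.random) {
  const r = rand();
  let cumulative = 0;
  let last;
  for (const [label, p] of Object.entries(proportions)) {
    cumulative += p;
    last = label;
    if (r < cumulative) return label;
  }
  return last; // guards against floating point rounding
}

// Made-up cluster: one drove_alone trip and three shared_ride trips.
const clusterLabels = ["drove_alone", "shared_ride", "shared_ride", "shared_ride"];
const proportions = labelProportions(clusterLabels);
console.log(proportions);
console.log(sampleLabel(proportions));
```

Any individual draw can be wrong, which is the trade-off noted above: the totals converge over a month or so, but single trips are not reliable.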

shankari commented 3 years ago

for analysis alone, we want to assign labels based on the clusters that @corinne-hcr has built so far and evaluate accuracy. This does not appear to be a very heavy lift. As part of system integration, however, we need to use those clusters as a model. In other words, we need to:

Store the clusters

We need to come up with a data structure to store the clusters. This might be as simple as a map from each cluster to a list of trip ids, e.g.

{"cluster1": ["tripid1", "tripid2", "tripid3", ...],
 "cluster2": ["tripid4", "tripid5", "tripid10", ...]
}

Update the clusters

When should we update this model? We will create it after an initial intensive "primary data collection period" but then we will go to "secondary data collection" for a few months in which we ask for confirmation for trips but not full trip labeling. Do we assume that people will confirm and correct the majority of their trips during the secondary data collection period?

This might be a question to ask @andyduvall to weigh in on.

If we rebuild the model every week, we may want to consider incremental updates in which we only add new trips to existing clusters, but that might be hard to do. We may want to wait on that and treat it as a future performance enhancement.

Apply the model

As we get new unlabeled trips, we need to access the stored cluster model, and assign labels according to the existing clusters. This will likely involve a more complicated data structure because just storing the trip ids may not be enough to figure out which cluster a new trip should fall into. We may need to store some cluster level attributes like distance/duration cutoffs so we can efficiently determine how to match the trip.

This does need to be incremental, as in we cannot recluster every time we label a new trip. We must use stored clusters. This is because we get ~ 5-10 trips a day per person and the cluster algorithm can be space and time intensive. I turned off the tour model earlier because it was too slow (need to get some numbers). Note also that @corinne-hcr cannot run two versions of the pipeline side by side while restructuring on her laptop. And rebuilding the model every trip will result in rebuilding multiple times a day for each user.

Note also that there is a latency issue with taking too long - if the user looks at the diary and we haven't run the inference yet, it will look like there were no labels and the algorithm is "not working".
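A minimal sketch of applying a stored model without reclustering, assuming each cluster keeps a representative start and end point plus a distance cutoff alongside its trip ids. The field names (start, end, radiusMeters, tripIds), the coordinates and the matching rule are all assumptions for illustration; the real model may need different attributes.

```javascript
// Great-circle distance in meters between two {lat, lon} points (haversine).
function haversineMeters(a, b) {
  const R = 6371000; // mean Earth radius in meters
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Hypothetical stored model: trip ids plus cluster-level attributes.
const storedClusters = [
  {
    id: "cluster1",
    start: { lat: 37.39, lon: -122.08 },
    end: { lat: 37.42, lon: -122.1 },
    radiusMeters: 300,
    tripIds: ["tripid1", "tripid2", "tripid3"],
  },
  {
    id: "cluster2",
    start: { lat: 37.42, lon: -122.1 },
    end: { lat: 37.39, lon: -122.08 },
    radiusMeters: 300,
    tripIds: ["tripid4", "tripid5", "tripid10"],
  },
];

// Return the closest cluster whose start and end are both within the cutoff,
// or null if the trip does not fit any stored cluster.
function matchCluster(trip, clusters) {
  let best = null;
  let bestDistance = Infinity;
  for (const cluster of clusters) {
    const dStart = haversineMeters(trip.start, cluster.start);
    const dEnd = haversineMeters(trip.end, cluster.end);
    if (dStart > cluster.radiusMeters || dEnd > cluster.radiusMeters) continue;
    if (dStart + dEnd < bestDistance) {
      best = cluster;
      bestDistance = dStart + dEnd;
    }
  }
  return best;
}

const newTrip = {
  start: { lat: 37.391, lon: -122.08 },
  end: { lat: 37.42, lon: -122.1 },
};
const match = matchCluster(newTrip, storedClusters);
console.log(match ? match.id : "no matching cluster");
```

The cost per new trip is one pass over the stored clusters, so it can run as soon as a trip arrives, which also helps with the latency concern.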

shankari commented 3 years ago

System design of labeling trips:

shankari commented 3 years ago

Note that we may want to use different clusterings from different folds, generate labels for each clustering and determine the confidence of non-mixed clusters (not mixed drove_alone and shared_ride) depending on whether all the models return the same labels (e.g. even for the "work" part of the example we have been using all along).
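A minimal sketch of the agreement idea, assuming each fold's clustering yields one predicted label per trip (or null when it has no prediction). The function name and the "fraction of models that agree" definition of confidence are assumptions, not a settled design.

```javascript
// predictions: one label per model, null when a model has no prediction.
// Returns the most common label and the fraction of all models that chose it.
function agreementConfidence(predictions) {
  const counts = new Map();
  for (const label of predictions) {
    if (label == null) continue;
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  let bestLabel = null;
  let bestCount = 0;
  for (const [label, n] of counts) {
    if (n > bestCount) {
      bestLabel = label;
      bestCount = n;
    }
  }
  const confidence = predictions.length ? bestCount / predictions.length : 0;
  return { label: bestLabel, confidence };
}

// All three models agree on the purpose.
console.log(agreementConfidence(["work", "work", "work"]));
// The models disagree, and one has no prediction at all.
console.log(agreementConfidence(["work", "work", "shopping", null]));
```

Counting missing predictions in the denominator is a deliberate choice here: a model that cannot place the trip lowers the confidence instead of being ignored.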

shankari commented 3 years ago

Current "label screen" javascript is here https://github.com/e-mission/e-mission-phone/blob/master/www/js/diary/infinite_scroll_list.js

In particular, note that Timeline.readAllConfirmedTrips(currEnd, ONE_WEEK).then((ctList) should return all confirmed trips for the last week, and if you print them out, or use a debugger to view ctList, you should be able to see your new fields.

Here's where we embed that data structure in the UI: https://github.com/e-mission/e-mission-phone/blob/master/www/templates/diary/infinite_scroll_list.html

notably <div ng-repeat="input in userInputDetails" class={{input.width}} style="text-align: center;" ng-attr-id="{{ 'userinput' + input.name }}"> iterates through all the user inputs and displays them in {{trip.userInput[input.name].text}}

shankari commented 3 years ago

@GabrielKS, can you put in your high level design into this issue? @PatGendre, who is deploying the platform in La Rochelle is interested. @asiripanich may be too. As deployers, as opposed to end-users, they are more likely to give feedback on the configuration options 😄

GabrielKS commented 3 years ago

@PatGendre @asiripanich @shankari I've attached the second draft of my trip label inference system UI proposal in PDF and Word form; I've pasted the executive summary below.

UI Draft 2.pdf UI Draft 2.docx

Behind the scenes, a server-side trip inference algorithm will produce a data structure that comprises, for each trip, a list of label sets with probabilities. This allows a client-side algorithm to refine the inference as the user manually confirms or corrects labels; it will also aid in analysis when the user has not confirmed or corrected labels. Below is an example of what this data structure might look like. (Image: Inference Data Structure)
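A minimal sketch of one possible shape for "a list of label sets with probabilities" and of how a client could refine it once the user fills in one label. The field names labels and purpose_confirm, the probabilities and both helper functions are assumptions for illustration, not the actual structure from the proposal.

```javascript
// Hypothetical inference for one trip: label sets with probabilities.
const inferredLabels = [
  { labels: { mode_confirm: "drove_alone", purpose_confirm: "work" }, p: 0.6 },
  { labels: { mode_confirm: "shared_ride", purpose_confirm: "work" }, p: 0.3 },
  { labels: { mode_confirm: "walk", purpose_confirm: "exercise" }, p: 0.1 },
];

// Keep only the label sets consistent with what the user has entered so far.
function refine(inferred, userInput) {
  return inferred.filter((entry) =>
    Object.entries(userInput).every(
      ([labelType, value]) => entry.labels[labelType] === value
    )
  );
}

// Sum the probabilities per value of one label type; return the most probable.
function mostProbable(inferred, labelType) {
  const totals = {};
  for (const { labels, p } of inferred) {
    const value = labels[labelType];
    if (value === undefined) continue;
    totals[value] = (totals[value] || 0) + p;
  }
  let max = { labelValue: undefined, p: 0 };
  for (const [labelValue, p] of Object.entries(totals)) {
    if (p > max.p) max = { labelValue, p };
  }
  return max;
}

// Before any user input, "work" is the most probable purpose.
console.log(mostProbable(inferredLabels, "purpose_confirm"));

// After the user sets the mode, only consistent label sets remain.
const refined = refine(inferredLabels, { mode_confirm: "shared_ride" });
console.log(mostProbable(refined, "purpose_confirm"));
```

The refined probabilities are not renormalized in this sketch; whether they should be depends on how the threshold is meant to apply after partial user input.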

To integrate the inference system into the phone app user interface, we will make three main changes. First, we display unfilled labels in red, inferred labels in yellow, and manually filled-out or verified labels in green. Second, we add a To Label view on the Label screen that displays only trips that users are expected to manually provide input on, according to the criteria below. Third, we change when we notify users as detailed below. We will also make some smaller UI changes to improve usability, including adding a confirm button, a feature to label many of the same type of trip at once, and a map of trips. Below is a screenshot of progress so far, showing the red/yellow/green color scheme and the confirm button. (Image: UI Progress So Far)

Expectations and notification behavior will be highly configurable to support many use cases. The basic idea of this configuration is that for each label category, comprising red labels and yellow labels with varying degrees of certainty, study administrators will be able to select an expectation setting and a notification setting. The user could be expected to label every trip in a given category, none of them, or some random sample in between. The user could be notified after every trip in a given category, at the end of the day, or less frequently.

Study administrators will also be able to configure a “primary mode,” in which user input expectations are high, and a “secondary mode,” in which less is demanded of the user; the app can automatically cycle between primary and secondary modes according to a configurable schedule.
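A minimal sketch of what such a configuration could look like, with a fixed-length cycle between the two modes. Every key name, setting value and the cycle rule here are hypothetical; the full proposal has the actual configuration example.

```javascript
// Hypothetical configuration; none of these key names come from the proposal.
const config = {
  schedule: { primaryDays: 7, secondaryDays: 21 },
  modes: {
    primary: {
      red: { expect: "all", notify: "after_trip" },
      yellow: { expect: "all", notify: "end_of_day" },
    },
    secondary: {
      red: { expect: "all", notify: "end_of_day" },
      yellow: { expect: { sample: 0.1 }, notify: "weekly" },
    },
  },
};

// Which mode applies `daysSinceStart` days into the study, assuming the
// schedule repeats: primaryDays of primary, then secondaryDays of secondary.
function collectionMode(daysSinceStart, schedule) {
  const cycle = schedule.primaryDays + schedule.secondaryDays;
  return daysSinceStart % cycle < schedule.primaryDays ? "primary" : "secondary";
}

// Expectation and notification settings for one label category on a given day.
function ruleFor(daysSinceStart, category, cfg) {
  return cfg.modes[collectionMode(daysSinceStart, cfg.schedule)][category];
}

console.log(collectionMode(3, config.schedule));
console.log(ruleFor(10, "yellow", config));
```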

Please see the full proposal for a complete configuration example and many more details. Please post or send me any feedback you may have!

UI Draft 2.pdf UI Draft 2.docx

shankari commented 3 years ago

@jf87 not sure if you are planning to use labels

PatGendre commented 3 years ago

@GabrielKS thanks a lot for the explanations :-)

Here are a few questions and thoughts:

GabrielKS commented 3 years ago

@PatGendre Thanks for the thoughts; I'll do my best to respond given my limited knowledge of the e-mission system outside of the areas I'm directly working with:

  1. Yes, the label inference algorithm is a stage in the pipeline. I've prototyped this with a placeholder algorithm here. @corinne-hcr will be working on the actual algorithm.
  2. My work is intended to reduce the user input burden, so wherever we have the user inputting labels, that's where the inferences will be — and currently that's at the trip level, even for mode. Currently the plan is to use @corinne-hcr's clustering algorithms, plus maybe the section-level mode detection, to produce a trip-level mode inference. If in the future we allow users to label trip sections, then it will make sense to do section-level mode inference. I'm in support of the change to section-level user-confirmed mode labels, but I'm not fully aware of what it would entail.
  3. Again, I won't be using the section-level mode detection data directly. My data structure does allow for an unknown mode — you would not literally assign mode_confirm=unknown, you would just have entries for all the modes with equal probabilities, so I suppose it is a special case in a sense — and that's what I'd expect @corinne-hcr's analysis algorithm will do. I'll note that it's rare for us to truly have no idea what the mode is, if we look at all the data — if the user is moving quickly, we can at least assign walking a lower probability.
  4. I agree that there will be some labels that will be hard to infer even if/when we combine the section-level mode detection with the clustering algorithms. In the UI proposal I give the example of "Pattern 1," an office worker who sometimes drives individually and sometimes carpools to work. In this instance, I think the best we can do is to assign probabilities based on how frequently each mode occurs in the user's manually labeled data and either ask the user for confirmation or just allocate trips according to those probabilities for analysis afterwards (if the user drives individually 30% of the time and carpools 70% of the time, we can pretend behind the scenes that 30% of the yellow mode labels are drove_alone and 70% are shared_ride). This is one area where @andyduvall's primary/secondary data collection mode idea could be useful: we ask the user to label all their trips during the primary mode to get a sense of the frequencies and then allow those labels to remain yellow during the secondary mode.
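A minimal sketch of the "unknown mode" representation from point 3: equal probabilities for every mode, optionally adjusted when the data rules something out. The mode list, the entry shape and the down-weighting factor are assumptions for illustration.

```javascript
// Represent "mode unknown" as equal probability over every candidate mode.
function unknownModeInference(modes) {
  return modes.map((mode) => ({
    labels: { mode_confirm: mode },
    p: 1 / modes.length,
  }));
}

// Scale one mode's probability by `factor`, then renormalize so the
// probabilities sum to 1 again.
function downweight(inference, mode, factor) {
  const scaled = inference.map((entry) => ({
    ...entry,
    p: entry.labels.mode_confirm === mode ? entry.p * factor : entry.p,
  }));
  const total = scaled.reduce((sum, entry) => sum + entry.p, 0);
  return scaled.map((entry) => ({ ...entry, p: entry.p / total }));
}

const unknown = unknownModeInference(["walk", "bike", "drove_alone", "shared_ride"]);
console.log(unknown);

// The user was moving quickly, so walking becomes less likely.
const adjusted = downweight(unknown, "walk", 0.2);
console.log(adjusted);
```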

shankari commented 3 years ago

@PatGendre to summarize: the current feature works on trip-level labels. We currently have trip-level user labels, but even after we switch to section-level user input for the mode, we will still want to have trip-level purpose labels. So this is not dependent on merging the A-Mission code.

If/when we do have section-level user inputs, we can expand this functionality to the section level as well.

PatGendre commented 3 years ago

@shankari @GabrielKS thanks for your detailed answers, it is pretty clear now :-) and definitely a very promising feature!

shankari commented 3 years ago

Tracking initial staging deployment at: https://github.com/corinne-hcr/e-mission-server/pull/2

shankari commented 3 years ago

@PatGendre @jf87 @asiripanich Initial deployment complete, in beta testing.

Picking tuning parameters is being tracked at: https://github.com/e-mission/e-mission-docs/issues/656

Most recent set of UI changes, which can give you a sense of how the feature works, is being tracked at: https://github.com/e-mission/e-mission-phone/pull/772

shankari commented 3 years ago

@GabrielKS I am writing down all the potential issues for next week here so (a) I don't forget them and (b) we can prioritize them.

Please let me know if there's anything else in your notes that I am missing.

UI + expectations:

Trip matching and modeling:

High-level end to end feature:

GabrielKS commented 3 years ago

Additional things I have:

More urgent:

Less urgent:

Fun quotes:

shankari commented 3 years ago

A staging user claimed that the All Unlabeled tab was not scrolled to the bottom when they switched to it. I have been unable to reproduce this issue, but it might bear some more investigation.

I wonder if this is the same as https://github.com/e-mission/e-mission-docs/issues/658 That user says that it happens after they finish a trip; the trips for that day usually show up in the middle.

shankari commented 3 years ago

Another usability question: What happens if you click the green checkmark and not all of them are auto-labeled? Some of them are red. I don't want to click it and mess something up.

GabrielKS commented 3 years ago

Another usability question: What happens if you click the green checkmark and not all of them are auto-labeled? Some of them are red. I don't want to click it and mess something up.

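The confirm button iterates through all label types and, for each, populates the user input with the client-computed inference iff an inference exists, the user has not already provided an input, and the inferred value is not "other":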

for (const inputType of ConfirmHelper.INPUTS) {
  const inferred = trip.finalInference[inputType];
  // TODO: figure out what to do with "other". For now, do not verify.
  if (inferred && !trip.userInput[inputType] && inferred != "other") $scope.store(inputType, inferred, false);
}

So the explanation I've been giving to users, "the confirm button turns yellow labels green" is exactly correct; red labels are simply ignored.

However, I've just realized that the confidence threshold the client has been using is not the one in the config file — the config file is on the server side and I never wrote the code to send it over — it's been using a placeholder value of 0.5. This explains why we've been seeing so many trips with mixed yellow and red labels! I will fix that ASAP. (This affects both what is displayed and what is confirmable as the logic only happens once, so at least it's consistent.)

// Display a label as red if its most probable inferred value has a probability of less than or equal to confidenceThreshold
// TODO: make this configurable
const confidenceThreshold = 0.5;
// [...]
// Apply threshold
if (max.p <= confidenceThreshold) max.labelValue = undefined;
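A minimal sketch of how the threshold ties into the red/yellow/green scheme, with the threshold passed in instead of hardcoded. The function names and the serverConfig object are hypothetical; only the "less than or equal to" comparison and the 0.5 placeholder come from the snippet above.

```javascript
// Classify one label for display.
// userValue: what the user entered or confirmed (undefined if nothing yet)
// max: most probable inference, shaped like { labelValue, p }
function labelColor(userValue, max, confidenceThreshold) {
  if (userValue !== undefined) return "green";
  const confident =
    max && max.labelValue !== undefined && max.p > confidenceThreshold;
  return confident ? "yellow" : "red";
}

// Use the threshold from a (hypothetical) server-provided config object,
// falling back to the 0.5 placeholder if it has not been sent over.
function thresholdFrom(serverConfig) {
  const value = serverConfig && serverConfig.confidenceThreshold;
  return typeof value === "number" ? value : 0.5;
}

const threshold = thresholdFrom({ confidenceThreshold: 0.4 });
console.log(labelColor(undefined, { labelValue: "walk", p: 0.5 }, threshold));
console.log(labelColor(undefined, { labelValue: "walk", p: 0.5 }, thresholdFrom(undefined)));
console.log(labelColor("walk", undefined, threshold));
```

With the 0.5 placeholder, an inference at exactly 0.5 shows as red, which matches the mixed yellow and red trips described above; a lower threshold turns the same label yellow.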

shankari commented 3 years ago

The confirm button iterates through all label types and, for each, populates the user input with the client-computed inference iff

I know, and I told the user verbally. But wanted to record their initial response for you to think about UX improvements.

GabrielKS commented 3 years ago

Ah, okay. Didn't get that that was the user talking.

shankari commented 3 years ago

@GabrielKS Another issue that came up with at least two users today: they expected to see the auto-labeling in the diary and when it didn't show up, they thought that the auto-labeling didn't work. One of the users went to the diary after they saw that there were no trips in "To Label", even at the end of the week 😦

This sounds deceptively easy, but is actually going to require a fair amount of rewriting, because the diary retrieves data using a completely different API call than the label screen. I had thought about merging the diary and label screens, but was hoping to not tackle that just yet.

Maybe we can do a minimal rewrite in which we retrieve the confirmed trips in addition to the user labels as currently retrieved. Then we can read the inferred labels from there.

shankari commented 3 years ago

@GabrielKS Jeanne also brought up displaying only recent trips to users. Unfortunately, instead of only showing trips from the last n days, she wants to start with a clean slate on 16th Aug and then display all trips. Given that requirement, I think that maybe the easiest option is to just hardcode that into the client instead of putting it into the server since it doesn't seem super general. We would not cherry-pick that change into master, obviously.

shankari commented 3 years ago

Filed https://github.com/e-mission/e-mission-docs/issues/662 to track the UI issues that came up today.

shankari commented 3 years ago

Filed https://github.com/e-mission/e-mission-docs/issues/663 to track the "confidence too high" issue from https://github.com/e-mission/e-mission-eval-private-data/pull/28#issuecomment-894704661

shankari commented 3 years ago

Closing this for now. We can track any pending problems in separate issues.