misaugstad closed this issue 6 years ago
For the user retention curve, we should have the following milestones (x-axis): 1,8,13,14,15
For the tutorial-only analysis, we should have each stage on the x-axis, e.g. all steps for labeling the first curb ramp form the first stage, all steps for the missing curb ramp (including zooming in) form the second stage, and so on. So we would have 7 stages (corresponding to the 7 labels), and the 8th stage would be to take a step and do the rest to finish the tutorial.
> For the user retention curve, we should have the following milestones (x-axis): 1,8,13,14,15
I'm assuming the retention curve can be across multiple sessions, right? It is just that the user completed 2 missions at some point, not that they did it in that first session, correct?
> For the tutorial-only analysis, we should have each stage on the x-axis, e.g. all steps for labeling the first curb ramp form the first stage
Just to clarify, by 8 stages you mean 8 steps on the x-axis; you don't mean to differentiate between the different steps within each stage. Did I read that correctly?
@manaswis for missions completed, do we want to do anything about the fact that the length of a first mission changed partway through? We could either ignore that there is a difference (in computing results; you can always talk about it in the paper, of course), only look at data after initial missions were switched to 500 feet, or analyze the two separately.
> I'm assuming the retention curve can be across multiple sessions, right? It is just that the user completed 2 missions at some point, not that they did it in that first session, correct?
I was thinking more about a new user: when they start using the system, starting from the tutorial, when do they drop off?
Points 14, 15, 16, 17 could be for returning users (i.e. they have already done the tutorial) - when do they stop working?
> Just to clarify, by 8 stages you mean 8 steps on the x-axis; you don't mean to differentiate between the different steps within each stage. Did I read that correctly?
Yes. Each of the 8 stages would be on the x-axis, where each stage consists of multiple steps as part of the stage (e.g. for marking the first curb ramp, steps would be: place a label, select a severity etc.) -- the individual steps for each stage won't be on the x-axis.
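A minimal sketch of how the steps could be collapsed into stages for the tutorial analysis. The step names and the mapping below are hypothetical placeholders, not the actual tutorial step identifiers; the only idea taken from the discussion is that a stage counts as reached once all of its steps are done.

```python
from collections import defaultdict

# Hypothetical step-to-stage mapping; the real tutorial step names will differ.
EXAMPLE_STEP_TO_STAGE = {
    "place_first_curb_ramp_label": 1,
    "rate_first_curb_ramp_severity": 1,
    "zoom_in": 2,
    "place_missing_curb_ramp_label": 2,
}

def furthest_completed_stage(completed_steps, step_to_stage):
    """Return the highest stage whose steps were all completed (0 if none).

    Stages are checked in ascending order and the scan stops at the
    first stage that still has an unfinished step.
    """
    steps_by_stage = defaultdict(set)
    for step, stage in step_to_stage.items():
        steps_by_stage[stage].add(step)

    done = set(completed_steps)
    furthest = 0
    for stage in sorted(steps_by_stage):
        if not steps_by_stage[stage] <= done:
            break
        furthest = stage
    return furthest
```

Only the per-user stage number would then go on the x-axis; the individual steps stay hidden inside each stage.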
@manaswis should I do this strictly by IP address maybe, considering we don't really have a good way to connect pre-signup interactions to the user ID post-signup? I mean, an IP address is a reasonable proxy for a user, whether they are registered or not!
> @manaswis for missions completed, do we want to do anything about the fact that the length of a first mission changed partway through? We could either ignore that there is a difference (in computing results; you can always talk about it in the paper, of course), only look at data after initial missions were switched to 500 feet, or analyze the two separately.
Hmm, I think we should get the results for each separately, so then we could check whether there were any differences between the timeline with the first mission as 500 ft vs. 1000 ft. We should be able to talk about it separately if we do see differences; otherwise we talk about it as a general first mission (without talking about the distance covered).
> Hmm, I think we should get the results for each separately
Sounds good.
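A rough sketch of what analyzing the two first-mission lengths separately could look like. The input shape (one tuple per user holding the first-mission length in feet and whether that mission was finished) is an assumption for illustration, not the actual schema.

```python
def completion_rate_by_first_mission_length(users):
    """Return {first_mission_feet: fraction of users who finished it}.

    `users` is an iterable of (first_mission_feet, completed) tuples,
    a hypothetical shape chosen for this sketch.
    """
    totals = {}
    completed = {}
    for feet, done in users:
        totals[feet] = totals.get(feet, 0) + 1
        completed[feet] = completed.get(feet, 0) + (1 if done else 0)
    return {feet: completed[feet] / totals[feet] for feet in totals}
```

If the 500 ft and 1000 ft rates come out close, the two groups could be pooled and reported as a general first mission.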
> the individual steps for each stage won't be on the x-axis.
Sounds good.
> I was thinking more about a new user: when they start using the system, starting from the tutorial, when do they drop off?
I still don't know if this means only their first session or across multiple sessions :)
> I still don't know if this means only their first session or across multiple sessions :)
First session as a new user when they start using the tool by going through the tutorial first.
Okay, and do you think it sounds good to just base this off of IP address?
For anonymous users, that's the only way we have. This should be done for registered users as well (there are users who create accounts first then start using the tool).
On Thu, May 3, 2018 at 2:44 PM, Mikey Saugstad notifications@github.com wrote:
> Okay, and do you think it sounds good to just base this off of IP address?
I'm saying that if we ignore user registration and just go by IP address, shouldn't that cover registered users pretty accurately as well? An IP address is a reasonable proxy for a user, whether they are signed in or not! And we don't have an easy way to link registered users to the auditing they did before registering anyway.
Oh I see. I think it should be fine.
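For illustration, treating the IP address as the user key could be sketched as below. The event tuple shape (IP address, timestamp, action name) is assumed, and real logs would likely need extra handling for shared or changing IPs.

```python
from collections import defaultdict

def actions_by_ip(events):
    """Group logged actions by IP address, ordered by timestamp.

    `events` is an iterable of (ip_address, timestamp, action) tuples,
    a hypothetical shape for this sketch. Registration status is
    ignored on purpose: the IP address stands in for the user.
    """
    grouped = defaultdict(list)
    for ip_address, timestamp, action in events:
        grouped[ip_address].append((timestamp, action))
    return {
        ip_address: [action for _, action in sorted(items)]
        for ip_address, items in grouped.items()
    }
```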
This would be a line graph or bar chart that shows the percentage of users remaining on the y-axis, and the x-axis would be a series of milestones including loading the audit page, finishing the tutorial, finishing an audit task, finishing a mission, etc.
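A minimal sketch of the y-axis computation for such a curve, assuming each user is reduced to the furthest milestone they reached and that milestones are strictly sequential (which, per the list discussed in this issue, stops being true from number 12 on). The function and argument names are made up for illustration.

```python
from collections import Counter

def retention_curve(furthest_milestone, num_milestones):
    """Return the percentage of users remaining at milestones 1..num_milestones.

    `furthest_milestone` maps a user key (e.g. an IP address) to the
    1-based index of the last milestone that user reached.
    """
    total = len(furthest_milestone)
    if total == 0:
        return [0.0] * num_milestones

    stopped_at = Counter(furthest_milestone.values())
    remaining = total
    curve = []
    for milestone in range(1, num_milestones + 1):
        curve.append(100.0 * remaining / total)
        # Users whose furthest milestone is this one drop off before the next.
        remaining -= stopped_at.get(milestone, 0)
    return curve
```

The resulting list can be passed directly to a line or bar plot, with the milestone names as x-axis tick labels.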
@jonfroehlich and @manaswis let's try to nail down which stats we really want... Here is an initial sequence to work off of. Feel free to delete or add to it. I'm going to start with way more than we will want, so you can pick which ones to remove. And starting at number 12, the options are no longer sequential. I would also like input on how we want to deal with that.