Improve the charts on the admin interface

manaswisaha commented 8 years ago

Currently, these aren't very clear to the user and aren't useful. Things that can be improved:

improve the x-axis of each
create better histograms with proper bin sizes and not sort in a descending order (doesn't allow us to see a distribution)
each chart should have a clear goal (e.g. the onboarding completion time isn't providing any useful information)

Screenshots from my local system:

More improvements can be added when we get around to working on this.

jonfroehlich commented 7 years ago

This is going to be @jihyukbae's initial task to contribute to Project Sidewalk.

misaugstad commented 7 years ago

I'll be taking this on so that we have some more informative graphs and metrics to look at to assess how the relaunch is going. Here is my current plan (looking for any feedback/suggestions):

Changes to current graphs (see screenshots in the original issue comment):

All graphs: move the axis titles outside the graph DONE
Coverage Rate per Neighbourhood % graph: switch the x and y axes and keep the axis labels, as suggested in #412; give the option to switch between ordering by neighbourhood name or coverage. DONE
Coverage (m) graph: I'm not sure I see the value in this graph, and am thinking of removing it. I think that we care about completion percentage, not the miles covered per neighbourhood. If others see value in this graph, the same changes that are made to the coverage rate graph could be made to it. REMOVED
Completed Missions graph: I think that this graph is not very clear or useful... Again I think that we care more about coverage percentage than this. The x-axis has coverage distance for a neighbourhood, and the y-axis is counts of missions completed that resulted in that distance having been covered in that neighbourhood... I am not seeing this as useful, am I missing something here? REMOVED
Onboarding Completion Time graph: change x-axis labels to be in minutes instead of seconds, change bin size(s) to show more of a spread, with the last bin being something like (10+ minutes) DONE
Daily Label Counts graph (shown below): data is only shown for the past month; should we make this longer, set it to be day of relaunch to present, or some other time window? DONE
Daily Audit Counts graph (shown below): should treat this the same way that we treat the Daily Label Counts Graph, also should change to either Daily Missions or Daily Total Distance. Daily Missions could also be Weekly Missions instead... DONE
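
A minimal sketch of the binning described for the Onboarding Completion Time graph: convert seconds to minutes and collect everything at or above a cap into one overflow bin. The function name, the 1-minute bin width, and the sample values are assumptions for illustration and are not taken from the project's code.

```python
from collections import Counter


def bin_completion_times(seconds, bin_minutes=1, cap_minutes=10):
    """Count completion times per minute-wide bin, with one overflow bin."""
    counts = Counter()
    for s in seconds:
        minutes = s / 60.0
        if minutes >= cap_minutes:
            # Everything at or above the cap lands in a single "10+" style bin.
            counts[f"{cap_minutes}+"] += 1
        else:
            lo = int(minutes // bin_minutes) * bin_minutes
            counts[f"{lo}-{lo + bin_minutes}"] += 1
    return dict(counts)


# Made-up completion times in seconds.
sample_seconds = [45, 130, 170, 305, 620, 7320]
histogram = bin_completion_times(sample_seconds)
```

The overflow bin keeps a few very long sessions from stretching the x-axis, at the cost of hiding how long those sessions actually were.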

New graphs (some suggestions came from #351):

Line graph showing percentage of DC audited over time. This will give us an idea of trends in use of the tool and an at-a-glance sense of how on track we are to reaching 100% coverage. I want to be able to pan and zoom at least along the x axis (time) so that you can see the big trends over time (maybe seeing two big spikes, one for initial launch and another for relaunch) and have the ability to drill into more recent trends (maybe the server went down and the curve flattened out for a day or two). DONE
Histograms of severity rating counts (one graph for each label type, and one summed across all types). We don't currently know if it needs to be on a scale of 1-5. This could tell us whether users actually use all 5 ratings, or if we should maybe just bring it down to a scale from 1-3. DONE
Something like a bar chart that shows drop out starting from visiting website -> clicked start mapping -> finished tutorial -> finished 1 mission -> etc. I am thinking y-axis is from 0-100%, where 100% visited website, and a progressively smaller percentage went through the next steps. This will help to see where the biggest bottlenecks are in getting people started; when do they give up? (We also want to do something different for existing users or people who sign up in the middle etc).
Histogram of time spent using Project Sidewalk with median and average marked. Also, perhaps include a table that also includes this info but differentiates between signed in and anonymous users (Jon's suggestions). This graph (and the following three) give us an idea of how engaging our tool is; how long can we keep users motivated and engaged? Do they all give up after their 3rd 0.5mile mission, so we should change things up by the 2nd 0.5 mile mission? Do they give up after their 1st 0.5 mile mission, so we should make them shorter?
Histogram of number of labels per mission with median and average marked. Again, would like summary table comparing signed in and anonymous users (Jon's suggestions)
Histogram of number of missions per person with median and average marked. Again, would like summary table comparing signed in and anonymous users (Jon's suggestions) DONE
Histogram of number of logins per registered user with median and average marked (Jon's suggestions) DONE
Histogram of neighbourhood completion percentages. Over time, this will show us the trend towards 100% coverage. It also gives us an idea of how well new users are distributed to neighbourhoods. DONE
Add choropleth of completion % for neighbourhoods, as suggested in #622. This should definitely be done, but will take me a bit more time than the other changes, so I plan to do the easier changes first. DONE
In Daily Label Count graph, we could plot a line for each label type on the same graph, with their associated colour.
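
As a rough illustration of the summary tables mentioned above (mean and median per group, comparing signed-in and anonymous users), here is a small sketch. The group names and mission counts are invented, and `summarize` is a hypothetical helper, not part of the codebase.

```python
from statistics import mean, median


def summarize(values):
    """Return the descriptive stats that would be marked on a histogram."""
    return {
        "n": len(values),
        "mean": round(mean(values), 1),
        "median": median(values),
    }


# Invented missions-per-user counts for two user groups.
missions_per_user = {
    "registered": [1, 2, 2, 5, 40],
    "anonymous": [1, 1, 1, 2, 3],
}

summary = {group: summarize(vals) for group, vals in missions_per_user.items()}
```

With skewed data like the registered group, the mean sits far above the median, which is why marking both on each histogram is useful.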

jonfroehlich commented 7 years ago

I skimmed this list and it seems reasonable on a first take. I also appreciated that you went through all open Issues looking for those that were admin dashboard related. Thanks for doing this.

One comment about "Coverage Rate per Neighbourhood % graph." I think you should actually do as suggested in https://github.com/ProjectSidewalk/SidewalkWebpage/issues/412.

Also, @misaugstad, can you go back through your list and add in an expected value proposition for each graph that you plan to create.

misaugstad commented 7 years ago

For the DC coverage percentage chart, do we prefer an area or line graph?

coverage-area coverage-line

I will also reduce the number of ticks on the y-axis by changing the interval between them from 10 percentage points to 20.

Any other feedback on this?

jonfroehlich commented 7 years ago

I like the area graph. Any reason why this is square? I think it could be more landscape (longer horizontal than vertical).

But, of course, you may have an overall layout plan for the admin page that I'm not aware of, which may justify the square design. :)

-- Jon Froehlich Assistant Professor Computer Science University of Maryland, College Park http://www.cs.umd.edu/~jonf/ @jonfroehlich https://twitter.com/jonfroehlich - Twitter

misaugstad commented 7 years ago

I agree, more horizontal!

And thoughts on these histograms of severity rating by label type?

severity-ratings

manaswisaha commented 7 years ago

Few comments:

X-axis ticks should be rotated
The y-axis scale should be the same across all the graphs imo.

misaugstad commented 7 years ago

Will do for the x-axis ticks, didn't catch that before, thanks!

And for the y-axis scale, ideally I would keep the scale across all of them, but there are so many more curb ramps than the other labels, that the other histograms would be hard to read if we did counts. But then if we went with a proportion scale (from 0 to 1), we would be losing the counts information.

So that is my current rationale, but can still be swayed!

jonfroehlich commented 7 years ago

I do not think the y-axes should be the same for this dataset. Doing so would completely obscure the less common label types and the trends therein. If keeping the y-axis the same is super important, we could switch it to percentage of labels; however, I prefer raw counts and scaling per type as you have it.

Other than that, I think the graphs are too tall proportionate to their width. Shrink them by 20-30% vertically or so?

misaugstad commented 7 years ago

They have been shrunk! Next up for review... I extended the daily label counts graph to go back to the end of 2015, now again do you prefer line or area..? daily-label-counts-area daily-label-counts-line

Again, I think area looks nicer.

manaswisaha commented 7 years ago

I agree. And I think those selector zooming things (don't remember the actual names :P) would be helpful.

Best Regards, Manaswi Saha Ph.D. Student Department of Computer Science University of Maryland, College Park http://cs.umd.edu/~manaswi/ Twitter - @manaswisaha https://twitter.com/manaswisaha

jonfroehlich commented 7 years ago

Area is better.

misaugstad commented 7 years ago

Okay how do they look now?

But first, some of my notes:

For the daily audit and label count graphs, since basically a third of the data are 0's (due to this being a dump from December), the medians are 0, so you can't see them.
On a related note, the distributions in the histograms are thus overly skewed towards 0. So right now the histograms don't look very good, but I'd like to see how they look on real data before messing with the binning or anything like that.
I hope you like pink!
The std for the onboarding completion time is really high because inevitably some people leave their computer and come back. The 5 longest completion times are 122, 87, 84, 67, and 64 minutes. Do you want to filter out anything? Can we assume that the onboarding does not take an hour? Can we filter out those over 45 minutes?
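
To make the outlier question concrete, the sketch below compares the standard deviation with and without a 45-minute cutoff. Only the five longest times come from the comment above; the shorter times are invented, and the cutoff is the value floated in the question, not a decision.

```python
from statistics import mean, stdev

# The five longest times (minutes) are quoted above; the shorter ones are invented.
times_min = [3, 4, 4, 5, 6, 7, 64, 67, 84, 87, 122]
CUTOFF_MIN = 45  # cutoff proposed in the comment, still open for discussion

kept = [t for t in times_min if t <= CUTOFF_MIN]
dropped = len(times_min) - len(kept)

stats_all = (mean(times_min), stdev(times_min))
stats_kept = (mean(kept), stdev(kept))

print(f"all:  mean={stats_all[0]:.1f} std={stats_all[1]:.1f}")
print(f"kept: mean={stats_kept[0]:.1f} std={stats_kept[1]:.1f} (dropped {dropped})")
```

Reporting how many values were dropped alongside the filtered stats keeps the filtering visible to whoever reads the chart.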

maybedone1 maybedone2 maybedone3 maybedone5 maybedone6

jonfroehlich commented 7 years ago

Thanks for making these updates so quickly. We can work on the color scheme. :)

One thing I forgot to mention, I'd really like it if you could label the mean and median lines on the graph.

jonfroehlich commented 7 years ago

After you finish addressing these comments, I'd really like you to start focusing on the more user-centric analyses in your checklist: that is, trying to better understand user behavior. How long do they spend on PS? How many missions do users complete? What does typical labeling behavior look like? What are the differences between registered and anonymous users? All of these questions, I think, can be best answered via histograms with descriptive stats.

The visualizations you've created thus far give us a better sense of the data but not of users particularly.

manaswisaha commented 7 years ago

@jonfroehlich Shall I start reviewing the PR on the current changes to the admin page and integrate it with the dev server? Probably would be good for testing out these changes as well.

jonfroehlich commented 7 years ago

Yes, if you can. I'm still not sold on this whole checklist thing--seems like an abuse of the github Issue system and it's hard for me to refer to explicit checkboxes in this comment thread. I'd imagine you can't refer to explicit checkboxes in commits and pull requests either...

manaswisaha commented 7 years ago

Yes, that's true. You can't refer to them explicitly. I think Mikey used it for his personal use to check off things he needs to complete rather than us referring to it elsewhere.

misaugstad commented 7 years ago

> Shall I start reviewing the PR on the current changes to the admin page and integrate it with the dev server? Probably would be good for testing out these changes as well.

I definitely think this should be done ASAP. I made changes to queries that affect other parts of the tool, including adding a new table to the database, so I really want these changes to be included in all the testing that happens on the dev server. Adding a few more histograms and making cosmetic changes can be done later, even after relaunch, and don't need as much testing.

> One thing I forgot to mention, I'd really like it if you could label the mean and median lines on the graph.

Do you mean the same thing here, where "labeling" the mean and median means putting the actual value in text overlaid on the graph?

> After you finish addressing these comments, I'd really like you to start focusing on the more user-centric analyses in your checklist: that is, trying to better understand user behavior. How long do they spend on PS? How many missions do users complete? What does typical labeling behavior look like? What are the differences between registered and anonymous users? All of these questions, I think, can be best answered via histograms with descriptive stats.

:+1:

> I'm still not sold on this whole checklist thing--seems like an abuse of the github Issue system and it's hard for me to refer to explicit checkboxes in this comment thread. I'd imagine you can't refer to explicit checkboxes in commits and pull requests either...

I will switch them to a numbered list for easier referral.

A side note: In a lot of cases, I imagine that we would want to see the differences between pre-relaunch and post-relaunch. This is baked into the time-series graphs, but for histograms they are all combined. How do we think it would be best to visualize the differences? Have three histograms for each stat: pre-relaunch, post-relaunch, and combined? And then actually make it 6 histograms, since we would split between registered and anon users..?

Or would we want to have a button you can click, like there now is to choose the sorting order for neighborhood completion %, so that there aren't 6 graphs on the page per stat?

I think we most need this when looking at onboarding completion times. We have lengthened the tutorial for relaunch, so we don't just want that summed up with all past tutorial times.

jonfroehlich commented 7 years ago

> Do you mean the same thing here, where "labeling" the mean and median means putting the actual value in text overlaid on the graph?

Yes. It should probably be located towards the top of the graph to avoid overlap with other objects.

Re: pre-relaunch vs. post-relaunch. I have been thinking of this as well. Not sure how to deal with it exactly, but your suggestions make sense. I would prefer that you work on this post-relaunch, however, and focus on the major analytics that we already brainstormed for relaunch. Oh, also, this line of thinking brought up another point we discussed: are we tracking versions in a server database to make these sorts of queries easier? @manaswis, I remember we talked about this--did we end up adding another table that maps dates to version numbers so that we can set up semantic queries rather than date-based queries...

manaswisaha commented 7 years ago

@jonfroehlich I created an issue for it - #653 - but it's not addressed yet. Relaunch system fixes took precedence. I can add that after the relaunch. We can note the date when it is launched, so it shouldn't be hard to populate it after the relaunch.

jonfroehlich commented 7 years ago

@manaswis. Right, makes sense. We can go back and populate the table. Thanks.

misaugstad commented 7 years ago

I've made a histogram of missions completed per user (not for a single session, counting all of their logins). First question I have: should I throw everything over 100 missions into its own bin like I did with the onboarding completion time histogram? Also the standard deviation seems wrong, but that may be a copy/paste error, will look into it.

registerd-completed-missions

jonfroehlich commented 7 years ago

Great. Glad we are getting started on the user-centric analytics.

Can you try switching bin size to 5?
I'd rather that we not have a catch-all bin unless we really need it. I would prefer to see all of the data. It's fine if the bars shrink in width to compact the graph a bit.
What are the gray vertical lines for every 50 values on the x-axis? I don't think we need them.
Yes, please recompute the standard deviation. Also, have at least one decimal place.
Please update the legend to print the mean and median values right there. Maybe format like: mean: xx.x

misaugstad commented 7 years ago

Those would be grid lines, common in graphs :slightly_smiling_face: Though they have no use on the x-axis in a histogram!

misaugstad commented 7 years ago

After a hard-fought battle with Scala, Slick, and SQL, I have won. And here is a histogram of missions completed by "anonymous users":

anon_mission_counts

Making the x-axis labels integers and adding the mean/median labels to the legend or graph is still coming. But on to the more pertinent question... How do we want to define an anonymous user?

Using the criteria that IPs must have completed auditing at least one street, in the dump from December there are 169 IPs/users. If you just take any IP address that shows up in the AuditTaskInteractions table, that is about 1000 IPs.

When doing onboarding, activity is logged in the AuditTaskInteractions table, so I don't think it makes sense to just use any IP present in there as an anonymous user, since they could have just clicked on the tutorial and immediately left. This info will be useful for making retention graphs, but not for when we are looking at "anonymous users".

Another benchmark that I have considered for determining an "anonymous user" is to take anyone who finishes the tutorial, taken from the same table. However, I don't think that is always logged in the interactions table, since they could skip the tutorial, or the fact that they already finished the tutorial could be saved in the browser, etc. Note that having finished the onboarding is different from completing an audit task, since finishing onboarding does not log a "TaskEnd" in the table; it logs an "Onboarding_End".

We could also just take only those IPs that have finished an entire mission.
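
The candidate definitions above can be compared side by side with a small sketch. The `(ip, action)` row layout and the IP addresses are assumptions for illustration; only the action names `TaskEnd`, `Onboarding_End`, and `ModeSwitch_Walk` appear in this thread, and the real AuditTaskInteractions schema may differ.

```python
# Hypothetical (ip, action) rows standing in for the interactions table.
interactions = [
    ("192.0.2.1", "ModeSwitch_Walk"),  # opened the tool, then left
    ("192.0.2.2", "Onboarding_End"),   # finished the tutorial only
    ("192.0.2.3", "Onboarding_End"),
    ("192.0.2.3", "TaskEnd"),          # finished the tutorial and audited a street
    ("192.0.2.4", "TaskEnd"),          # audited, but no tutorial finish was logged
]


def ips_with_action(rows, action):
    """Return the set of IPs that logged the given action at least once."""
    return {ip for ip, act in rows if act == action}


any_activity = {ip for ip, _ in interactions}
finished_tutorial = ips_with_action(interactions, "Onboarding_End")
audited_a_street = ips_with_action(interactions, "TaskEnd")

print(len(any_activity), len(finished_tutorial), len(audited_a_street))
```

Each definition gives a different population, so whichever one is chosen should be stated next to the chart.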

misaugstad commented 7 years ago

Here is another graph! Jon is messing with the data :smirk:

logins-per-user

jonfroehlich commented 7 years ago

Yikes, we may want y-axis to be logarithmic here otherwise first bin overpowers other data...

misaugstad commented 7 years ago

I am realizing that it can be quite difficult to make a histogram of integers on a log scaled x-axis look good!

I think that we need a table of researchers (that you had mentioned earlier) so that we can be removed from graphs like this. 5 out of the 6 highest values in this graph are from researchers, and there is really no reason to include us in this graph. For most of the user-centric graphs, we really don't need to be included. At the very least, we need to have the option of looking at the user-centric data with us taken out.

jonfroehlich commented 7 years ago

@misaugstad: I was thinking you would do log scale on y-axis, which seems simple enough imo.

Re: table. Yes.
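
A quick sketch of why a log-scaled y-axis helps here: taking log10 of each bin count compresses a dominant first bin so the smaller bins stay visible. The function name and the sample counts are assumptions; a charting library's built-in log scale would normally do this for you.

```python
import math


def log_heights(counts):
    """Map bin counts to log10 bar heights; empty bins have no defined height."""
    return [math.log10(c) if c > 0 else None for c in counts]


# Invented logins-per-user bin counts with a dominant first bin.
bin_counts = [1000, 10, 1, 0]
heights = log_heights(bin_counts)
```

Two caveats apply to any log-scaled histogram: bins with a count of 0 cannot be drawn, and a count of 1 maps to a height of 0 unless the axis starts below 1.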

misaugstad commented 7 years ago

@ myself: in answer to my question about defining an anonymous user, it seems that there was discussion about this before I arrived, which I noticed when reading through #323.

misaugstad commented 7 years ago

With a lot of the analytics that we want to look at, there are 5 groups of users that we may want to see a graph for: all users, all users minus researchers, registered users, registered users minus researchers, and anonymous users. To try and get all that information without taking up a huge amount of space, I figure that we could have a button (or something like that) that would toggle whether we include the researchers in our histograms. I coded this up for one set of histograms, pictured below:

Defaults to including researchers... including_researchers

Then upon clicking "exclude researchers", the viz updates... excluding_researchers

Thoughts?

jonfroehlich commented 7 years ago

I like it. Use a checkbox 'Include researchers' and default to off?
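
One way to picture the toggle is a single filter with two switches, one for registration status and one for researchers, defaulting to researchers excluded as suggested. The record fields, user ids, and the `mission_counts` helper are hypothetical and only sketch the idea.

```python
# Invented user records; field names are assumptions, not the real schema.
users = [
    {"id": "u1", "registered": True, "researcher": True, "missions": 120},
    {"id": "u2", "registered": True, "researcher": False, "missions": 4},
    {"id": "u3", "registered": True, "researcher": False, "missions": 1},
    {"id": "ip1", "registered": False, "researcher": False, "missions": 2},
]


def mission_counts(users, registered=None, include_researchers=False):
    """Return mission counts for the selected group.

    registered=None keeps both registered and anonymous users.
    """
    selected = []
    for u in users:
        if registered is not None and u["registered"] != registered:
            continue
        if u["researcher"] and not include_researchers:
            continue
        selected.append(u["missions"])
    return selected


all_minus_researchers = mission_counts(users)
registered_with_researchers = mission_counts(
    users, registered=True, include_researchers=True
)
anonymous = mission_counts(users, registered=False)
```

The five groups listed above all fall out of these two parameters, so the page needs one checkbox plus one group selector rather than five separate charts.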

misaugstad commented 7 years ago

So here are a few new graphs for the admin interface. Any comments before I submit a PR for them so they can be included in the stress testing today?

new-graphs

Right now, the anonymous user labels are incorrect, in the same way as issue #791. I have an idea for how to fix it, but if that doesn't work, the fix may not happen today. The way to fix it is to use a select distinct, but Slick doesn't have select distinct directly built in (well, maybe they do in the newest version, but that isn't well documented yet anyway). So my idea is for a workaround.

Also, the following graphs are now bar graphs instead of area graphs. The area versions can be seen at this comment. (Note that the histograms next to these graphs are not pictured, and that is where the legends are)

daily-bar-graphs

misaugstad commented 7 years ago

@r-holland is going to get started on making a graph of time spent using Project Sidewalk, where we count 5+ minutes of inactivity as not using the tool. This should be a good intro to our backend, while providing something very useful for the dashboard!
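
The inactivity rule can be sketched as follows: sort a user's event timestamps, take the gaps between consecutive events, and only count gaps shorter than 5 minutes. The function name and sample timestamps are assumptions for illustration.

```python
def active_seconds(timestamps, max_gap=300):
    """Sum the gaps between consecutive events, skipping gaps of max_gap or more.

    timestamps are in seconds; a gap of 5+ minutes counts as not using the tool.
    """
    ts = sorted(timestamps)
    return sum(b - a for a, b in zip(ts, ts[1:]) if b - a < max_gap)


# Invented event times in seconds for one user; there is one long break at 120 -> 1000.
events = [0, 60, 120, 1000, 1100]
total = active_seconds(events)
```

Because the time after a user's last event is unknown, this approach slightly undercounts each active stretch.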

jonfroehlich commented 7 years ago

Sounds good. Thanks Mikey.

r-holland commented 7 years ago

Looks like I have most everything working:

Unfortunately, @misaugstad explained to me that a query similar to the one I am conducting caused the server to crash a while ago. I will work on implementing more intermediate calculations on the back end to hopefully reduce the query size.

jonfroehlich commented 7 years ago

Thanks @r-holland.

Can you add in actual values for the mean & median?
Does the final column actually represent all auditing times > 180 minutes?
The y-axis is counts of people?

r-holland commented 7 years ago

@jonfroehlich Does this updated graph address your concerns?

jonfroehlich commented 7 years ago

Can you show me this with 5 min and 10 min bins in addition to 20? Also, did you switch out the compute function for this?

r-holland commented 7 years ago

audit_time_histogram_updated audit_time_histogram_updated_2 audit_time_histogram_updated_3

I am still in the process of switching the computation function to the back end, been busy with other issues. Plan is to get this done today.

jonfroehlich commented 7 years ago

Thanks. I like 5 or 10 min bins the best.

r-holland commented 7 years ago

Thanks to the discovery of lag(), my new favorite SQL function, I have dropped the total query time to about 2 seconds. All of the calculations are back end now. @misaugstad also suggested limiting the query to only ModeSwitch events...I will let him explain his own reasoning:

"ModeSwitch_Walk is logged during both panning and "walking", and the other mode switches are for the different label types. So if no one pans, changes pano, or switches to a labeling mode for 5 min, they probably aren't doing anything really."
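
For readers unfamiliar with `lag()`, here is a self-contained sketch of the idea using SQLite through Python's `sqlite3` module (window functions need SQLite 3.25 or newer). The `interaction` table, its columns, and the rows are hypothetical stand-ins; the project's actual query and schema are not shown in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per logged interaction, ts in seconds.
conn.execute("CREATE TABLE interaction (user_id TEXT, action TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO interaction VALUES (?, ?, ?)",
    [
        ("a", "ModeSwitch_Walk", 0),
        ("a", "ModeSwitch_Walk", 100),
        ("a", "ModeSwitch_Walk", 250),
        ("a", "TaskEnd", 1800),  # ignored by the ModeSwitch filter
        ("a", "ModeSwitch_Walk", 2000),
        ("a", "ModeSwitch_Walk", 2060),
        ("b", "ModeSwitch_Walk", 50),
        ("b", "ModeSwitch_Walk", 90),
    ],
)

# LAG(ts) yields the previous event's timestamp for the same user, so
# ts - LAG(ts) is the gap since that event. Gaps of 300 s or more are dropped.
QUERY = """
SELECT user_id, SUM(gap) AS active_seconds
FROM (
    SELECT user_id,
           ts - LAG(ts) OVER (PARTITION BY user_id ORDER BY ts) AS gap
    FROM interaction
    WHERE action LIKE 'ModeSwitch%'
)
WHERE gap < 300
GROUP BY user_id
ORDER BY user_id
"""
rows = conn.execute(QUERY).fetchall()
```

Computing the gaps inside the database avoids shipping every interaction row to the application, which is presumably where the speedup came from.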

Here is the updated histogram:

misaugstad commented 7 years ago

@r-holland how is it possible that after removing some of the interactions, the average amount of time spent actually went up? Shouldn't it be that removing interactions should result in, at most, the same amount of time spent as with all interactions?

misaugstad commented 7 years ago

You should run your current implementation, but for all interactions, and make sure that it looks the same as your original graphs. And if it doesn't, we need to find out why.
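
The expectation behind this check can be tested on a toy case: with a gap threshold, dropping events from one user's timeline should never increase that user's counted time. The helper and timestamps below are assumptions for illustration.

```python
def active_seconds(timestamps, max_gap=300):
    """Sum gaps between consecutive events, skipping gaps of max_gap or more."""
    ts = sorted(timestamps)
    return sum(b - a for a, b in zip(ts, ts[1:]) if b - a < max_gap)


# Invented timeline for one user, and the same timeline with two events removed.
all_events = [0, 100, 250, 400, 2000]
subset = [0, 250, 2000]

time_all = active_seconds(all_events)
time_subset = active_seconds(subset)
assert time_subset <= time_all
```

If per-user totals cannot rise, one thing worth checking (a guess, not something confirmed in this thread) is whether the filtered query also changes which users are included in the average.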

misaugstad commented 7 years ago

Also, this is only looking at registered users, which should be mentioned. We should do this for anonymous users as well. That should not be hard at all: you just group by IP address instead of user id.

jonfroehlich commented 7 years ago

Is this Issue still active or should we close it out?

misaugstad commented 7 years ago

Yep, there were just a couple remaining charts that had not been created, so I made separate issues for each of them. Closing this one now. Nice work, team!

ProjectSidewalk / SidewalkWebpage

Improve the charts on the admin interface #342