Closed bcipolli closed 8 years ago
In reviewing our data collection and the stories submitted to us, I'm convinced that there is in fact a lot of video usage happening while not logged in, and that we're missing a significant amount of data that is syncable, by simply not recording it.
@jamalex any interest in designing and/or implementing this? To me, this is one of the highest priority/cost dev items on our plate--high priority, I think relatively low cost to get a good design and implement it.
Assigning to myself so I don't lose track, while I await @jamalex response.
Absolutely LOVE the data collection we've started on video downloads and language pack downloads. LOVE that we have the registration requirement to use online features.
This one is next, and I think there's a relatively easy way to do it:
__anonymous__
facility user__anonymous__
Consequences of this design:
Will think more about whether another design might be preferable (like an AnonymousUser extension of FacilityUser, which is one-way syncable... but then the ExerciseLogs would need to be one-way syncable...).
Or set user=None on the *Log objects, use that as the indicator of anonymousness, and do a one-way-sync hack-job that way. That might actually work out better, though it would take some work on the reporting side to show admins the amount of anonymous data that's being collected.
Hmm, the problem with this is that we use the user to determine the zone to which the *Log object belongs. So these would globally collide.
I have the anonymous = None version implemented on my machine, but will think through further.
Love the idea. In some ways, we even want every anonymous "session" to create a new instance of the anonymous facility user. Only downside of that is it could be messy on coach reports (but agree it's nice to show anon usage stats there). Perhaps we could do the "every new session is a new anon user" approach, but have some special (simple) logic to filter these out in places like the coach reports, only showing something aggregate. We'll probably need special logic somewhere anyway (either in JS or API) to not have it show progress (e.g. a full streak bar) to a non-logged in user, though, as well -- to avoid confusion.
A versioned "is_anonymous" field could be added to FacilityUser, and we could then disallow logins, showing of progress, etc, for those "users".
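A rough sketch of how such a flag could keep the special-case logic small. The FacilityUser class here is a hypothetical stand-in, not the real model, and the helper names and report row format are assumptions for illustration only:

```python
from dataclasses import dataclass


@dataclass
class FacilityUser:
    """Hypothetical stand-in for the real FacilityUser model."""
    username: str
    is_anonymous: bool = False  # the proposed flag


def can_log_in(user):
    # Anonymous placeholder users should never be usable as login accounts.
    return not user.is_anonymous


def coach_report_rows(users, seconds_watched):
    """Return one row per real user plus one aggregate row for anonymous usage.

    seconds_watched maps username -> total seconds (illustrative data only).
    """
    rows = [(u.username, seconds_watched.get(u.username, 0))
            for u in users if not u.is_anonymous]
    anon_total = sum(seconds_watched.get(u.username, 0)
                     for u in users if u.is_anonymous)
    rows.append(("(anonymous, all sessions)", anon_total))
    return rows
```

With this shape, "every new session is a new anon user" stays invisible to coaches: the per-session users collapse into a single aggregate row.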
Cool! Yeah, I thought about these explicit options as well; my hesitation is that I worry special-case code will need to be wide-spread.
I've done an initial demo branch to explore simply setting user=None for the *Log objects. The code is relatively constrained (though it makes some limited use of collision to get anonymous users to be per-device instead of global).
https://github.com/bcipolli/ka-lite/compare/learningequality:develop...bcipolli:706?expand=1
I'm also still unsure about all of the facility stuff. Could make a facility just for anonymous users, but this would again require special-case code.
A solution that requires as little special-case code as possible is strongly preferred for me, even if the data collected are sub-optimal, as the complexity of this app is already pretty high.
Yeah... (Facility could be the default facility, maybe..).
Perhaps the cleanest would be a new AnonymousUsage model, which would store aggregate stats (total hours watched, total answers given, etc, per video/exercise)? It wouldn't need to be associated with a user or facility, and wouldn't need to duplicate the logic from the ExerciseLog and VideoLog models, since it doesn't need to track points, etc. Then, the only custom logic anywhere would be either in the JS for saving progress (probably best, so it doesn't even show live point updating, etc), or in the log API calls. Just a thought.
(wouldn't need to be multiple models, could just be one, with "kind" and "id" fields to identify videos vs exercises)
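A minimal sketch of what that single model could look like, written as a plain dataclass rather than a Django model so it runs on its own. The field names other than "kind" and "id", the in-memory store, and the record_anonymous_usage helper are all assumptions, not existing KA Lite code:

```python
from dataclasses import dataclass


@dataclass
class AnonymousUsage:
    kind: str                   # "video" or "exercise"
    item_id: str                # id of the video or exercise
    total_seconds: float = 0.0  # running total of time watched
    total_answers: int = 0      # running total of answers given


def record_anonymous_usage(store, kind, item_id, seconds=0.0, answers=0):
    """Add to the running totals for one item; no user or facility involved."""
    row = store.setdefault((kind, item_id), AnonymousUsage(kind, item_id))
    row.total_seconds += seconds
    row.total_answers += answers
    return row
```

The log API calls (or the JS save path) would call something like record_anonymous_usage whenever no user is logged in, instead of creating an ExerciseLog or VideoLog.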
Hmm, I don't see this as essential for 0.11.1. Punting to 0.11.2.
@jamalex I am really warming up to the AnonymousUsage model. It's clean, it could follow the logic from the UserLogSummary functionality (for grouping with a particular granularity), and I think there's currently much more value in the overall usage than in the per-exercise/per-video log data.
This would require us to implement the "one-way sync" (would be done soon anyway).
@aronasorman @rtibbles any thoughts / concerns?
For me, this is an essential piece to get out the door, to help us try and go from "installations" to "usage". I know this may wind up telling us nothing (because we'll truly never see offline installations again), but it seems like our best shot in the meantime.
Cool! Note: I think there's still value in having AnonymousUsage be per-exercise/per-video (basically just a running total across all anon usage, per media entity), as it would be cool to be able to say what items people are watching/answering.
Agree; lots of possible ways to do this that are pretty generic and would generalize well to data collection in non-KA content scenarios.
Glad to hear you're on board for OneWaySync too! The anonymous usage also sounds helpful, and I can see the value in following the UserLogSummary model of aggregating over certain time windows.
On Thu, Feb 27, 2014 at 9:29 AM, Ben Cipollini notifications@github.com wrote:
Agree; lots of possible ways to do this that are pretty generic and would generalize well to data collection in non-KA content scenarios.
Reply to this email directly or view it on GitHub https://github.com/learningequality/ka-lite/issues/706#issuecomment-36267714 .
Richard
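To make the time-window idea concrete, here is a small sketch of aggregating anonymous usage per item and per window. It assumes calendar-month windows and an event tuple of (kind, item_id, timestamp, seconds); the granularity UserLogSummary actually uses is not stated in this thread, and the function names are hypothetical:

```python
from collections import defaultdict
from datetime import datetime


def window_start(ts, months_per_window=1):
    """Round a timestamp down to the first day of its window (calendar months assumed)."""
    first_month = ((ts.month - 1) // months_per_window) * months_per_window + 1
    return datetime(ts.year, first_month, 1)


def summarize(events, months_per_window=1):
    """Sum seconds per (kind, item_id, window start)."""
    totals = defaultdict(float)
    for kind, item_id, ts, seconds in events:
        totals[(kind, item_id, window_start(ts, months_per_window))] += seconds
    return dict(totals)
```

Coarser windows mean fewer rows to sync one-way, at the cost of less detail about when the usage happened.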
Tentatively assigning to myself; this is a ~1-2 hour project, once one-way sync (required) is done. Lower priority item.
Booting to others. Once OneWaySync is done, this should be really easy. I suggest moving this, along with the UserLog functionality, into a kalite.stats app. This will avoid the funky inter-app dependencies that come from shoving stats collection into other apps (like main or facility); instead, most apps can simply import kalite.stats... which makes lots of sense.
Sounds like a good design.
0.13.
I'm not up to date on where we are with regards to our data collection system. @rtibbles how close are we to implementing this in the current develop?
Way off.
Not going to happen within the scope of KA Lite.
In order to understand KA Lite usage worldwide, we need as much usage data as possible. However, when no users exist or are created, we simply do not save any usage data.
We can record all *Log data, as usual, when no user is logged in, or when Django users are logged in.
Possible designs: