Closed markoprodanovic closed 3 years ago
@alisonmyers
Putting this up here to show you how things are progressing, but also as a means of documenting.
Will come prepared to speak on all of this during tomorrow's meeting ☺️
Thinking back to some of Rajesh's questions about async sessions, I believe that they can be answered in full using the data in these two generated tables.
1. What percentage of the class watched the video? => We can tell you the unique number of users who've accessed the video. Compare this with your class size and you'll have a sense of what percentage of students watched it.
2. What percentage of the video has the class watched? => For every 5% chunk of the video, the data tells us how many unique users watched it (we currently define "watched" as having viewed >= 90% of the chunk).
3. Which part of the video did students visit again and again? => For every 5% chunk we can see who watched it and how much time they spent there. If we average or sum user activity across chunks, we can get a clearer sense of which parts of the video users are watching most and spending the most time in.
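To make the "watched a chunk" definition concrete, here is a rough sketch of how per-chunk unique viewers could be computed from raw view segments. The input shape (`user_id`, `start`, `seconds`) and the function names are assumptions for illustration, not the actual extraction code; chunks are 0-indexed here.

```python
from collections import defaultdict

def chunk_coverage(segments, video_len, n_chunks=20):
    """Fraction of each chunk covered by (start, seconds) view segments."""
    chunk_len = video_len / n_chunks
    # Merge overlapping segments so rewatched seconds are not counted twice.
    intervals = sorted((s, min(s + d, video_len)) for s, d in segments)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    coverage = [0.0] * n_chunks
    for start, end in merged:
        for i in range(n_chunks):
            lo, hi = i * chunk_len, (i + 1) * chunk_len
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                coverage[i] += overlap / chunk_len
    return coverage

def unique_viewers_per_chunk(views, video_len, n_chunks=20, threshold=0.9):
    """views: iterable of (user_id, start, seconds). Returns a count per chunk."""
    by_user = defaultdict(list)
    for user, start, seconds in views:
        by_user[user].append((start, seconds))
    counts = [0] * n_chunks
    for segs in by_user.values():
        for i, frac in enumerate(chunk_coverage(segs, video_len, n_chunks)):
            if frac >= threshold:
                counts[i] += 1
    return counts

# Hypothetical 100-second video, so each 5% chunk is 5 seconds long.
views = [
    ("u1", 0, 10),  # covers chunks 0 and 1 fully
    ("u2", 0, 4),   # 80% of chunk 0
    ("u2", 4, 1),   # fills in the rest of chunk 0
    ("u3", 5, 2),   # only 40% of chunk 1
]
counts = unique_viewers_per_chunk(views, video_len=100)
print(counts[:3])  # [2, 1, 0]
```

Dividing the chunk 0 count by class size would give the rough "percentage of the class" figure from question 1.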
For Chunking by Dates, consider the following: Try to think about what the data would look like at the individual student level, and how we would want to "roll" that up. We want our data extraction to create entire sets for now, and leave the filtering to a user filter (in Tableau or other).
Consider a scenario like this:
Edit - I just realized this was considering 10% chunks, but I think the example still stands
date, chunk_id, number_of_watches
2020-11-01, 1, 1
2020-11-01, 2, 2
2020-11-01, 3, 1
2020-11-01, 4, 1
2020-11-01, 5, 1
2020-11-02, 1, 1
2020-11-02, 2, 1
2020-11-02, 3, 1
2020-11-02, 4, 1
2020-11-02, 5, 1
2020-11-02, 6, 1
2020-11-02, 7, 1
2020-11-02, 8, 1
2020-11-02, 9, 1
So for this video, if this was the only student, you could aggregate by chunk and count the total views and the unique users per chunk:
chunk, users, n_watches
1, 1, 2
2, 1, 3
3, 1, 2
4, 1, 2
5, 1, 2
6, 1, 1
7, 1, 1
8, 1, 1
9, 1, 1
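As a sketch of that roll-up (plain Python, with an assumed `user_id` column added since the example rows don't carry one), something like this would do it. Note that chunks 6 to 9 only appear on 2020-11-02 in the daily rows, so they sum to 1 watch each.

```python
from collections import defaultdict

# Daily rows from the example above: (date, user_id, chunk_id, number_of_watches).
# "s1" is a made-up id for the single student in the scenario.
daily_rows = (
    [("2020-11-01", "s1", c, 2 if c == 2 else 1) for c in range(1, 6)]
    + [("2020-11-02", "s1", c, 1) for c in range(1, 10)]
)

def roll_up(rows):
    """Aggregate daily rows to {chunk: (unique_users, total_watches)}."""
    users = defaultdict(set)
    watches = defaultdict(int)
    for _date, user, chunk, n in rows:
        if n > 0:
            users[chunk].add(user)
        watches[chunk] += n
    return {c: (len(users[c]), watches[c]) for c in sorted(watches)}

summary = roll_up(daily_rows)
print("chunk, users, n_watches")
for chunk, (n_users, n_watches) in summary.items():
    print(f"{chunk}, {n_users}, {n_watches}")
```

Because the dates stay in the extracted rows, a date filter (in Tableau or elsewhere) can be applied before this aggregation rather than baked into the extraction.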
Hmm, here's a scenario I'd still be worried about.
Let's say a student bounces around the timeline within the first 3 chunks on Nov. 1:
They watch this much of the chunks:
chunk 0: 80%, chunk 1: 40%, chunk 2: 60%
...they've completed no chunks, therefore this is their data for that day:
date, chunk_index, number_of_watches
2020-11-01, 0, 0
2020-11-01, 1, 0
2020-11-01, 2, 0
The student then accesses the video on Nov. 3 and goes back to fill the gaps in their viewing -- they watch the remaining parts of each chunk:
chunk 0: 20%, chunk 1: 60%, chunk 2: 40%
Like before, on this day they completed no chunks, therefore their data looks like this:
date, chunk_index, number_of_watches
2020-11-03, 0, 0
2020-11-03, 1, 0
2020-11-03, 2, 0
Now we have a situation where a student has finished all 3 chunks but we have no record of completion, because each row is a day's worth of data.
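Here is a small sketch of the problem, using the percentages from this scenario. It assumes the 90% completion cutoff and assumes the portions watched on the two days don't overlap (which is what "the remaining parts" implies); the names are just for illustration.

```python
THRESHOLD = 0.9  # assumed completion cutoff (the ">= 90%" definition)

# Fraction of each chunk newly watched on each day, from the scenario above.
daily_fraction = {
    "2020-11-01": {0: 0.8, 1: 0.4, 2: 0.6},
    "2020-11-03": {0: 0.2, 1: 0.6, 2: 0.4},
}

def completed_per_day(daily):
    """Chunks completed when each day is judged on its own."""
    return {
        day: [c for c, frac in chunks.items() if frac >= THRESHOLD]
        for day, chunks in daily.items()
    }

def completed_cumulative(daily):
    """Chunks completed when coverage is accumulated across days."""
    total = {}
    for chunks in daily.values():
        for c, frac in chunks.items():
            total[c] = min(1.0, total.get(c, 0.0) + frac)
    return sorted(c for c, frac in total.items() if frac >= THRESHOLD)

per_day = completed_per_day(daily_fraction)
overall = completed_cumulative(daily_fraction)
print(per_day)   # {'2020-11-01': [], '2020-11-03': []}
print(overall)   # [0, 1, 2]
```

Judged day by day the student completes nothing, but accumulated over the whole range they complete all three chunks.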
I think by chunking we can drop some of the noise about how much of a chunk was watched; otherwise we are back to caring about minute-by-minute activity, which would require a different kind of dataset. So we can decide what "watching a chunk" means.
E.g. maybe "watching a chunk" means they watched at least 10% of that chunk in one go.
(We can do some exploratory analysis to see what makes sense).
I'll think on this a bit more! Good thing to talk through during our meeting.
It's an interesting problem because there's some subjectivity needed, i.e. "how do we define completion of a chunk?"
And this decision has a huge impact on how the data looks.
For fun, here's what the unique view count looks like at chunk completion >= 10% for the same video as above.
Notice how much chunk 1 changes (difference of 115 viewers). Notice how little chunk 3 changes in comparison (difference of 16 viewers).
With this more liberal criterion, the table shows me that lots of people actually did watch chunk 1. The earlier screenshot has stricter completion criteria, but maybe is more useful in the sense that it shows us that more people "meaningfully" engaged with the material at chunk 3.
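To show how sensitive the counts are to that cutoff, here's a toy comparison. The coverage numbers are made up for illustration (they are not the COMM 290 data), and list positions are chunks 1 to 3.

```python
def unique_viewers(coverage, threshold):
    """coverage: {user_id: [fraction of each chunk watched]}."""
    n_chunks = len(next(iter(coverage.values())))
    return [
        sum(1 for fracs in coverage.values() if fracs[i] >= threshold)
        for i in range(n_chunks)
    ]

# Made-up per-user coverage for three chunks.
coverage = {
    "u1": [0.95, 0.30, 1.00],
    "u2": [0.20, 0.15, 0.95],
    "u3": [0.15, 0.05, 0.92],
    "u4": [0.50, 0.00, 0.05],
}

strict = unique_viewers(coverage, 0.9)   # [1, 0, 3]
liberal = unique_viewers(coverage, 0.1)  # [4, 2, 3]
diff = [lib - s for lib, s in zip(liberal, strict)]
print(diff)  # [3, 2, 0]
```

A chunk whose count barely moves between the two cutoffs is one that viewers mostly watched in full, which is the pattern described for chunk 3.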
Definitely. I think if we start looking at individual patterns of activity it will tell us something more meaningful about how to define chunks. E.g., if we find students jump around a lot and watch in short bursts, we might want to be more forgiving about "watching a chunk". If we find that students watch straight through, then we don't need to worry as much.
I’ve managed to create two data tables to help us answer questions about viewership in Panopto
TABLE 1: UNIQUE VIEWERSHIP ACROSS CHUNKS
Example Output (COMM 290) => Unique Viewers: 272
For the purposes of not forgetting what we did, and transparency about how we calculate this, the steps are:
⚠️ Note that the date range that is shown in the data can be adjusted by narrowing the start and end time in the SOAP call. So you can answer questions like, for example, “how many students watched this video between the first and second midterm?”
I tried to play around with including dates somehow as values in the data but they didn’t really make sense to me. Even if a chunk was completed, a user could’ve watched it over multiple sessions or days. Users can also rewatch chunks, so which date counts?
To me it made the most sense to just say, we can narrow dates by adjusting the call we make so that the raw data that we begin to work with is already filtered to those dates (although I realize this could make it difficult to adjust these via parameters in Tableau).
TABLE 2: CHUNK VIEWERSHIP PER USER
For example, in analyzing the data above (excel)...
As we can see, chunk 3 has notably higher total and average viewership than its neighbours. Also, the first table shows that it has a lot of unique viewership as well.
...and indeed, going into the content of the video, this section is a walkthrough of a problem solution - so it makes sense that viewership would be more concentrated.
Chunks 9-12 also look interesting - in the context of the video it's yet another walkthrough of a problem solution, followed by a steep dropoff in chunk 13. When the walkthrough is finished, all math disappears from the slides in favour of images.
Chunks 17-19 see a pretty major drop-off both in unique viewership and in totals/averages - and in the context of the video, this is when the instructor ends their slides and the main lecture.
In my limited experimenting with COMM290, I found that if a chunk has some combination of:
… it tends to be indicative of a more-engaged-with part of the video. (usually the solutions to an example problem). Pretty cool!
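For reference, a minimal sketch of how the per-chunk totals and averages discussed above could be derived from a per-user table. The row shape (`user_id`, `chunk`, `seconds_watched`) is an assumption about Table 2, with one row per user per chunk, and the numbers are invented.

```python
from collections import defaultdict

# Assumed shape of Table 2: one (user_id, chunk, seconds_watched) row per user per chunk.
rows = [
    ("u1", 2, 30.0),
    ("u2", 2, 10.0),
    ("u1", 3, 90.0),
    ("u2", 3, 60.0),
    ("u3", 3, 30.0),
]

def chunk_totals(rows):
    """Total seconds, average seconds per viewer, and viewer count for each chunk."""
    seconds = defaultdict(list)
    for _user, chunk, secs in rows:
        seconds[chunk].append(secs)
    return {
        chunk: {
            "total": sum(vals),
            "average": sum(vals) / len(vals),
            "viewers": len(vals),
        }
        for chunk, vals in sorted(seconds.items())
    }

totals = chunk_totals(rows)
print(totals[3])  # {'total': 180.0, 'average': 60.0, 'viewers': 3}
```

Here the average is taken over users who have a row for the chunk; averaging over the whole class instead would be a different (and equally defensible) choice.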