Closed AndrewLim1990 closed 7 years ago
I noticed this too! I think it's because there are a bunch of NAs. We should either add NA to the menu, or filter those students out from the beginning.
Oh I see! I think that may be part of the issue. However I noticed the following:
(Number of students less than 30 mins) + (Number of students between 30 mins and 5 hrs) + (Number of students greater than 5 hrs) > All students
I might misunderstand but I think the NA problem would manifest in such a way where:
(Number of students less than 30 mins) + (Number of students between 30 mins and 5 hrs) + (Number of students greater than 5 hrs) < All students
Regardless, I'll take a closer look at what is going on for videos this weekend! Unfortunately, I think all dashboards are calculating their activity levels with separate functions. We may want to combine them eventually.
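To make the NA reasoning above concrete, here is a minimal sketch in base R. The helper classify_activity_level, its boundary handling, and the data are all assumptions for illustration and are not taken from the project code. It shows that missing activity values would make the three levels sum to less than "All", never more, and it hints at what a single shared classifier for all dashboards could look like.

```r
# Hypothetical helper: bucket total minutes into the three activity levels
# named in this thread. How exactly 30 or 300 minutes is bucketed is an
# assumption, not taken from the project code.
classify_activity_level <- function(minutes) {
  as.character(cut(
    minutes,
    breaks = c(-Inf, 30, 300, Inf),
    labels = c("under_30_min", "30_min_to_5_hr", "over_5_hr"),
    right = FALSE  # intervals are [lower, upper)
  ))
}

# Made-up data: six learners, two with missing activity time
minutes <- c(5, 45, 120, 400, NA, NA)
activity <- classify_activity_level(minutes)

counts <- table(activity)        # table() drops NA values by default
total_all <- length(activity)    # what "All" would report
total_levels <- sum(counts)      # sum over the three named levels

print(counts)
cat("All:", total_all, " Sum of levels:", total_levels, "\n")
```

With this made-up data the levels sum to 4 while "All" is 6, which is the opposite direction from the discrepancy reported above.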
Oh! Ha. Yes, my NA explanation does not explain what you're observing. Just checked the code and found the problem in overview_engagement_server.R. In filter_demographics, the following code should have == rather than != :
if (activity_level == "under_30_min") {
  filtered_df <- filtered_df %>% filter(activity_level == "under_30_min")
}
if (activity_level == "30_min_to_5_hr") {
  filtered_df <- filtered_df %>% filter(activity_level != "30_min_to_5_hr") ### <- here
}
if (activity_level == "over_5_hr") {
  filtered_df <- filtered_df %>% filter(activity_level != "over_5_hr") ### <- here
}
I'm not sure which views are using which filter functions, but I wrote a separate one for the forum view since it has categories instead of modules. Do you mind investigating?
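As a rough check that a != in those two branches would produce counts that add up to more than the total, here is a small sketch using dplyr. The data frame and the helpers count_buggy and count_fixed are hypothetical stand-ins, not the project's functions.

```r
library(dplyr)

# Made-up data: 2 + 3 + 4 learners in the three activity levels
learners <- data.frame(
  id = 1:9,
  activity_level = c(rep("under_30_min", 2),
                     rep("30_min_to_5_hr", 3),
                     rep("over_5_hr", 4)),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the branches quoted above, with != in two places
count_buggy <- function(df, level) {
  if (level == "under_30_min") {
    df <- df %>% filter(activity_level == "under_30_min")
  }
  if (level == "30_min_to_5_hr") {
    df <- df %>% filter(activity_level != "30_min_to_5_hr")  # keeps everyone else
  }
  if (level == "over_5_hr") {
    df <- df %>% filter(activity_level != "over_5_hr")       # keeps everyone else
  }
  nrow(df)
}

# Same idea with == in every branch
count_fixed <- function(df, level) {
  nrow(filter(df, activity_level == level))
}

level_names <- c("under_30_min", "30_min_to_5_hr", "over_5_hr")
buggy <- sapply(level_names, function(l) count_buggy(learners, l))
fixed <- sapply(level_names, function(l) count_fixed(learners, l))

cat("All:", nrow(learners), "\n")
cat("Sum with !=:", sum(buggy), "\n")
cat("Sum with ==:", sum(fixed), "\n")
```

In this toy case the != version counts learners from the other two levels in each of the affected branches, so the per-level counts overshoot the total, while the == version adds up exactly.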
I will do it now!
Hmmm, this doesn't seem to be a problem for me:
filter_demographics <- function(input_df, gender = "all", activity_level = "all", mode = "all") {
  filtered_df <- input_df
  if (gender == "female") {
    filtered_df <- filtered_df %>% filter(gender == "f")
  }
  if (gender == "male") {
    filtered_df <- filtered_df %>% filter(gender == "m")
  }
  if (gender == "other") {
    filtered_df <- filtered_df %>% filter(gender == "o")
  }
  if (activity_level == "under_30_min") {
    filtered_df <- filtered_df %>% filter(activity_level == "under_30_min")
  }
  if (activity_level == "30_min_to_5_hr") {
    filtered_df <- filtered_df %>% filter(activity_level == "30_min_to_5_hr")
  }
  if (activity_level == "over_5_hr") {
    filtered_df <- filtered_df %>% filter(activity_level == "over_5_hr")
  }
  if (mode == "audit") {
    filtered_df <- filtered_df %>% filter(mode == "audit")
  }
  if (mode == "verified") {
    filtered_df <- filtered_df %>% filter(mode == "verified")
  }
  return(filtered_df)
}
Will take a closer look this weekend
Two related issues:
It is surprising that the numbers are different across the different views. Psyc 1 (all, <30 min, 30 min to 5 hr, >5 hr):
problems: 2657, 596, 1416, 2409
video: 3496, 1810, 2055, 3240
These should be the number of learners in the course with that activity profile, and thus should be the overall course numbers, even if those learners happened not to do videos or problems.
The overview page specifically seems to have additional weird behaviour: 3024, 0, 2177, 2177
ido
Found the problem. There are two functions named filter_demographics. Both are meant to do the same thing. I'm going to delete the one in @subizhangg's overview_engagement_server.R, if that is okay.
Let me know what you think @subizhangg
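For anyone wondering why two functions with the same name cause trouble: in R, assigning a function to a name that already exists in the same environment replaces the earlier definition without any warning. The sketch below assumes both definitions end up in the same environment (for example, if both server files are sourced into it); which copy wins then depends only on which is defined last. The function bodies are placeholders, not the project's code.

```r
# First definition, standing in for the copy in one server file
filter_demographics <- function(df) {
  "version A"
}

# Second definition with the same name, standing in for the copy in another
# file. Defined in the same environment, it silently replaces the first.
filter_demographics <- function(df) {
  "version B"
}

which_version <- filter_demographics(NULL)
print(which_version)
```

This would also be consistent with one copy looking correct on inspection while a different copy is the one actually being called.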
Regarding Ido's comment, you are right. The numbers at the top are the number of learners associated with that specific course element. For example, in the video dashboard, we do not count any learners who have never touched a video and only did problems. The same applies, the other way around, to the problems dashboard.
To me, this is preferable to trying to get all the numbers to be the same. For example, if an instructor has a course where most of the students don't engage with the videos and only do problems, it may be confusing for them to see a large number in the filtering panel yet see low numbers in the plots.
However, I can also see why the opposite is sometimes preferable.
If you feel strongly about it, perhaps we can make it a new request in a separate issue.
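To illustrate the two counting choices discussed above, here is a small dplyr sketch on a made-up event log. The column names and data are assumptions for illustration only. It contrasts per-element learner counts (what the dashboards are described as showing) with a single course-wide count (what Ido is asking for).

```r
library(dplyr)

# Made-up event log: which learner touched which kind of course element
events <- data.frame(
  learner = c("a", "a", "b", "c", "c", "d"),
  element = c("video", "problem", "problem", "video", "video", "problem"),
  stringsAsFactors = FALSE
)

# Per-dashboard counts: distinct learners who touched that element type
per_element <- events %>%
  group_by(element) %>%
  summarise(n_learners = n_distinct(learner))

# Course-wide count: distinct learners with any activity at all
course_wide <- n_distinct(events$learner)

print(per_element)
cat("Course-wide learners:", course_wide, "\n")
```

Here the video and problem views report different learner counts, and neither matches the course-wide figure, which is the same pattern as the differing totals quoted for Psyc 1.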
I'll reply in a new issue.
merged PR fixing filtering bug
The numbers when filtering activity levels do not add up to the total when "All" is selected