Open smadha opened 7 years ago
Probability that a user answers a question when asked again, given they didn't answer it the first time: 0.029131121643
Probability that a user does not answer a question when asked again, given they didn't answer it the first time: 0.970868878357
There isn't a case where the user answers the same question again
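For reference, here is a minimal sketch of how a figure like the one above could be computed. It assumes the labeled data is available as rows of `(question_id, user_id, label)` in the order they appear; the function name `repeat_answer_probability` and the row layout are assumptions for illustration, not necessarily how the numbers in this post were produced.

```python
from collections import defaultdict


def repeat_answer_probability(rows):
    """Estimate P(answered on a later ask | not answered on the first ask).

    rows: iterable of (question_id, user_id, label) tuples, in file order.
    Returns None when no (question, user) pair is asked more than once
    after an initial non-answer.
    """
    # Collect the sequence of labels seen for each (question, user) pair.
    history = defaultdict(list)
    for question_id, user_id, label in rows:
        history[(question_id, user_id)].append(int(label))

    answered_later = 0
    not_answered_later = 0
    for labels in history.values():
        # Only pairs that were asked again after a first non-answer count.
        if len(labels) < 2 or labels[0] != 0:
            continue
        if any(labels[1:]):
            answered_later += 1
        else:
            not_answered_later += 1

    total = answered_later + not_answered_later
    return answered_later / total if total else None
```

The complementary probability is simply one minus the returned value, which is why the two figures above sum to 1.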
Sample Data stats:
Users appearing in the list, whether or not they answered: 27127 / 28763
Questions appearing in the list, whether or not they were answered: 7708 / 8095
Most common question, whether or not it was answered: '8cc470e1c655b5bbf6e8684509b44205' (asked 1016 times in the given sample)
Most common user: 'd66397df46f4e33cb608c322f751d884' (110 entries in the given sample)
Least common user: '09d89cf0a43005b22b015b24fe8b29ad' (1 entry)
Least common question: '09698971cfdcca1b0eb9fd444edc596f' (1 entry)
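Most/least common entries like these can be pulled out with `collections.Counter`. The sketch below is illustrative only: `frequency_extremes` is a hypothetical helper, and it assumes the question or user IDs have already been read into a plain list.

```python
from collections import Counter


def frequency_extremes(ids):
    """Return ((most_common_id, count), (least_common_id, count)).

    ids: non-empty iterable of hashable IDs (question IDs or user IDs).
    Ties are broken by first appearance, following Counter.most_common().
    """
    counts = Counter(ids)
    if not counts:
        raise ValueError("ids must not be empty")
    ordered = counts.most_common()  # sorted by count, highest first
    return ordered[0], ordered[-1]
```

Calling it once on the question column and once on the user column would give both pairs of statistics.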
The training sample seems to be skewed. Adding features that are derived from these labels (1/0) could increase the skew in our features.
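To put a number on the skew, here is a small arithmetic sketch using the answered / not-answered counts reported in this post (27,324 and 218,428). The variable names are arbitrary.

```python
answered = 27324        # samples labeled 1, as reported in this post
not_answered = 218428   # samples labeled 0, as reported in this post
total = answered + not_answered

# Share of positive labels and the number of negatives per positive.
positive_rate = answered / total
negatives_per_positive = not_answered / answered

print(f"total labeled samples: {total}")
print(f"positive rate: {positive_rate:.4f}")
print(f"negatives per positive: {negatives_per_positive:.2f}")
```

Roughly 11% of the labels are positive, so a model that always predicts "not answered" would already be right about 89% of the time; accuracy alone is therefore not a very informative metric here.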
245752 LABELED SAMPLES
8095 questions
28763 users

27,324 samples where the question was answered
218,428 samples where the question was not answered

6182 users answered at least one question
690 users answered more than 10 questions
23 users answered more than 50 questions

5877 questions answered at least once
705 questions answered more than 10 times
28 questions answered more than 30 times
30467 TEST SAMPLES
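The "answered more than N" counts for users and questions can be reproduced with a sketch along these lines. It assumes rows of `(question_id, user_id, label)`; `counts_above` is a hypothetical helper name, and the column order is an assumption about the file layout.

```python
from collections import Counter


def counts_above(rows, key_index, thresholds):
    """Count how many IDs have more than t positive labels, for each t.

    rows: iterable of (question_id, user_id, label) tuples.
    key_index: 0 to group by question, 1 to group by user.
    thresholds: iterable of ints; a threshold of 0 means "at least once".
    """
    # Number of answered (label == 1) samples per question or per user.
    answered = Counter(row[key_index] for row in rows if int(row[2]) == 1)
    return {t: sum(1 for c in answered.values() if c > t) for t in thresholds}
```

For example, `counts_above(rows, 1, [0, 10, 50])` would give the three user figures, and `counts_above(rows, 0, [0, 10, 30])` the three question figures.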