arghosh / AKT

Why don't any of the models account for repeated response sequences with different skill tags? #5

Closed laizef closed 3 years ago

laizef commented 3 years ago
In both assistment2009 and assistment2017, some problems are tagged with more than one skill. For example, lines 854-856 of assist2009_pid_test1.csv contain this sequence:

problemId 7374 7374 7362 7362 7421 7421 8287 8287 7372 7372 7425 7425
skillId 37 54 37 54 37 54 45 54 37 54 37 54
correct 0 0 1 1 0 0 1 1 1 1 1 1

Each of problems 7374, 7362, 7421, 8287, 7372, and 7425 is tagged with 2 skills. The student acts only 6 times, but 12 actions are recorded. We should not predict the performance at the 2nd, 4th, 6th, 8th, 10th, and 12th steps from the information at the 1st, 3rd, 5th, 7th, 9th, and 11th steps, respectively, because that information is unavailable in reality: the response is not yet known when the prediction has to be made. In addition, the performance at the 2nd, 4th, 6th, 8th, 10th, and 12th steps is identical to that at the 1st, 3rd, 5th, 7th, 9th, and 11th steps, respectively, because each pair comes from the same action.
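To make the point concrete, here is a minimal sketch of one way to collapse the duplicated rows back into one interaction per student action. The function name `collapse_repeated_steps` is hypothetical and is not part of the AKT repo. It assumes that consecutive rows with the same problem ID always belong to the same action, which would wrongly merge a student answering the same problem twice in a row.

```python
from itertools import groupby


def collapse_repeated_steps(problem_ids, skill_ids, correct):
    """Merge consecutive rows that share a problem ID into one interaction.

    Assumption: consecutive rows with the same problem ID come from a
    single student action that was duplicated once per skill tag.
    """
    merged = []
    rows = zip(problem_ids, skill_ids, correct)
    for pid, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        skills = [skill for _, skill, _ in group]
        # Every duplicated row carries the same response, so keep the first.
        merged.append((pid, skills, group[0][2]))
    return merged


# The sequence quoted above from assist2009_pid_test1.csv.
problem_ids = [7374, 7374, 7362, 7362, 7421, 7421, 8287, 8287, 7372, 7372, 7425, 7425]
skill_ids = [37, 54, 37, 54, 37, 54, 45, 54, 37, 54, 37, 54]
correct = [0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1]

interactions = collapse_repeated_steps(problem_ids, skill_ids, correct)
for pid, skills, response in interactions:
    print(pid, skills, response)
```

With the 12 recorded rows as input, this yields 6 interactions, one per real student action, each carrying both skill tags.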

In fact, this problem was pointed out by Xiong et al. in "Going Deeper with Deep Knowledge Tracing" (2016). Why don't any of the models account for repeated response sequences with different skill tags?

In assistment2017, some problems are tagged with more than one skill as well, but in your processed data each action carries only one skill. For example, in lines 14-16 of assist2017_pid_test1.csv, problem 877 has skill 6, but in lines 10-12, problem 877 has skill 65. I don't know whether this disturbs the Rasch model-based embeddings in AKT.
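A quick way to check how widespread this is would be to collect the set of skills seen for each problem. The sketch below is only illustrative: `skills_per_problem` is a hypothetical helper, and the toy rows contain just the problem 877 pairs mentioned above plus one made-up problem (900) for contrast. Reading the real CSV files is left out because their exact layout is not described here.

```python
from collections import defaultdict


def skills_per_problem(problem_ids, skill_ids):
    """Map each problem ID to the set of skill IDs it appears with."""
    mapping = defaultdict(set)
    for pid, sid in zip(problem_ids, skill_ids):
        mapping[pid].add(sid)
    return mapping


# Toy rows: problem 877 with skills 65 and 6 as reported above;
# problem 900 is a hypothetical single-skill problem.
problem_ids = [877, 877, 900]
skill_ids = [65, 6, 6]

mapping = skills_per_problem(problem_ids, skill_ids)
# Keep only problems that show up under more than one skill.
multi_skill = {pid: sorted(sids) for pid, sids in mapping.items() if len(sids) > 1}
print(multi_skill)
```

Any problem that ends up in `multi_skill` appears under different skills in different actions, which is the situation described for problem 877.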

arghosh commented 3 years ago

I agree. This repo (and the results) mainly follows the (not-so-correct) settings used in DKVMN, SAKT, and DKT, to keep parity with previous results. In Table 6 of our paper, we list the results for your setting. It is also easy to add multiple skill tags for each question in AKT/DKT/DKVMN/SAKT.
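As one possible illustration of what "multiple skill tags per question" could look like, the sketch below averages the embeddings of all skills attached to a question into a single vector. This is an assumption for illustration only, not necessarily how Table 6 was produced or how the repo would implement it. The name `mean_pool_skill_embeddings`, the embedding size, and the random embedding table are all hypothetical.

```python
import random


def mean_pool_skill_embeddings(skill_ids, embedding_table):
    """Average the embedding vectors of all skills tagged on one question."""
    vectors = [embedding_table[sid] for sid in skill_ids]
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


random.seed(0)
embedding_dim = 4  # hypothetical size, chosen only for the example
# Hypothetical embedding table for the skills in the quoted sequence.
embedding_table = {
    sid: [random.gauss(0.0, 1.0) for _ in range(embedding_dim)]
    for sid in (37, 45, 54)
}

# Problem 7374 is tagged with skills 37 and 54 in the example above.
question_vector = mean_pool_skill_embeddings([37, 54], embedding_table)
print(question_vector)
```

With a pooled vector like this, each student action stays a single step in the sequence, so no step is predicted from a duplicate of its own response.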