jennyzhang0215 / DKVMN

Dynamic Key-Value Memory Networks for Knowledge Tracing

Which assistment 2009? #4

Open clara2911 opened 4 years ago

clara2911 commented 4 years ago

Which ASSISTments 2009 dataset are you using? According to this link (https://sites.google.com/site/assistmentsdata/home/assistment-2009-2010-data/skill-builder-data-2009-2010), there are two versions, released after the duplicate-row problem was detected: version (1) with one row per student-problem-skill, and version (2) with one row per student-problem. If a problem has multiple skills, they are given as skill1_skill2 (see image below).

Are you using version 1 or version 2? Thank you for your help!

image
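To make the difference between the two layouts concrete, here is a minimal sketch of converting between them. It is not the authors' preprocessing: the tuple layout `(student, problem, skill, correct)`, the function names, and the sample values are all hypothetical. It also assumes that the joined `skill1_skill2` form belongs to the one-row-per-student-problem version, that skill IDs contain no underscore, and that each student attempts a problem only once.

```python
def collapse_skills(rows):
    """One row per (student, problem, skill) -> one row per (student, problem).

    Skills of the same (student, problem) pair are joined as "skill1_skill2".
    Assumption: a student attempts each problem at most once.
    """
    merged = {}
    for student, problem, skill, correct in rows:
        entry = merged.setdefault(
            (student, problem), {"skills": [], "correct": correct}
        )
        if skill not in entry["skills"]:
            entry["skills"].append(skill)
    return [
        (student, problem, "_".join(entry["skills"]), entry["correct"])
        for (student, problem), entry in merged.items()
    ]


def expand_skills(rows):
    """One row per (student, problem) -> one row per (student, problem, skill).

    Assumption: skill IDs themselves never contain an underscore.
    """
    return [
        (student, problem, skill, correct)
        for student, problem, skills, correct in rows
        for skill in skills.split("_")
    ]


# Hypothetical sample data, not taken from the real dataset.
per_skill_rows = [
    ("u1", "p1", "12", 1),
    ("u1", "p1", "47", 1),
    ("u1", "p2", "12", 0),
]
per_problem_rows = collapse_skills(per_skill_rows)
```

Note that the per-skill layout repeats the same response once per skill, so the number of rows (and any sequence length derived from it) differs between the two versions.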

dxywill commented 4 years ago

@clara2911 I think neither version is used. There seem to be a lot of versions of ASSISTments 2009 (do they keep updating this dataset?). I downloaded this version https://drive.google.com/file/d/0B3f_gAH-MpBmUmNJQ3RycGpJM0k/view?usp=sharing a few weeks ago and tested it on a DKT model; I could only get an AUC of around 0.74. (My DKT model can reproduce the results from this paper https://files.eric.ed.gov/fulltext/ED592679.pdf, so I guess it's not a problem with my model.)

clara2911 commented 4 years ago

Thanks a lot for your answer - indeed, there are a lot of different ASSISTments 2009 versions. I will try the version you mentioned.

Still, it would be great if the authors of the paper could give a definitive answer on which version they used.

jennyzhang0215 commented 4 years ago

Sorry for the late reply. I have no idea which version we used, but I remember that in our experiments one exercise maps to one skill, so it may be the version with one row per student-problem-skill. Alternatively, could you use the dataset we preprocessed in the Data folder? Thanks.
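One way to test the "one exercise maps to one skill" property on a candidate download is sketched below. This is only an illustration under assumptions: the `(student, problem, skill, correct)` tuple layout, the function name, and the sample rows are hypothetical and do not come from the repository's Data folder.

```python
from collections import defaultdict


def problems_with_multiple_skills(rows):
    """Return {problem: sorted skills} for problems tagged with more than one skill.

    An empty result means every exercise maps to exactly one skill.
    """
    skills_by_problem = defaultdict(set)
    for _student, problem, skill, _correct in rows:
        skills_by_problem[problem].add(skill)
    return {
        problem: sorted(skills)
        for problem, skills in skills_by_problem.items()
        if len(skills) > 1
    }


# Hypothetical rows in the one-row-per-student-problem-skill layout.
sample_rows = [
    ("u1", "p1", "12", 1),
    ("u1", "p1", "47", 1),
    ("u2", "p2", "12", 0),
]
violations = problems_with_multiple_skills(sample_rows)
```

If the result is non-empty for a given file, that file does not satisfy the one-exercise-one-skill property as downloaded, and some extra preprocessing step would be needed to get there.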

dxywill commented 4 years ago

@clara2911 I implemented a PyTorch version (https://github.com/dxywill/pytorch_dkvmn) that directly uses the file https://drive.google.com/file/d/0B3f_gAH-MpBmUmNJQ3RycGpJM0k/view?usp=sharing, if you are interested.