data processing on assistments

dxywill / pytorch_dkvmn

PyTorch implementation of DKVMN


data processing on assistments #1

Open Chunpai opened 3 years ago

Chunpai commented 3 years ago

Hi Xinyi,

Thank you for making this implementation public. I checked your code for processing the ASSISTments dataset, and it seems you use the skill ID rather than the question ID for training. Is that how it is supposed to be done, or was it a choice specific to your experimental setting? Thanks.

dxywill commented 3 years ago

@Chunpai The goal of knowledge tracing is to track the mastery level of each skill. Ideally, each question corresponds to one skill. In practice, however, a question often involves more than one skill. The ASSISTments dataset handles this in two ways. First, if two skills are associated with a question, the dataset can contain two rows for it, one per skill. Alternatively, the two skills can be combined into a new skill, so the dataset contains only one row.
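To make the two conventions above concrete, here is a minimal sketch in plain Python. The `Row` layout, the field names, and the `+` separator for combined skills are assumptions made for illustration; they are not the actual ASSISTments column names or this repository's preprocessing code.

```python
from collections import namedtuple

# Hypothetical interaction log entry (not the real ASSISTments schema).
Row = namedtuple("Row", ["student", "question", "skills", "correct"])


def expand_rows(rows):
    """Convention 1: emit one row per (interaction, skill) pair."""
    return [
        (r.student, r.question, skill, r.correct)
        for r in rows
        for skill in r.skills
    ]


def combine_skills(rows):
    """Convention 2: merge all skills of a question into one joint skill label."""
    return [
        (r.student, r.question, "+".join(sorted(r.skills)), r.correct)
        for r in rows
    ]


rows = [
    Row("s1", "q10", ["addition"], 1),
    Row("s1", "q11", ["addition", "fractions"], 0),
]

expanded = expand_rows(rows)     # the two-skill question becomes two rows
combined = combine_skills(rows)  # the two-skill question stays one row
```

With the first convention the sequence for a student gets longer (one step per skill), while with the second the number of distinct skill labels grows, so the choice affects both sequence length and vocabulary size.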

Knowledge tracing should always be conducted at the skill level, not the question level.

Chunpai commented 3 years ago

@dxywill Thanks Xinyi.

This is confusing, since the input to DKT and DKVMN is (q, a), which should be the question ID and the response, not the skill ID and the response, right? For predicting a student's score on a question, I think we should feed in the question ID, not the skill ID, right?

Also, the correlation weight in DKVMN represents the correlation between an exercise and each latent concept. If the correlation weight vector contains numerical values, does that mean each question is associated with multiple latent concepts or skills?
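For reference, the DKVMN paper computes the correlation weight as a softmax over the inner products between the exercise embedding and each slot of the key memory. The sketch below reproduces that step in plain Python; the function name, the toy embedding, and the toy key memory are assumptions for illustration and are not taken from this repository.

```python
import math


def correlation_weight(key_embedding, key_memory):
    """Softmax over inner products of the exercise embedding with each key slot."""
    scores = [
        sum(k * m for k, m in zip(key_embedding, slot))
        for slot in key_memory
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy values (hypothetical): a 2-dimensional embedding and 3 latent concept slots.
k_t = [1.0, 0.0]
M_k = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

w = correlation_weight(k_t, M_k)
```

Because of the softmax, every entry of `w` is strictly positive and the entries sum to 1, so under this formulation an exercise attends softly to all latent concepts, with most of the weight on the closest ones.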