Optimal-Learning-Lab / LKT


Required dataset columns for the analysis #12

Closed: erkaner closed 1 year ago

erkaner commented 1 year ago

I have a dataset of students' answers to multiple-choice questions in different tests. The dataset includes the following column headers: StudentId, ObjectiveId, QuestionId, Date, and IsCorrect. I'm not sure if I need to format the data in a specific way before conducting the analysis. I appreciate any information.

imrryr commented 1 year ago

How are the different tests marked in the data? There isn't a clear column for that. Is the Date column a timestamp that can be used to order the file? Here is the Basic Operations vignette (available from CRAN): https://cran.r-project.org/web/packages/LKT/vignettes/Basic_Operations.html

It shows the data format. Rename your StudentId column to Anon.Student.Id, and rename your IsCorrect column to Outcome, with the values CORRECT and INCORRECT.
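The renaming described above could look roughly like the following in base R. This is a minimal sketch, not the package's own preprocessing: the toy data frame `val` and its values are made up for illustration, and it assumes IsCorrect is coded as 1/0 (or TRUE/FALSE). Check the linked vignette for any further derived columns the package expects.

```r
# Toy data using the column names from the question; the values are made up.
val <- data.frame(
  StudentId   = c("s1", "s1", "s2"),
  ObjectiveId = c("o1", "o1", "o2"),
  QuestionId  = c("q1", "q2", "q3"),
  Date        = c("2023-01-05", "2023-02-10", "2023-01-07"),
  IsCorrect   = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Rename the two columns that LKT expects under different names.
names(val)[names(val) == "StudentId"] <- "Anon.Student.Id"
names(val)[names(val) == "IsCorrect"] <- "Outcome"

# Recode the outcome as text labels.
# Assumption: the original column was coded 1/0 (or TRUE/FALSE).
val$Outcome <- ifelse(val$Outcome == 1, "CORRECT", "INCORRECT")
```

If IsCorrect is stored differently (for example as "yes"/"no"), the comparison in the last line would need to change accordingly.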

Also, what is your goal in looking at tests? Since tests usually show assessment, not learning, many of the LKT features might not be appropriate. You can find student, item, and objective parameters, and if questions are grouped into objectives, you could use that grouping. Examples are here: https://cran.r-project.org/web/packages/LKT/vignettes/Examples.html

After fixing the columns, you might try this model:

imrryr commented 1 year ago

modelob <- LKT(data = val, interc = FALSE, components = c("Anon.Student.Id", "QuestionId", "ObjectiveId"), features = c("intercept", "intercept", "lineafm$"))

After correcting the columns, this model will give a parameter for each student's ability, a parameter for each item's difficulty, and a separate learning slope for each objective. Measuring that slope requires two or more repetitions of items within each objective.

imrryr commented 1 year ago

Also, the file should be sorted by student, and each student's rows should be in temporal order.
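One way to get that ordering in base R is sketched below. The toy data and the date format "%Y-%m-%d" are assumptions for illustration; a real Date column may need a different format string or a full timestamp parse.

```r
# Toy data, deliberately out of order; the values are made up.
val <- data.frame(
  Anon.Student.Id = c("s2", "s1", "s1"),
  Date            = c("2023-01-07", "2023-02-10", "2023-01-05"),
  stringsAsFactors = FALSE
)

# Parse the date strings so they sort chronologically rather than as text.
# Assumption: dates are stored as year-month-day strings.
val$Date <- as.Date(val$Date, format = "%Y-%m-%d")

# Sort by student first, then by date within each student.
val <- val[order(val$Anon.Student.Id, val$Date), ]
rownames(val) <- NULL
```

If several responses share the same date (for example, all questions within one test), a within-test ordering column would be needed as a further sort key.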

erkaner commented 1 year ago

Thanks a lot for the detailed response! Yes, there is actually also a column indicating the test id.

I am confused by this: "since tests show assessment, not learning usually, many of the LKT features might not be appropriate." Do you mean that, in general, KT models are not intended to be applied to students' responses to MC or TF questions in tests (or quizzes)?

imrryr commented 1 year ago

How do your tests show learning DURING the test? That's what I mean. Many of the features measure change across repetitions. If there are no repetitions (of something), the feature is not computed as intended. If you tell me the goal for the models, it may help.
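A quick way to check whether a dataset contains such repetitions is to count attempts per student and objective. The sketch below uses made-up data and hypothetical variable names (`counts`, `repeated`); it only illustrates the check and is not part of LKT.

```r
# Toy data; the values are made up.
val <- data.frame(
  Anon.Student.Id = c("s1", "s1", "s1", "s2", "s2"),
  ObjectiveId     = c("o1", "o1", "o2", "o1", "o2"),
  stringsAsFactors = FALSE
)

# Cross-tabulate attempts: rows are students, columns are objectives.
counts <- table(val$Anon.Student.Id, val$ObjectiveId)

# Objectives that at least one student attempted two or more times.
repeated <- colnames(counts)[colSums(counts >= 2) > 0]
```

Objectives missing from `repeated` have no within-student repetition, so a learning slope for them would not be measured as intended.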

erkaner commented 1 year ago

Thanks for your patience and clarification!

Let me explain the context a bit better. In preparing for high-stakes exams, students take practice tests at specific points, which help instructors check students' knowledge levels and identify whether more practice is necessary. Across multiple tests, yes, they repeatedly practice a specific concept.

So, basically, students go through a learning journey and take tests at multiple points, and I want to identify the point at which they mastered a specific concept. In my case, I even know the specific learning objectives associated with each concept, which I think is more useful for the LKT analysis.

imrryr commented 1 year ago

Ah, I see. Then the methods in LKT should work very well. I think the model I suggested is a great place to start. I can offer advice if you want to make a more sophisticated model of learning or need specific outputs. Tell me about any technical difficulties also.