ECNU-ILOG / HyperCDM

MIT License

Capturing Homogeneous Influence among Students: Hypergraph Cognitive Diagnosis for Intelligent Education Systems

Junhao Shen, Hong Qian*, Shuo Liu, Wei Zhang, Bo Jiang, and Aimin Zhou (*corresponding author). Shanghai Institute of AI Education, School of Computer Science and Technology, East China Normal University

This repository contains the code for the paper "Capturing Homogeneous Influence among Students: Hypergraph Cognitive Diagnosis for Intelligent Education Systems", published in the Proceedings of the 30th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024). The pre-print of the full paper is available as full paper.pdf in the asset folder.

📰 News

💻 Getting Started

File Tree

HyperCDM
│  DOA.py
│  homogeneity.py
│  main.py
│  README.md
│
├─asset
│      framework.png
│      full paper.pdf
│
└─data
    ├─a17
    │      a17TotalData.csv
    │      config.json
    │      q.csv
    │
    ├─EdNet-1
    │      config.json
    │      EdNet-1TotalData.csv
    │      q.csv
    │
    ├─junyi
    │      config.json
    │      junyiTotalData.csv
    │      q.csv
    │
    ├─Math1
    │      config.json
    │      Math1TotalData.csv
    │      q.csv
    │
    └─nips20
            config.json
            nips20TotalData.csv
            q.csv

Quick Start

We provide Math1 as a sample dataset to validate HyperCDM. You can reproduce the results by running main.py directly:

python main.py

Run with other datasets

Step 1. Prepare the dataset

Following the sample dataset, prepare the following files:

├─dataset
│  └─Your_dataset
│          config.json
│          data.csv
│          q.csv

Specifically, config.json records the necessary settings of the dataset, such as the number of students. Its format is as follows:

{
  "dataset": [string, the name of the dataset],
  "q_file": [string, the relative path of Q matrix],
  "data": [string, the relative path of response logs],
  "student_num": [int, the number of students],
  "exercise_num": [int, the number of exercises],
  "knowledge_num": [int, the number of knowledge concepts]
}
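As a quick sanity check before training, the fields listed above can be validated in a few lines of Python. This is only a minimal sketch, not part of the repository: the helper name validate_config, the example paths, and the example numbers are all made up for illustration, and the rule that the three counts must be positive is an assumption.

```python
import json

# Field names and types follow the config.json format described above.
REQUIRED_FIELDS = {
    "dataset": str,
    "q_file": str,
    "data": str,
    "student_num": int,
    "exercise_num": int,
    "knowledge_num": int,
}


def validate_config(cfg):
    """Return a list of problems found in cfg; an empty list means it looks fine."""
    problems = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in cfg:
            problems.append(f"missing field: {key}")
            continue
        value = cfg[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{key} should be of type {expected.__name__}")
        elif expected is int and value <= 0:
            # Assumption: counts are expected to be positive.
            problems.append(f"{key} should be positive")
    return problems


# Hypothetical config for a toy dataset; paths and numbers are illustrative only.
EXAMPLE_CONFIG = """
{
  "dataset": "Your_dataset",
  "q_file": "dataset/Your_dataset/q.csv",
  "data": "dataset/Your_dataset/data.csv",
  "student_num": 3,
  "exercise_num": 4,
  "knowledge_num": 2
}
"""

example_cfg = json.loads(EXAMPLE_CONFIG)
example_problems = validate_config(example_cfg)
print(example_problems or "config looks fine")
```

To check a real file, the same function can be applied to the result of json.load on the opened config.json.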

data.csv consists of response logs in the following format:

[int, student_id1],[int, question_id1],[0/1, response to question_id1]
[int, student_id1],[int, question_id2],[0/1, response to question_id2]
...
[int, student_idn],[int, question_idm],[0/1, response to question_idm]
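The response-log format above can be checked with a small parser. The sketch below assumes the file has no header row and uses comma separators, as the format listing suggests; the function name parse_response_logs and the sample rows are hypothetical and do not come from the repository.

```python
import csv
import io


def parse_response_logs(text):
    """Parse 'student_id,question_id,response' rows into a list of int triples."""
    logs = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(row)}")
        student_id, question_id, response = (int(field) for field in row)
        if response not in (0, 1):
            raise ValueError(f"line {line_no}: response must be 0 or 1, got {response}")
        logs.append((student_id, question_id, response))
    return logs


# Made-up sample: two students answering two questions.
SAMPLE_LOGS = "0,0,1\n0,1,0\n1,0,1\n"

parsed_logs = parse_response_logs(SAMPLE_LOGS)
print(parsed_logs)
```

For a file on disk, pass the file contents (for example from open(path).read()) to the same function. If your data.csv does carry a header row, drop that first line before parsing.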

q.csv contains the relationship between questions and knowledge attributes. The entry in the $i$-th row and the $j$-th column indicates whether the $i$-th question involves the $j$-th knowledge attribute.
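To make the Q matrix layout concrete, the sketch below parses a tiny made-up matrix and lists the knowledge attributes each question involves. It assumes a headerless, comma-separated file of 0/1 entries with 0-based indices; the names parse_q_matrix and attributes_of are hypothetical helpers, not functions from the repository.

```python
import csv
import io


def parse_q_matrix(text):
    """Parse a headerless 0/1 matrix: one row per question, one column per attribute."""
    rows = [[int(cell) for cell in row] for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        raise ValueError("Q matrix is empty")
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows of the Q matrix must have the same length")
    if any(cell not in (0, 1) for row in rows for cell in row):
        raise ValueError("Q matrix entries must be 0 or 1")
    return rows


def attributes_of(q_matrix, question_index):
    """Return the (0-based) indices of the attributes involved in one question."""
    return [j for j, flag in enumerate(q_matrix[question_index]) if flag == 1]


# Made-up example: 4 questions (rows) and 3 knowledge attributes (columns).
Q_TEXT = "1,0,0\n0,1,1\n1,1,0\n0,0,1\n"

q_matrix = parse_q_matrix(Q_TEXT)
for i in range(len(q_matrix)):
    print(f"question {i} involves attributes {attributes_of(q_matrix, i)}")
```

With this layout, the number of rows should match exercise_num and the number of columns should match knowledge_num in config.json.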

Step 2. Edit the code

In main.py, change the path so that it points to your configuration file.

Step 3. Run the code

python main.py

Necessary Packages

pytorch 1.13.0+cu0.11
scikit-learn 1.1.2
pandas 1.3.2
scipy 1.9.1

📞 Contact

Should you have any questions or comments, post an issue here or feel free to contact Junhao Shen (first author) and Hong Qian (corresponding author).

📄 Citation

If you find HyperCDM helpful and it inspires your research or applications, please cite it as follows.

BibTeX

@inproceedings{Shen2024hypercdm,
 author = {Shen, Junhao and 
           Qian, Hong and
           Liu, Shuo and
           Zhang, Wei and
           Jiang, Bo and 
           Zhou, Aimin},
 booktitle = {Proceedings of the 30th {SIGKDD} Conference on Knowledge Discovery and Data Mining},
 title = {Capturing Homogeneous Influence among Students: Hypergraph Cognitive Diagnosis for Intelligent Education Systems},
 year = {2024},
 address = {Barcelona, Spain},
 pages = {}
}

ACM Format

Junhao Shen, Hong Qian, Shuo Liu, Wei Zhang, Bo Jiang, and Aimin Zhou. 2024. Capturing Homogeneous Influence among Students: Hypergraph Cognitive Diagnosis for Intelligent Education Systems. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain.