jianxiangyu / MEOW

MIT License

Question about data #1

Open EmotionK opened 9 months ago

EmotionK commented 9 months ago

Could you please explain what each file in the data folder contains? For example, what is stored in nei_a.npz?

jianxiangyu commented 3 days ago

I apologize for the late reply; I did not notice this issue earlier. In the data folder, the node types of each dataset are denoted by lowercase abbreviations. For example, in the ACM dataset, ‘p’ stands for paper, ‘a’ stands for author, and ‘s’ stands for subject.

• nei_a.npz represents the connections between paper-type nodes and author-type nodes.
• a_feat.npz represents the node features for author-type nodes.
• pap.npz represents the induced adjacency matrix for the meta-path PAP.
• labels.npy represents the labels for the papers.
• train_20.npy represents the node indices used as training data for downstream node classification, with 20 labeled nodes per class.
• val_20.npy and test_20.npy represent the validation and test sets for the same setting.
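
To check the files listed above for yourself, a minimal sketch along the following lines may help. It is not taken from the repository's code: `describe_npz` and `labels_per_class` are hypothetical helper names, the `data/acm/` path in the usage comments is an assumption, and whether labels are stored as class ids or one-hot rows is also an assumption (both cases are handled). It only relies on `numpy.load`, which can list the arrays inside any `.npz` archive.

```python
import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array in a .npz archive.

    Hypothetical helper: it only lists what is stored, without assuming
    whether the archive holds a dense array or a sparse matrix.
    """
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


def labels_per_class(labels, indices):
    """Count how many of the given node indices fall into each class.

    Assumption: labels are either a 1-D array of class ids or a 2-D
    one-hot array; the latter is converted with argmax.
    """
    labels = np.asarray(labels)
    if labels.ndim == 2:
        labels = labels.argmax(axis=1)
    classes, counts = np.unique(labels[np.asarray(indices)], return_counts=True)
    return dict(zip(classes.tolist(), counts.tolist()))


# Hypothetical usage, assuming the ACM files sit under data/acm/:
# print(describe_npz("data/acm/pap.npz"))
# labels = np.load("data/acm/labels.npy")
# train_idx = np.load("data/acm/train_20.npy")
# print(labels_per_class(labels, train_idx))
```

If the training split matches the description above, the second call should report 20 nodes for every class.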

The same naming convention applies to the other datasets.