Training data issues

I found that the training data for stage 1 and stage 2 is missing some of the benchmark datasets. Could you help explain this? Thank you!

graph_matching.json in stage 1: Counter({'Industrial': 84003, 'arxiv': 74075, 'cora': 25120})

Arxiv-PubMed-mix-NC-LP.json in stage 2: Counter({'arxiv': 94441, 'pubmed': 73660})
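
For anyone who wants to reproduce counts like these, here is a minimal sketch rather than the exact script used. It assumes each file is a JSON list of records, and that each record has an `id` string whose prefix before the first underscore names the source dataset (for example `arxiv_...`). Both the `id` field name and the prefix convention are assumptions, and the helper names are hypothetical, so adjust them to the actual schema.

```python
import json
from collections import Counter


def count_sources(records, key="id"):
    """Count records per source dataset.

    Assumption: records[key] is a string such as "arxiv_train_12", and the
    text before the first underscore is the dataset name.
    """
    return Counter(str(record[key]).split("_", 1)[0] for record in records)


def count_sources_in_file(path, key="id"):
    """Load a JSON list of records from `path` and count them per source."""
    with open(path, encoding="utf-8") as f:
        return count_sources(json.load(f), key)
```

Calling `count_sources_in_file("graph_matching.json")` would then return a `Counter` keyed by dataset name, provided the assumptions above hold for that file.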

What is the "Industrial" benchmark? I didn't see that name anywhere in the paper. Also, why does graph_matching.json lack PubMed, while Arxiv-PubMed-mix-NC-LP.json doesn't include Cora?

HKUDS / GraphGPT

Training data issues #73