XiaoxinHe / TAPE

Official Implementation of ICLR 2024 paper "Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning"
https://arxiv.org/abs/2305.19523
MIT License

LLM's prediction acc #24

Closed pricexu closed 1 month ago

pricexu commented 1 month ago

Hi Xiaoxin,

Thanks for this interesting work. I was checking the accuracy of the LLM's predictions using the data you provide in "gpt_preds". However, the accuracy I measured differs from what you reported. I used the following piece of code, which copies your load_gpt_preds function:

import csv

def load_gpt_preds(folder, dataset):
    preds = []
    fn = f"{folder}/{dataset}.csv"
    with open(fn, 'r') as file:
        reader = csv.reader(file)
        for row in reader:
            inner_list = []
            for value in row:
                inner_list.append(int(value))
            preds.append(inner_list)
    return preds

folder = "/MY_FOLDER/TAPE/gpt_preds"
datasets = ['cora', 'pubmed', 'ogbn-arxiv', 'ogbn-products', 'arxiv_2023']
for dataset in datasets:
    _, data = load_data(dataset, 'MY_ROOT_FOLDER')  # the data-loading function from the OGB package
    gt = data.y.flatten().tolist()
    for topk in range(1, 4):
        preds = load_gpt_preds(folder, dataset)
        preds = [x[:topk] for x in preds]
        assert len(preds) == len(gt)
        acc = sum([1 for i in range(len(gt)) if gt[i] in preds[i]]) / len(gt)
        print(f"Dataset: {dataset}, Top-{topk} Accuracy: {acc:.4f}")

and here is what I see:

Dataset: cora, Top-1 Accuracy: 0.1721
Dataset: cora, Top-2 Accuracy: 0.2430
Dataset: cora, Top-3 Accuracy: 0.2489
Dataset: pubmed, Top-1 Accuracy: 0.3645
Dataset: pubmed, Top-2 Accuracy: 0.3965
Dataset: pubmed, Top-3 Accuracy: 0.3980
Dataset: ogbn-arxiv, Top-1 Accuracy: 0.7350
Dataset: ogbn-arxiv, Top-2 Accuracy: 0.8756
Dataset: ogbn-arxiv, Top-3 Accuracy: 0.9240
Dataset: ogbn-products, Top-1 Accuracy: 0.7440
Dataset: ogbn-products, Top-2 Accuracy: 0.8513
Dataset: ogbn-products, Top-3 Accuracy: 0.8921
Dataset: arxiv_2023, Top-1 Accuracy: 0.7356
Dataset: arxiv_2023, Top-2 Accuracy: 0.8830
Dataset: arxiv_2023, Top-3 Accuracy: 0.9289

I think the accuracy on ogbn-arxiv matches what you reported, but the accuracies on Cora and PubMed are much lower than the 0.6769 and 0.9342 you mentioned.

I am wondering whether the files "cora.csv" and "pubmed.csv" are wrong. Could you do a very quick check? I would greatly appreciate it!

Zhe

pricexu commented 1 month ago

In addition, I saw this question marked as solved in #11, but after changing my code to the format used in #11, the results are still the same (~0.16 accuracy on Cora and ~0.36 on PubMed). Since the accuracies are only problematic on Cora and PubMed, maybe the node orders in "cora.csv" and "pubmed.csv" are not the original orders from the OGB package? A quick toy simulation is consistent with this, as the sketch below shows.
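
As a sanity check, here is a small hypothetical simulation (toy numbers, not TAPE data or code): with perfect predictions but a permuted node ordering, top-1 accuracy collapses to roughly chance, about 1/7 ≈ 0.14 for Cora's 7 classes, which is close to the ~0.17 above.

import numpy as np

rng = np.random.default_rng(0)
n, num_classes = 2708, 7                  # Cora has 2708 nodes and 7 classes
labels = rng.integers(0, num_classes, n)
preds = labels.copy()                     # perfect predictions under the correct ordering
perm = rng.permutation(n)                 # simulate reading labels in a different node order
acc = (preds == labels[perm]).mean()
print(f"Top-1 accuracy under a mismatched ordering: {acc:.4f}")  # ~0.14, i.e. chance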

XiaoxinHe commented 1 month ago

Hi Zhe,

The mismatch might be due to the data loader. For Cora and PubMed, you should use the load_data function we provide: https://github.com/XiaoxinHe/TAPE/blob/26f1e43b6aa9de39a8f68ab79d1f2b607d8baf01/core/data_utils/load.py#L26, as the node ordering differs from the original order in the OGB package.

Please try replacing the data loader, and feel free to let me know if that doesn't work. Thanks!
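
For reference, here is a minimal sketch of the corrected check, assuming TAPE's repo root is on the Python path; the exact load_data signature and return values may differ, so check load.py at the link above.

from core.data_utils.load import load_data  # TAPE's loader, not OGB's

for dataset in ['cora', 'pubmed']:
    data, num_classes = load_data(dataset)        # exact signature/returns: see load.py
    gt = data.y.flatten().tolist()
    preds = load_gpt_preds("gpt_preds", dataset)  # function from the snippet above
    top1_acc = sum(int(g == p[0]) for g, p in zip(gt, preds)) / len(gt)
    print(f"{dataset}: Top-1 accuracy = {top1_acc:.4f}")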

pricexu commented 1 month ago

Thank you, Xiaoxin! It helps. :-)