KeyError - Githubissues

CodeSniperYang commented 3 months ago

Hi, hello! Thank you very much for providing the source code for our reference and study.

In a recent study, we planned to use your BIGCF as a SOTA baseline, but an error occurred at runtime. We tried changing the dataset, but that did not solve the problem. The dataset follows your format: coo_matrix. Dataset source: "SSLRec: A Self-Supervised Learning Framework for Recommendation"

Below is the error message. It seems to be a problem with a key. Do you have any suggestions for fixing it?

already create adjacency matrix (45304, 45304) 0.24800324440002441
Start Training
Sampling Data: 100%|██████████| 409600/409600 [00:03<00:00, 104187.04it/s]
100%|██████████| 40/40 [00:05<00:00, 7.78it/s]
Traceback (most recent call last):
  File "D:\Yustinian\MODEL\BIGCF-main\main.py", line 131, in <module>
    test_ret = eval_PyTorch(_model, data_generator, eval(args.Ks))
  File "D:\Yustinian\MODEL\BIGCF-main\utility\batch_test.py", line 96, in eval_PyTorch
    train_items = list(data_generator.train_items[user_batch[i]])
KeyError: 16889

Process finished with exit code 1

BlueGhostYi commented 3 months ago

Hi, thank you very much for your interest in our work, and sorry for any errors caused by our negligence. Based on your description, the likely cause is that user #16889 is missing from the dataset you used (or the user is present in the test set but not in the training set).

You can check the original dataset to see whether the training set and the test set contain the same number of users. You are also encouraged to share a download link for the dataset you used so that we can dig deeper and locate the problem.
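The check described above can be sketched as follows. This is a minimal illustration and not part of BIGCF: the helper name `users_missing_from_train` is hypothetical, and it assumes the row indices of each matrix are user IDs, as in the conversion script later in this thread.

```python
def users_missing_from_train(train_rows, test_rows):
    """Return sorted user IDs that appear in the test rows but not in the training rows.

    Both arguments are iterables of user IDs, for example the .row attribute
    of a scipy coo_matrix (assumed here to index users).
    """
    train_users = {int(u) for u in train_rows}
    test_users = {int(u) for u in test_rows}
    return sorted(test_users - train_users)


# Toy data: user 1 has test interactions but no training interactions.
toy_train_rows = [0, 0, 2]
toy_test_rows = [0, 1, 2]
missing = users_missing_from_train(toy_train_rows, toy_test_rows)
print(missing)
```

With real data, one would pass `train_mat.row` and `test_mat.row` after loading both pickled matrices. A non-empty result would be consistent with the KeyError above.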

Additionally, if your experiments use the framework consistent with classic work such as LightGCN, you are also welcome to use another version of BIGCF (https://github.com/BlueGhostYi/BIGCF-full-sample).

I hope the above response is helpful and thank you again for your interest and understanding. If you still have any doubts, please do not hesitate to contact me at any time.

CodeSniperYang commented 3 months ago

Thank you for your help.

My initial guess was that, with coo_matrix input, a missing ID causes an error when the training set is sampled. However, I tried feeding another similar dataset into BIGCF-full-sample and got good results.

When I have free time (maybe tomorrow), I will convert the coo_matrix data file into a .txt file in the format [userid, [item1, item2, item3]]. I will then use that file as the BIGCF-full-sample dataset and check the results.
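To illustrate the failure mode, here is a small sketch that assumes `train_items` behaves like a plain dict keyed by user ID (the traceback suggests this, but it is an assumption). The dict contents and the helper `items_seen_in_training` are hypothetical, and this is not a patch to BIGCF's `batch_test.py`.

```python
# Hypothetical stand-in for data_generator.train_items: user ID -> training item IDs.
train_items = {0: [1, 2], 2: [0]}


def items_seen_in_training(train_items, user):
    """Return the user's training items, or an empty list if the user has none."""
    # A direct train_items[user] raises KeyError for a user with no training data;
    # dict.get with a default avoids that.
    return list(train_items.get(user, []))


user_batch = [0, 1, 2]  # user 1 has no training interactions
seen = [items_seen_in_training(train_items, u) for u in user_batch]
print(seen)
```

Whether silently treating such users as having no training items is appropriate depends on the evaluation protocol; fixing the dataset split may be the better option.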

Thank you again for your help

BlueGhostYi commented 3 months ago

Thanks for your response. Please note that BIGCF-full-sample adopts a different sampling strategy from the paper, so the hyperparameter settings need to be adjusted accordingly; the details can be found in the readme.md of BIGCF-full-sample. If you still have any questions, please do not hesitate to contact me at any time.

CodeSniperYang commented 3 months ago

Hello, it has been 5 days since our last communication. Today I converted the coo-matrix dataset to a normal txt format and tried the code (BIGCF-full-sample). I am very happy that the model runs, and I am currently adjusting the hyperparameters. Thanks again for your work; I'm excited to use BIGCF as a SOTA baseline in my paper.

Here is the code that converts the coo-matrix to txt (there is a warning, but it's insignificant):

import pickle

name = 'dataset-name'
file_path = 'D:\\Yustinian\\MODEL\\BIGCF-full-sample-main\\dataset\\'

# Load the pickled coo_matrix (row index = user, column index = item).
with open(file_path + name + '\\train_mat.pkl', 'rb') as file:
    coo = pickle.load(file)

rows = coo.row
cols = coo.col

# Group item IDs by user ID, casting NumPy integers to plain ints.
user_items = {}
for row, col in zip(rows, cols):
    user_items.setdefault(int(row), []).append(int(col))

# Write one line per user: "user item1 item2 ...".
with open(file_path + name + '\\train.txt', 'w') as file:
    for user, items in user_items.items():
        file.write(f"{user} {' '.join(map(str, items))}\n")
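As a quick sanity check on the output, the lines written by the script above can be parsed back into a dict. This is only a sketch: the function name `parse_interactions` is hypothetical, and it assumes the space-separated "user item1 item2 ..." layout that the script writes.

```python
def parse_interactions(lines):
    """Parse lines of the form 'user item1 item2 ...' into {user: [items]}."""
    user_items = {}
    for line in lines:
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        user_items[int(parts[0])] = [int(p) for p in parts[1:]]
    return user_items


# Toy lines in the same layout as train.txt.
sample_lines = ["0 1 2\n", "2 0\n"]
parsed = parse_interactions(sample_lines)
print(parsed)
```

For a real file, one could pass the open file object directly, since iterating over it yields lines.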

BlueGhostYi / BIGCF

KeyError #2