RUCAIBox / RecBole-GNN

Efficient and extensible GNNs enhanced recommender library based on RecBole.
MIT License
167 stars, 37 forks

New dataset not being detected. Where to save the atomic files? #71

Closed nicksukie closed 11 months ago

nicksukie commented 11 months ago

Describe the bug

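I followed the clear instructions here to add a new dataset to RecBole. I created the atomic files, then created the dataset and dataloader as instructed.

But when I run with the new dataset, I get the error: ValueError: Neither [dataset/data34452] exists in the device nor [data34452] a known dataset name

However, the dataset `data34452` definitely exists. Where should I save the atomic files so that I can run RecBole on my data? I have tried moving them from my personal directory into the RecBole Python package directory, but that did not work.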


To Reproduce

YAML file:


data_path: /data/nicholas/abc/RecBole-GNN-main/recbole_gnn/data/
dataset: data34452

USER_ID_FIELD: user_id
ITEM_ID_FIELD: item_id
RATING_FIELD: rating
TIME_FIELD: timestamp

load_col:
    inter: [user_id, item_id, rating, timestamp]
    user: [user_id]
    item: [item_id, category_id, category_level]

eval_args:
    split: {'RS': [8,1,1]}
    group_by: user
    order: RO
Code for creating the new dataset:
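# Imports assumed; the original post omitted them.
from recbole.config import Config
from recbole.data import create_dataset, data_preparation

model_name = 'LightGCN'
dataset_name = 'data34452'
yaml_path = '/data/nicholas/abc/RecBole-GNN-main/recbole_gnn/data/data34452/data34452.yaml'

if __name__ == '__main__':
    # Build the config from the YAML file, then load and split the dataset.
    config = Config(model=model_name, dataset=dataset_name, config_file_list=[yaml_path])
    dataset = create_dataset(config)
    train_data, valid_data, test_data = data_preparation(config, dataset)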
Script for running:

I simply use the `run_recbole_gnn.py` file to run RecBole on the new dataset, with the command python run_recbole_gnn.py -m LightGCN -d data34452, but I get the error: ValueError: Neither [dataset/data34452] exists in the device nor [data34452] a known dataset name.

I am using a Linux machine with PyTorch 2.0, Python 3.11, and RecBole 1.1.1.

Thank you for helping me solve this problem.
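The error message names `dataset/data34452`, a relative path, so it is resolved against the directory the command is run from. The following is a minimal sketch for checking whether the atomic files are where such a path points. It assumes the usual RecBole layout of `<data_path>/<dataset>/<dataset>.<suffix>`; the helper name `find_atomic_files` is hypothetical and not part of RecBole.

```python
from pathlib import Path


def find_atomic_files(data_path, dataset, suffixes=("inter", "user", "item")):
    """Return {suffix: Path or None} for <data_path>/<dataset>/<dataset>.<suffix>.

    Hypothetical helper: it only inspects the file system and does not
    call into RecBole.
    """
    dataset_dir = Path(data_path) / dataset
    found = {}
    for suffix in suffixes:
        candidate = dataset_dir / f"{dataset}.{suffix}"
        found[suffix] = candidate if candidate.is_file() else None
    return found


if __name__ == "__main__":
    # "dataset/" is the directory named in the error message. Because it is
    # relative, it is looked up under the current working directory.
    for suffix, path in find_atomic_files("dataset/", "data34452").items():
        print(suffix, "->", path if path else "missing")
```

If every suffix is reported as missing, either the files live under a different `data_path` or the command is being run from a different working directory than expected.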

nicksukie commented 11 months ago

The solution was changing the path in the .yaml file to a relative path within the RecBole-GNN repo.
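The post does not give the exact relative path that was used. As a sketch only, a relative `data_path` can be derived from the absolute one in the YAML above, under the assumption that commands are launched from the repository root. Both the `repo_root` value and the helper name `relative_data_path` are illustrative.

```python
import os


def relative_data_path(data_dir, start):
    """Express data_dir relative to start, keeping a trailing slash as in the YAML."""
    return os.path.relpath(data_dir, start=start) + "/"


# Absolute data_path taken from the YAML in this issue.
data_dir = "/data/nicholas/abc/RecBole-GNN-main/recbole_gnn/data/"
# Assumption: the run command is executed from the repository root.
repo_root = "/data/nicholas/abc/RecBole-GNN-main"

# The printed value is a candidate for the data_path entry in the YAML file.
print(relative_data_path(data_dir, repo_root))
```

A relative `data_path` only resolves correctly when the script is started from the directory it is relative to, so the working directory matters as much as the value itself.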

hyp1231 commented 11 months ago

Thanks and glad to see that works!