RUCAIBox / RecBole-GNN

Efficient and extensible GNNs enhanced recommender library based on RecBole.
MIT License

Evaluation metrics are 0 when running SRGNN on the douban and yoochoose datasets #82

Open starletbb opened 5 months ago

starletbb commented 5 months ago

Hello authors, the run output for the problem described above is as follows:

General Hyper Parameters:
gpu_id = 0
use_gpu = True
seed = 2020
state = INFO
reproducibility = True
data_path = dataset/douban
checkpoint_dir = saved
show_progress = True
save_dataset = False
dataset_save_path = None
save_dataloaders = False
dataloaders_save_path = None
log_wandb = False

Training Hyper Parameters:
epochs = 500
train_batch_size = 4096
learner = adam
learning_rate = 0.001
train_neg_sample_args = {'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}
eval_step = 1
stopping_step = 10
clip_grad_norm = None
weight_decay = 0.0
loss_decimal_place = 4

Evaluation Hyper Parameters:
eval_args = {'split': {'LS': 'valid_and_test'}, 'order': 'TO', 'group_by': 'user', 'mode': {'valid': 'full', 'test': 'full'}}
repeatable = True
metrics = ['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']
topk = [10]
valid_metric = MRR@10
valid_metric_bigger = True
eval_batch_size = 4096
metric_decimal_place = 4

Dataset Hyper Parameters:
field_separator =
seq_separator =
USER_ID_FIELD = user_id
ITEM_ID_FIELD = item_id
RATING_FIELD = rating
TIME_FIELD = timestamp
seq_len = None
LABEL_FIELD = label
threshold = None
NEG_PREFIX = neg_
load_col = {'inter': ['user_id', 'item_id', 'rating', 'timestamp', 'likes_num']}
unload_col = None
unused_col = None
additional_feat_suffix = None
rm_dup_inter = None
val_interval = None
filter_inter_by_user_or_item = True
user_inter_num_interval = [0,inf)
item_inter_num_interval = [0,inf)
alias_of_user_id = None
alias_of_item_id = None
alias_of_entity_id = None
alias_of_relation_id = None
preload_weight = None
normalize_field = None
normalize_all = None
ITEM_LIST_LENGTH_FIELD = item_length
LIST_SUFFIX = _list
MAX_ITEM_LIST_LENGTH = 50
POSITION_FIELD = position_id
HEAD_ENTITY_ID_FIELD = head_id
TAIL_ENTITY_ID_FIELD = tail_id
RELATION_ID_FIELD = relation_id
ENTITY_ID_FIELD = entity_id
benchmark_filename = None

Other Hyper Parameters:
worker = 0
wandb_project = recbole
shuffle = True
require_pow = False
enable_amp = False
enable_scaler = False
transform = None
embedding_size = 64
step = 1
loss_type = CE
numerical_features = []
discretization = None
kg_reverse_r = False
entity_kg_num_interval = [0,inf)
relation_kg_num_interval = [0,inf)
MODEL_TYPE = ModelType.SEQUENTIAL
gnn_transform = sess_graph
training_neg_sample_num = 0
eval_setting = TO_LS,full
MODEL_INPUT_TYPE = InputType.POINTWISE
eval_type = EvaluatorType.RANKING
single_spec = True
local_rank = 0
device = cuda
valid_neg_sample_args = {'distribution': 'uniform', 'sample_num': 'none'}
test_neg_sample_args = {'distribution': 'uniform', 'sample_num': 'none'}

C:\Users\HP\anaconda3\envs\pytorch-gpu\lib\site-packages\recbole\data\dataset\dataset.py:648: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.

  feat[field].fillna(value=0, inplace=True)
C:\Users\HP\anaconda3\envs\pytorch-gpu\lib\site-packages\recbole\data\dataset\dataset.py:650: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.

  feat[field].fillna(value=feat[field].mean(), inplace=True)
20 Mar 16:34 INFO douban
The number of users: 738701
Average actions of users: 2.8767510491403816
The number of items: 29
Average actions of items: 75894.85714285714
The number of inters: 2125056
The sparsity of the dataset: 90.08018222481785%
Remain Fields: ['user_id', 'item_id', 'rating', 'timestamp', 'likes_num']
20 Mar 16:36 INFO Constructing session graphs.
100%|██████████| 1038965/1038965 [01:56<00:00, 8952.76it/s]
20 Mar 16:38 INFO Constructing session graphs.
100%|██████████| 145071/145071 [00:15<00:00, 9590.91it/s]
20 Mar 16:38 INFO Constructing session graphs.
100%|██████████| 202320/202320 [00:23<00:00, 8537.58it/s]
20 Mar 16:38 INFO SessionGraph Transform in DataLoader.
20 Mar 16:38 INFO SessionGraph Transform in DataLoader.
20 Mar 16:38 INFO SessionGraph Transform in DataLoader.
20 Mar 16:38 INFO [Training]: train_batch_size = [4096] negative sampling: [{'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}]
20 Mar 16:38 INFO [Evaluation]: eval_batch_size = [4096] eval_args: [{'split': {'LS': 'valid_and_test'}, 'order': 'TO', 'group_by': 'user', 'mode': {'valid': 'full', 'test': 'full'}}]
20 Mar 16:38 INFO SRGNN(
  (item_embedding): Embedding(29, 64, padding_idx=0)
  (gnncell): SRGNNCell(
    (incomming_conv): SRGNNConv(
      (lin): Linear(in_features=64, out_features=64, bias=True)
    )
    (outcomming_conv): SRGNNConv(
      (lin): Linear(in_features=64, out_features=64, bias=True)
    )
    (lin_ih): Linear(in_features=128, out_features=192, bias=True)
    (lin_hh): Linear(in_features=64, out_features=192, bias=True)
  )
  (linear_one): Linear(in_features=64, out_features=64, bias=True)
  (linear_two): Linear(in_features=64, out_features=64, bias=True)
  (linear_three): Linear(in_features=64, out_features=1, bias=False)
  (linear_transform): Linear(in_features=128, out_features=64, bias=True)
  (loss_fct): CrossEntropyLoss()
)
Trainable parameters: 64064
Train 0: 100%|████████████████████████| 254/254 [03:50<00:00, 1.10it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:42 INFO epoch 0 training [time: 230.78s, train loss: 651.7647]
Evaluate : 100%|██████████████████████████| 36/36 [00:14<00:00, 2.54it/s, GPU RAM: 0.44 G/2.00 G]
C:\Users\HP\anaconda3\envs\pytorch-gpu\lib\site-packages\recbole\evaluator\base_metric.py:78: RuntimeWarning: Mean of empty slice.
  avg_result = value.mean(axis=0)
C:\Users\HP\anaconda3\envs\pytorch-gpu\lib\site-packages\numpy\core\_methods.py:184: RuntimeWarning: invalid value encountered in divide
  ret = um.true_divide(
20 Mar 16:42 INFO epoch 0 evaluating [time: 14.26s, valid_score: nan]
20 Mar 16:42 INFO valid result:
recall@10 : nan mrr@10 : nan ndcg@10 : nan hit@10 : nan precision@10 : nan
Train 1: 100%|████████████████████████| 254/254 [02:55<00:00, 1.45it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:45 INFO epoch 1 training [time: 175.20s, train loss: 571.7500]
Evaluate : 100%|██████████████████████████| 36/36 [00:09<00:00, 3.79it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:46 INFO epoch 1 evaluating [time: 9.54s, valid_score: nan]
20 Mar 16:46 INFO valid result:
recall@10 : nan mrr@10 : nan ndcg@10 : nan hit@10 : nan precision@10 : nan
Train 2: 100%|████████████████████████| 254/254 [02:54<00:00, 1.46it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:48 INFO epoch 2 training [time: 174.05s, train loss: 562.1718]
Evaluate : 100%|██████████████████████████| 36/36 [00:08<00:00, 4.31it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:49 INFO epoch 2 evaluating [time: 8.39s, valid_score: nan]
20 Mar 16:49 INFO valid result:
recall@10 : nan mrr@10 : nan ndcg@10 : nan hit@10 : nan precision@10 : nan
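
For readers of the log, two of its numbers can be reproduced by hand. The sketch below is not RecBole's code. It assumes the logged user and item counts each include one padding id (consistent with `padding_idx=0` in the printed model) and that the sparsity uses the raw counts; the function name `dataset_stats` is made up for illustration. The second part shows that NumPy returns `nan` when averaging an array with zero rows, which is the situation the `Mean of empty slice` warning describes. The log does not say why the array was empty.

```python
import warnings

import numpy as np


def dataset_stats(num_users, num_items, num_inters):
    """Recompute the logged statistics (assumption: counts include one padding id each)."""
    avg_user_actions = num_inters / (num_users - 1)
    avg_item_actions = num_inters / (num_items - 1)
    sparsity = 1 - num_inters / (num_users * num_items)
    return avg_user_actions, avg_item_actions, sparsity


# Values copied from the log above.
avg_user, avg_item, sparsity = dataset_stats(738701, 29, 2125056)
print(avg_user, avg_item, f"{sparsity:.6%}")

# Averaging over axis 0 of an array with zero rows gives nan for every column.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    empty_mean = np.zeros((0, 5)).mean(axis=0)
print(empty_mean)
```

Under these assumptions the three statistics agree with the logged values, which suggests the dataset was parsed as 28 distinct items plus padding.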

How to reproduce: the yaml file is as follows:

# model config

embedding_size: 64
step: 1
loss_type: 'CE'
gnn_transform: sess_graph

# dataset config

field_separator: "\t"  # separator between fields in the dataset files
seq_separator: " "  # separator inside token_seq or float_seq fields
USER_ID_FIELD: user_id  # user id field
ITEM_ID_FIELD: item_id  # item id field
RATING_FIELD: rating  # rating field
TIME_FIELD: timestamp  # timestamp field
NEG_PREFIX: neg_  # negative sampling prefix
LABEL_FIELD: label  # label field
ITEM_LIST_LENGTH_FIELD: item_length  # sequence length field
LIST_SUFFIX: _list  # suffix of sequence fields
MAX_ITEM_LIST_LENGTH: 50  # maximum sequence length
POSITION_FIELD: position_id  # generated sequence position id

# Specifies which columns to read from which file; here user_id, item_id, rating, timestamp are read from ml-1m.inter, and likewise for the rest

load_col:
  inter: [user_id, item_id, rating, timestamp, likes_num]
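
Since `load_col` decides which columns are read, a quick count of distinct values per column in the `.inter` file can confirm whether the logged 29 items matches the raw data. This is a minimal sketch, assuming a tab-separated file whose header uses the `name:type` convention. The helper name `count_unique_per_column` and the sample rows are invented for illustration and are not taken from the douban dataset.

```python
import csv
import io


def count_unique_per_column(handle, delimiter="\t"):
    """Count distinct values in each column of a delimited file with a header row."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    # Strip the ":type" suffix, e.g. "item_id:token" -> "item_id".
    names = [column.split(":")[0] for column in header]
    seen = {name: set() for name in names}
    for row in reader:
        for name, value in zip(names, row):
            seen[name].add(value)
    return {name: len(values) for name, values in seen.items()}


# Invented sample data in the assumed format.
sample = (
    "user_id:token\titem_id:token\trating:float\ttimestamp:float\tlikes_num:float\n"
    "u1\ti1\t5\t100\t3\n"
    "u1\ti2\t4\t101\t0\n"
    "u2\ti1\t3\t102\t1\n"
    "u3\ti3\t5\t103\t0\n"
)
counts = count_unique_per_column(io.StringIO(sample))
print(counts)
```

To run it on a real file, replace the `io.StringIO(sample)` argument with an opened file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points at the dataset's `.inter` file.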

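# training settings

epochs: 500  # maximum number of training epochs
train_batch_size: 4096  # training batch size
learner: adam  # built-in PyTorch optimizer to use
learning_rate: 0.001  # learning rate
training_neg_sample_num: 0  # number of negative samples
eval_step: 1  # number of training epochs between evaluations
stopping_step: 10  # early-stopping patience: training stops if the chosen validation metric does not improve within this many evaluations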
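
The `stopping_step` comment describes patience-based early stopping. The sketch below illustrates the general idea and is not RecBole's implementation. The function name `first_stop_index` is hypothetical, and the exact comparison (stopping once the count of non-improving evaluations reaches the patience) is an assumption.

```python
def first_stop_index(scores, patience, bigger_is_better=True):
    """Return the index of the evaluation at which training would stop, or None."""
    best = None
    stale = 0  # consecutive evaluations without improvement
    for index, score in enumerate(scores):
        if best is None:
            improved = True
        elif bigger_is_better:
            improved = score > best
        else:
            improved = score < best
        if improved:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return index
    return None


# Invented validation scores: the best value appears at index 1.
stop_at = first_stop_index([0.10, 0.20, 0.20, 0.19, 0.18], patience=3)
never = first_stop_index([0.10, 0.20, 0.30], patience=3)
print(stop_at, never)
```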

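# evaluation settings

eval_setting: TO_LS,full  # order the data by time, split with leave-one-out, and evaluate with full ranking
metrics: ["Recall", "MRR", "NDCG", "Hit", "Precision"]  # evaluation metrics
valid_metric: MRR@10  # metric used as the early-stopping criterion
eval_batch_size: 4096  # evaluation batch size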
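
The `TO_LS` setting combines time ordering with a leave-one-out split. As an illustration of that idea only (not RecBole's code), the sketch below sorts each user's interactions by timestamp, holds out the last item for test and the second-to-last for validation. The function name `leave_one_out_split` is hypothetical, and so is the rule for users with fewer than three interactions: keeping everything in training is an assumption of this sketch, and the library may behave differently.

```python
from collections import defaultdict


def leave_one_out_split(interactions):
    """Split (user, item, timestamp) triples per user into train/valid/test."""
    by_user = defaultdict(list)
    for user, item, timestamp in interactions:
        by_user[user].append((timestamp, item))

    train, valid, test = {}, {}, {}
    for user, events in by_user.items():
        # Time ordering: the earliest interaction comes first.
        items = [item for _, item in sorted(events, key=lambda event: event[0])]
        if len(items) >= 3:
            train[user] = items[:-2]
            valid[user] = items[-2]
            test[user] = items[-1]
        else:
            # Assumption of this sketch: too short to hold anything out.
            train[user] = items
    return train, valid, test


# Invented interactions.
data = [("u1", "a", 1), ("u1", "b", 3), ("u1", "c", 2), ("u2", "x", 1)]
train, valid, test = leave_one_out_split(data)
print(train, valid, test)
```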

**Experimental environment (please complete the following information