hello-dx / POWERec

PyTorch implementation of "Prompt-based and Weak-modality Enhanced Multimodal Recommendation"

About the experiment results in your paper #1

Open Jinfeng-Xu opened 10 months ago

Jinfeng-Xu commented 10 months ago

I used the MMRec framework and the best parameter settings to reproduce your paper, but the best result for the Baby dataset only reaches 0.0498 on recall@10, while your paper reports 0.0545 for this metric. Can you give more detail about the parameter settings?

Here are the final results for POWERec that I reproduced recently.

█████████████ BEST ████████████████
Parameters: ['dropout', 'reg_weight', 'neg_weight', 'prompt_num', 'seed'] = (0.2, 1, 0.01, 1, 999)

Valid:
recall@5: 0.0297 recall@10: 0.0482 recall@20: 0.0790 recall@50: 0.1399
ndcg@5: 0.0195 ndcg@10: 0.0255 ndcg@20: 0.0333 ndcg@50: 0.0455
precision@5: 0.0063 precision@10: 0.0051 precision@20: 0.0042 precision@50: 0.0030
map@5: 0.0159 map@10: 0.0183 map@20: 0.0204 map@50: 0.0223

Test:
recall@5: 0.0294 recall@10: 0.0498 recall@20: 0.0795 recall@50: 0.1404
ndcg@5: 0.0189 ndcg@10: 0.0256 ndcg@20: 0.0332 ndcg@50: 0.0455
precision@5: 0.0065 precision@10: 0.0055 precision@20: 0.0044 precision@50: 0.0031
map@5: 0.0149 map@10: 0.0176 map@20: 0.0196 map@50: 0.0215
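For readers comparing these numbers, the metrics above follow the standard top-K definitions. A minimal sketch of recall@K and ndcg@K (with binary relevance) for a single user is below; the function and variable names are illustrative, not taken from the MMRec code:

```python
import math

def recall_at_k(ranked_items, relevant, k):
    # Fraction of the user's relevant items that appear in the top-K.
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked_items, relevant, k):
    # DCG with binary relevance, normalized by the ideal DCG
    # (all relevant items ranked first).
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked_items[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal

# Example: 2 relevant items, one of them in the top-2 ranking.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(recall_at_k(ranked, relevant, 2))  # → 0.5
```

The reported averages are these per-user values averaged over all test users.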

I hope to get your detailed parameter settings that reach the highest results reported in your paper.

hello-dx commented 10 months ago

The following are my saved logs.

Parameters: ['n_layers', 'reg_weight', 'prompt_num', 'dropout', 'neg_weight', 'seed'] = (4, 1, 3, 0.2, 0.01, 999)

Valid:
recall@5: 0.0334 recall@10: 0.0539 recall@20: 0.0832 recall@50: 0.1425
ndcg@5: 0.0216 ndcg@10: 0.0283 ndcg@20: 0.0357 ndcg@50: 0.0476

Test:
recall@5: 0.0341 recall@10: 0.0545 recall@20: 0.0823 recall@50: 0.1445
ndcg@5: 0.0232 ndcg@10: 0.0299 ndcg@20: 0.0370 ndcg@50: 0.0496

Different environments could lead to different best hyper-parameters. You can try to search the hyper-parameters via the yaml file. An example is as follows:

"""
embedding_size: 64
reg_weight: [0.1, 1]
neg_weight: [0.001, 0.01, 0.1, 1]
dropout: [0.2, 0.1]
prompt_num: [1, 2, 3, 4, 5]
hyper_parameters: ["reg_weight", "prompt_num", "dropout", "neg_weight"]
"""
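What a yaml file like this drives is an exhaustive grid search over the listed hyper-parameters. A minimal sketch of that loop is below; `evaluate` is a hypothetical placeholder standing in for a full training run that returns validation recall@10, not part of the released code:

```python
from itertools import product

# Hyper-parameter grid mirroring the yaml example above.
grid = {
    "reg_weight": [0.1, 1],
    "prompt_num": [1, 2, 3, 4, 5],
    "dropout": [0.2, 0.1],
    "neg_weight": [0.001, 0.01, 0.1, 1],
}

def evaluate(params):
    # Hypothetical placeholder: in a real search this trains the model
    # with `params` and returns validation recall@10. Here we return a
    # dummy score so the loop is runnable on its own.
    return -sum(abs(v) for v in params.values())

best_score, best_params = float("-inf"), None
for values in product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    score = evaluate(params)
    if score > best_score:
        best_score, best_params = score, params

print("BEST Parameters:", best_params)
```

With 2 × 5 × 2 × 4 = 80 combinations, each requiring a full training run, the search is expensive, which is why narrowing the grid per dataset matters.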

I will check the released code and run it as soon as possible.

Jinfeng-Xu commented 10 months ago

Many thanks for your explanation.

I will reproduce it soon. I noticed the prompt_num settings comparison for different datasets in your paper. Interesting and reasonable. Thanks again for your explanation.

Jinfeng-Xu commented 10 months ago

█████████████ BEST ████████████████
Parameters: ['dropout', 'reg_weight', 'neg_weight', 'prompt_num', 'seed'] = (0.2, 1, 0.01, 5, 999)

Valid:
recall@5: 0.0293 recall@10: 0.0484 recall@20: 0.0786 recall@50: 0.1381
ndcg@5: 0.0186 ndcg@10: 0.0248 ndcg@20: 0.0325 ndcg@50: 0.0444
precision@5: 0.0061 precision@10: 0.0051 precision@20: 0.0041 precision@50: 0.0029
map@5: 0.0149 map@10: 0.0174 map@20: 0.0195 map@50: 0.0213

Test:
recall@5: 0.0297 recall@10: 0.0508 recall@20: 0.0805 recall@50: 0.1408
ndcg@5: 0.0195 ndcg@10: 0.0265 ndcg@20: 0.0341 ndcg@50: 0.0463
precision@5: 0.0066 precision@10: 0.0056 precision@20: 0.0045 precision@50: 0.0031
map@5: 0.0157 map@10: 0.0185 map@20: 0.0205 map@50: 0.0224

Here are the results for the yaml file you provided. Maybe the model isn't very stable? Or maybe the prompts are not always well learned in some cases?

hello-dx commented 10 months ago

I have updated the code. I ran it in a new environment and the best results are as follows.

23 Oct 08:59 INFO Parameters: ['reg_weight', 'prompt_num', 'dropout', 'neg_weight', 'seed'] = (0.1, 3, 0.2, 1, 999)

Valid:
recall@5: 0.0325 recall@10: 0.0527 recall@20: 0.0849 recall@50: 0.1483
ndcg@5: 0.0214 ndcg@10: 0.0280 ndcg@20: 0.0361 ndcg@50: 0.0487

Test:
recall@5: 0.0335 recall@10: 0.0545 recall@20: 0.0834 recall@50: 0.1471
ndcg@5: 0.0223 ndcg@10: 0.0292 ndcg@20: 0.0366 ndcg@50: 0.0496