wxl1999 / UniCRS

[KDD22] Official PyTorch implementation for "Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning".

How to reproduce the performance on the ReDial dataset? #8

Open dandyxxxx opened 1 year ago

dandyxxxx commented 1 year ago

I trained according to the code provided on GitHub, but since the dataset link you provided cannot be opened, I used the DBpedia mappingbased-objects_lang=en.ttl dump (2021-12 release) instead. The final results of my training are as follows:

conv: 'test/dist@2': 0.310709750246931, 'test/dist@3': 0.49851841399746016, 'test/dist@4': 0.6383519119514605

rec: 'test/recall@1': 0.029324894514767934, 'test/recall@10': 0.16729957805907172, 'test/recall@50': 0.37953586497890296

(1) These results differ greatly from those reported in the paper. Could you give me some guidance? I would like to reproduce results close to yours. Thank you very much.

(2) According to your paper, do I need to set --n_prefix_conv 50 in train_conv.py and --use_resp in train_rec.py?
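For example, would the recommendation-task command look something like the sketch below? This is only my guess: apart from --use_resp, I am assuming the train_rec.py flags mirror those of train_conv.py, and the paths and hyperparameters are placeholders, not settings confirmed by the repo.

accelerate launch train_rec.py \
           --dataset redial \
           --tokenizer ~/model/DialoGPT-small \
           --model ~/model/DialoGPT-small \
           --text_tokenizer ~/model/roberta-base \
           --text_encoder ~/model/roberta-base \
           --prompt_encoder ${prompt_encoder_dir}/final \
           --use_resp \
           --learning_rate 1e-4 \
           --output_dir ${output_dir}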

linshan-79 commented 11 months ago

I encountered the same problem. After the author provided the missing files, I trained following the guidance, but my metric results are also similar to yours. Here are the details of the ReDial dataset metrics:

conv:

'test/dist@2': 0.26710879074361504, 'test/dist@3': 0.4199238041484408, 'test/dist@4': 0.5233526174686045,

rec:

'test/recall@1': 0.035443037974683546, 'test/recall@10': 0.1729957805907173, 'test/recall@50': 0.3744725738396624, 

Here is my config for the conversational task:

accelerate launch train_conv.py \
           --dataset redial \
           --tokenizer ~/model/DialoGPT-small \
           --model ~/model/DialoGPT-small \
           --text_tokenizer ~/model/roberta-base \
           --text_encoder ~/model/roberta-base \
           --n_prefix_conv 50 \
           --prompt_encoder ${prompt_encoder_dir}/final \
           --num_train_epochs 10 \
           --gradient_accumulation_steps 1 \
           --ignore_pad_token_for_loss \
           --per_device_train_batch_size 8 \
           --per_device_eval_batch_size 16 \
           --num_warmup_steps 6345 \
           --context_max_length 200 \
           --resp_max_length 183 \
           --prompt_max_length 200 \
           --entity_max_length 32 \
           --learning_rate 1e-4 \
           --output_dir ${output_dir} \
           --log_all

(1) @dandyxxxx, you can see that I set n_prefix_conv=50, but the results still don't match the paper. Could you share your configuration details? Maybe we can work together to solve the problem. Thank you very much!

(2) @wxl1999, thanks for your work! As a beginner, I learned a lot from your paper and code. I suspect the issue might be that the KG module is not correctly capturing relations from the dataset. Could you provide some guidance? Thank you very much!
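To narrow this down, I plan to run a few rough sanity checks on the KG inputs, sketched below. The processed file names (entity2id.json, relation2id.json) are my guess at the repo's data layout, so adjust them to your checkout; the DBpedia .ttl dump is N-Triples-style, one triple per line.

# total number of triples in the raw dump
wc -l mappingbased-objects_lang=en.ttl

# spot-check that a movie mentioned in ReDial is present in the dump
grep -m 5 'resource/Toy_Story' mappingbased-objects_lang=en.ttl

# count entities and relations in the processed files (file names assumed)
python -c "import json; print(len(json.load(open('data/redial/entity2id.json'))))"
python -c "import json; print(len(json.load(open('data/redial/relation2id.json'))))"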

wxl1999 commented 11 months ago

Sorry for the late reply!

I hope this helps!

linshan-79 commented 11 months ago

Thanks for your reply! It helped me a lot.

careerists commented 10 months ago

@linshan-79 I have the same problem. Did you finally solve it? Thank you so much.