Config yaml file and Eval Batch Size

shahjaidev commented 2 years ago

Hello!

I'm training models on my custom data and have created a config file (picture attached above) which I use as:

python run_recbole.py --model=BPR --dataset=coclick_1d --config_file_list=['/home/core/shahjaidev/DirectAU/jaidev_configs/coclick_1d_config.yaml']

However, I notice that many of the config parameters I set in the yaml are not actually applied during training. For instance, min_user_inter_num remains set to 5 in the config printed in the terminal, and the split remains [0.8, 0.1, 0.1] even though I set it differently, as in the screenshot.

Am I not formatting the config yaml correctly? Could you please suggest the fix?

It would be quite helpful if you could share a couple of entire config files used to train different models. (I'm sure many others could also benefit from this)

On a related note, I set the eval_batch_size to 10000 and despite this, evaluation is very slow (over 3 hours). I'm curious why evaluation is this slow and what could be done to speed it up. What is the denominator in the progress bar for eval, is it the number of users (seems unlikely)?

Thanks:) Great work with this library!

Ethan-TZ commented 2 years ago

@shahjaidev Hello, thanks for your attention to RecBole!

For the first question, the current version of RecBole no longer uses config fields such as max_user_inter_num or min_user_inter_num; it uses user_inter_num_interval and item_inter_num_interval instead. An example config file for the MovieLens dataset is as follows:

# dataset config
field_separator: "\t"
seq_separator: " "
USER_ID_FIELD: user_id
ITEM_ID_FIELD: item_id
RATING_FIELD: rating
NEG_PREFIX: neg_
LABEL_FIELD: label
load_col:
    inter: [user_id, item_id, rating]
val_interval:
    rating: "[3,inf)"    
unused_col: 
    inter: [rating]
user_inter_num_interval: "[10,inf)"
item_inter_num_interval: "[10,inf)"

# training and evaluation
epochs: 500
train_batch_size: 4096
valid_metric: MRR@10

# model
embedding_size: 64
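
To illustrate what an interval string such as "[10,inf)" expresses, here is a minimal Python sketch that parses one and keeps only the users whose interaction count falls inside it. The helpers parse_interval, in_interval and filter_users are hypothetical names used for illustration; this is not RecBole's filtering code and only mirrors the interval notation in the config above.

```python
from collections import Counter


def parse_interval(text):
    """Parse a string like "[10,inf)" into (low, high, low_closed, high_closed)."""
    text = text.strip()
    low_closed = text[0] == "["     # "[" includes the lower bound, "(" excludes it
    high_closed = text[-1] == "]"   # "]" includes the upper bound, ")" excludes it
    low_str, high_str = text[1:-1].split(",")
    return float(low_str), float(high_str), low_closed, high_closed


def in_interval(value, interval):
    """Return True if value lies inside the parsed interval."""
    low, high, low_closed, high_closed = interval
    above = value >= low if low_closed else value > low
    below = value <= high if high_closed else value < high
    return above and below


def filter_users(interactions, interval_text):
    """Keep (user, item) pairs whose user has an interaction count in the interval."""
    interval = parse_interval(interval_text)
    counts = Counter(user for user, _ in interactions)
    return [(u, i) for u, i in interactions if in_interval(counts[u], interval)]


if __name__ == "__main__":
    toy = [("u1", "a"), ("u1", "b"), ("u1", "c"), ("u2", "a")]
    # u2 has a single interaction, so it is dropped by "[2,inf)".
    print(filter_users(toy, "[2,inf)"))
```

On the toy data, only the three interactions of u1 survive the "[2,inf)" filter.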

For the second question, the slowdown may be due to a large number of items, which makes evaluation in full mode slow. You can try pop100 mode to speed up the evaluation, i.e., just set:

eval_args:
    mode: pop100
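
As a rough illustration of why a sampled mode is cheaper than full: full ranks each positive item against every item in the catalog, while a sampled mode ranks it against a fixed number of negatives. The sketch below draws 100 negatives with probability proportional to item popularity, which is what the name pop100 suggests. The function sample_popular_negatives is hypothetical and is not RecBole's sampler.

```python
import random
from collections import Counter


def sample_popular_negatives(interactions, user, num_negatives=100, seed=0):
    """Draw negatives for `user`, weighted by item popularity (with replacement)."""
    rng = random.Random(seed)
    # Popularity = number of interactions each item received.
    popularity = Counter(item for _, item in interactions)
    # Items the user already interacted with cannot be negatives.
    seen = {item for u, item in interactions if u == user}
    candidates = [item for item in popularity if item not in seen]
    if not candidates:
        raise ValueError("user has interacted with every item")
    weights = [popularity[item] for item in candidates]
    return rng.choices(candidates, weights=weights, k=num_negatives)


if __name__ == "__main__":
    toy = [("u1", 0)] + [("u2", i) for i in range(1, 6)] + [("u3", 1)]
    negatives = sample_popular_negatives(toy, "u1")
    print(len(negatives), set(negatives))
```

Under this scheme each positive is scored against 100 sampled candidates rather than the whole item set, so the cost per user no longer grows with the catalog size.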

shahjaidev commented 2 years ago

Thanks for the answer!

How about the split in eval_args? I don't know why the split I set in the config file doesn't take effect.

Ethan-TZ commented 2 years ago

@shahjaidev The split takes the form of:

eval_args:
  split: {'RS':[0.95, 0.01, 0.04]}
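
To make the ratios concrete, the sketch below shuffles a list of interactions and cuts it into train, validation and test parts by a ratio list such as [0.95, 0.01, 0.04]. It assumes RS denotes a random ratio-based split; ratio_split is a hypothetical helper, not RecBole's splitting code, and the library's rounding at the boundaries may differ.

```python
import random


def ratio_split(records, ratios, seed=0):
    """Randomly split records into (train, valid, test) by the given ratios."""
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three numbers that sum to 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    train_end = min(n, round(n * ratios[0]))
    valid_end = min(n, train_end + round(n * ratios[1]))
    # Whatever remains after train and valid goes to test.
    return shuffled[:train_end], shuffled[train_end:valid_end], shuffled[valid_end:]


if __name__ == "__main__":
    train, valid, test = ratio_split(range(1000), [0.95, 0.01, 0.04])
    print(len(train), len(valid), len(test))
```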

Note that metrics is not a key of eval_args.

shahjaidev commented 2 years ago

Thanks for the reply! Actually, even after setting eval_args: split: {'RS':[0.95, 0.01, 0.04]} and train_batch_size: 500 in the yaml file, these are not reflected in the model training and eval.

This is the command: python run_recbole.py --model=BPR --dataset=coclick_1d --config_file_list=['/home/core/shahjaidev/DirectAU/jaidev_configs/coclick_1d_config.yaml']

Terminal Output:

shahjaidev commented 2 years ago

Also, for my understanding, what is the denominator in the evaluation progress bar? (For reference, the number of users in the data is ~2 million.)

Note: I'm using pop100 as the eval mode, and despite this, evaluation is really slow.

Ethan-TZ commented 2 years ago

@shahjaidev Hello, there is an error in your run command: the command-line parameter is config_files, not config_file_list. The correct command is therefore: python run_recbole.py --model=BPR --dataset=coclick_1d --config_files=/home/core/shahjaidev/DirectAU/jaidev_configs/coclick_1d_config.yaml
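
One plausible reason the original command was accepted without an error: a script that parses its flags with argparse's parse_known_args simply sets aside options it does not recognize. The sketch below assumes that style of parsing (it is not a copy of run_recbole.py, and cfg.yaml is a placeholder path) and shows that passing --config_file_list leaves config_files unset, so no config file would be loaded.

```python
import argparse


def parse_cli(argv):
    """Parse known flags and return (args, unrecognized_arguments)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, default="BPR")
    parser.add_argument("--dataset", type=str, default=None)
    parser.add_argument("--config_files", type=str, default=None)
    # parse_known_args does not raise on unknown flags; it returns them separately.
    return parser.parse_known_args(argv)


if __name__ == "__main__":
    wrong = ["--model=BPR", "--dataset=coclick_1d", "--config_file_list=['cfg.yaml']"]
    args, unknown = parse_cli(wrong)
    print(args.config_files, unknown)

    right = ["--model=BPR", "--dataset=coclick_1d", "--config_files=cfg.yaml"]
    args, unknown = parse_cli(right)
    print(args.config_files, unknown)
```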

shahjaidev commented 2 years ago

moved to new issue

RUCAIBox / RecBole

Config yaml file and Eval Batch Size #1392