RUCAIBox / RecBole

A unified, comprehensive and efficient recommendation library
https://recbole.io/
MIT License
3.42k stars 612 forks

Questions about optimization and efficiency #1855

Open SergeyPetrakov opened 1 year ago

SergeyPetrakov commented 1 year ago

Hello! Thank you for such a great tool!

I have several questions about optimization and efficiency:

  1. Do I understand correctly that a sparse matrix of user-item interactions is always built (in order to store the information efficiently)? As far as I can tell, this is done in the function _create_sparse_matrix() in data/dataset/dataset.py, which is always used by inter_matrix, and inter_matrix is in turn used by models such as recbole/model/general_recommender/lightgcn.py:

self.interaction_matrix = dataset.inter_matrix(form="coo").astype(np.float32)

  2. Do I understand correctly that multiprocessing is turned on during training (when running the command below)?
python run_recbole.py --model=GENERAL_MODEL --dataset=FOLDER_OF_MY_DATASET --config_files=test.yaml

And, if I understand correctly, the following part of run_recbole.py is responsible for multiprocessing across the cores of my CPU:

import torch.multiprocessing as mp

        mp.spawn(
            run_recboles,
            args=(
                args.model,
                args.dataset,
                config_file_list,
                args.ip,
                args.port,
                args.world_size,
                args.nproc,
                args.group_offset,
            ),
            nprocs=args.nproc,
        )

(For anyone who wants to try this: I saw in htop that all cores of my CPU were loaded.)

  3. This question is a bit more complex, to my mind: is there an easy way to integrate more efficient tooling such as ONNX / JAX / JIT into RecBole? Maybe you have tried this, or could give me some advice on how to implement it, if that is not too difficult?

  4. Is there a way to add an LR scheduler and a warm-up procedure (which is very helpful for Adam, for example)? If it is not too difficult, please provide a code example.

Hope you will answer me. Thank you!

Ethan-TZ commented 1 year ago

@SergeyPetrakov Thanks for your attention to RecBole!

  1. Yes.
  2. No. Multi-processing will only be initiated when you utilize the "--nproc" parameter.
  3. We have not tried tools such as ONNX / JAX / JIT; perhaps they will be considered in subsequent development.
  4. We have not incorporated an lr_scheduler. I think you can achieve this by substituting the optimizer of the trainer with the lr_scheduler. The warm-up function can be implemented via the resume_checkpoint function.

SergeyPetrakov commented 1 year ago

@chenyuwuxin, thank you very much! Your answers to points 1 and 2 are very helpful.

Regarding the line

self.interaction_matrix = dataset.inter_matrix(form="coo").astype(np.float32)

there is some general advice on which sparse format is better to use.

I think CSR is the better choice in this case because of the first point, don't you think so?
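
To make the COO versus CSR comparison concrete, here is a minimal SciPy sketch. The toy user and item IDs are made up for illustration, and this is not RecBole's _create_sparse_matrix() implementation; it only shows how an interaction matrix can be built in COO form and then converted to CSR, which supports row slicing and fast matrix-vector products.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical toy interactions: one (user_id, item_id) pair per position.
user_ids = np.array([0, 0, 1, 2, 2, 2])
item_ids = np.array([1, 3, 0, 1, 2, 3])
n_users, n_items = 3, 4

# COO stores (row, col, value) triplets, which makes construction cheap.
values = np.ones(len(user_ids), dtype=np.float32)
inter_coo = sp.coo_matrix((values, (user_ids, item_ids)), shape=(n_users, n_items))

# CSR compresses the row indices, so row slicing and matrix-vector
# products are efficient; COO matrices do not support slicing directly.
inter_csr = inter_coo.tocsr()

# Items that user 2 interacted with, read from a single CSR row.
items_of_user_2 = inter_csr[2].indices
print(items_of_user_2)

# Interaction count per item via a sparse matrix-vector product.
item_popularity = inter_csr.T @ np.ones(n_users, dtype=np.float32)
print(item_popularity)
```

Whether CSR actually helps depends on how the model consumes the matrix: code that only reads the row and col arrays to build an edge list gains little from the conversion, while code that slices rows or multiplies repeatedly usually does.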

Regarding point 4: OK, I will try.
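
For reference, a warm-up schedule can be sketched in plain PyTorch with torch.optim.lr_scheduler.LambdaLR. This is only an illustration under assumptions: the stand-in model, the step counts, and the linear warm-up followed by linear decay are all hypothetical choices, and nothing here is a RecBole API. Hooking it into the RecBole trainer would still require customizing the training loop so that scheduler.step() is called after each optimizer step.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import LambdaLR

# Stand-in model; in practice this would be the recommender being trained.
model = nn.Linear(8, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

warmup_steps = 5   # hypothetical number of warm-up steps
total_steps = 20   # hypothetical total number of training steps


def lr_lambda(step: int) -> float:
    """Multiplier applied to the base LR: linear warm-up, then linear decay."""
    if step < warmup_steps:
        return (step + 1) / warmup_steps
    remaining = total_steps - step
    return max(0.0, remaining / (total_steps - warmup_steps))


scheduler = LambdaLR(optimizer, lr_lambda)

lrs = []
for step in range(total_steps):
    x = torch.randn(4, 8)              # dummy batch
    loss = model(x).pow(2).mean()      # dummy loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    lrs.append(optimizer.param_groups[0]["lr"])  # LR used for this step
    scheduler.step()                   # advance the schedule once per step

print(lrs)
```

The learning rate ramps up to the base value over the first warmup_steps updates and then decays; swapping lr_lambda for another function changes the shape of the schedule without touching the loop.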

Ethan-TZ commented 1 year ago

@SergeyPetrakov Thanks for your advice! Yes, we also believe that CSR is more efficient, and we will conduct detailed analysis and testing in the subsequent update process. Thank you.

SergeyPetrakov commented 1 year ago

@chenyuwuxin, one more question. Your answer, "Multi-processing will only be initiated when you utilize the "--nproc" parameter," applies only to GPUs, am I right? Since I have no GPUs on my local machine (only a CPU), I cannot use multiprocessing on the CPU this way. I checked this and received an error when running the command:

python run_recbole.py --model=BPR --dataset=ml-100k --config_files=test.yaml --nproc=4