reczoo / BARS

BARS: Towards Open Benchmarking for Recommender Systems https://openbenchmark.github.io/BARS
Apache License 2.0

Is it normal that I cannot exactly reproduce the results? #36

Open ywangwxd opened 1 month ago

ywangwxd commented 1 month ago

I am trying to reproduce the results of DeepFM on the criteo_x4_001 dataset. I have set up my environment as follows:

python              3.6.13
cuda                11.7
torch               1.10.2
fuxictr             1.0.2
h5py                3.1.0
numpy               1.19.5
pandas              1.1.5
scipy               1.5.4

It is not exactly the same as the environment in this repo, but at least I have matched fuxictr version 1.0.2 exactly.
Then I followed the config at https://github.com/reczoo/BARS/tree/main/ranking/ctr/DeepFM/DeepFM_criteo_x4_001

The results were slightly different. I noticed that the AUC in the original experiment had a big jump from 0.809407 to 0.813303 between epochs 4 and 5. My results also had such a jump, but it came later, between epochs 8 and 9, where the AUC jumped from 0.809898 to 0.813443.
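
To put numbers on the difference, here is a small sketch that compares the two jumps using only the four AUC values quoted above. The variable and function names are made up for illustration; nothing here comes from fuxictr.

```python
# AUC values quoted in this comment: (before the jump, after the jump)
original_run = (0.809407, 0.813303)  # original experiment, epoch 4 -> 5
my_run = (0.809898, 0.813443)        # my reproduction, epoch 8 -> 9


def jump_size(run):
    """Return the AUC increase across the jump, rounded to 6 decimals."""
    before, after = run
    return round(after - before, 6)


original_jump = jump_size(original_run)
my_jump = jump_size(my_run)

# Gap between the two runs right after their respective jumps
gap_after_jump = round(my_run[1] - original_run[1], 6)

print("original jump:", original_jump)
print("my jump:", my_jump)
print("gap after jump:", gap_after_jump)
```

Both jumps are of similar size (roughly 0.0039 and 0.0035), and the AUC after the jump differs by only about 0.00014 between the two runs. The main difference is the epoch at which the jump happens.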

zhujiem commented 1 month ago

Yes, the result is quite normal. You cannot expect to get exactly the same result in a different environment. Running on GPUs is generally non-deterministic.
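
If you want to reduce (not eliminate) run-to-run variation, a common approach is to fix the random seeds and ask cuDNN for deterministic kernels. The sketch below is a generic example, not what fuxictr does internally: `seed_everything` is a hypothetical helper name and the seed value 2019 is arbitrary. The numpy and torch imports are guarded so the snippet still runs where they are not installed. Even with these settings, results can differ across GPU models, CUDA versions, and library versions.

```python
import os
import random


def seed_everything(seed=2019):
    """Seed every available RNG; return the names of the libraries seeded.

    Hypothetical helper for illustration only, not part of fuxictr.
    """
    seeded = ["random"]
    random.seed(seed)
    # Only affects Python processes started after this point
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        # Prefer deterministic cuDNN kernels and disable autotuning,
        # usually at some cost in speed
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
        seeded.append("torch")
    except ImportError:
        pass

    return seeded


if __name__ == "__main__":
    print("seeded:", seed_everything(2019))
```

Call it once at the start of the training script, before building the model and the data loaders. Some GPU operations remain non-deterministic regardless, so small AUC differences like the ones above can still show up.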