fani-lab / OpeNTF

Neural machine learning methods for the Team Formation problem.

Hyperparameter Study for neural models #179

Open VaghehDashti opened 1 year ago

VaghehDashti commented 1 year ago

Hello @hosseinfani, I created this issue to post updates on the hyperparameter study for temporal team formation. I have started the runs with 2 hidden layers of sizes [64, 128] on all models [bnn, bnn_emb, tbnn, tbnn_emb, tbnn_dt2v_emb] and on the three datasets (15 runs in total). I will run the models with 3 layers of sizes [64, 128, 256] afterwards.
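For bookkeeping, the grid I am sweeping looks roughly like the sketch below; the loop is only illustrative (the `print` stands in for launching a run through OpeNTF's actual entry point):

```python
from itertools import product

# Grid for this round of the hyperparameter study (names taken from the text above).
models = ['bnn', 'bnn_emb', 'tbnn', 'tbnn_emb', 'tbnn_dt2v_emb']
datasets = ['dblp', 'imdb', 'uspt']
layer_configs = [[64, 128],       # current batch: 5 models x 3 datasets = 15 runs
                 [64, 128, 256]]  # follow-up batch with 3 hidden layers

for layers, model, dataset in product(layer_configs, models, datasets):
    # A real run would go through OpeNTF's training pipeline; print is a stand-in.
    print(f'queue: model={model}, dataset={dataset}, hidden layers={layers}')
```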

hosseinfani commented 1 year ago

@VaghehDashti This doesn't make sense at all!! Or it makes perfect sense. I have to talk to a statistician :)

Just in case, I believe you have the bnn model checkpoints saved for #bs={1,3,5,10} on imdb. Can you load them and run them on the same test set to draw these figures, like the ones for the toy dataset? I know the x-axis will have many experts.
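Something like the rough sketch below is what I have in mind; the file names and score layout are hypothetical, not the repo's actual output format:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical layout: f2.test.pred.{bs}.npy holds a saved model's predicted
# scores on the shared test split, shape (n_test_teams, n_experts).
for bs in [1, 3, 5, 10]:
    scores = np.load(f'f2.test.pred.{bs}.npy')
    experts = np.arange(scores.shape[1])              # x-axis: one point per expert
    plt.figure()
    plt.plot(experts, scores.min(axis=0), label='min')
    plt.plot(experts, scores.max(axis=0), label='max')
    plt.plot(experts, scores.mean(axis=0), label='avg')
    plt.xlabel('expert'); plt.ylabel('predicted score on test set')
    plt.title(f'bnn, imdb, #bs={bs}')
    plt.legend()
    plt.savefig(f'f2.test.min-max-avg.{bs}.png')
```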

VaghehDashti commented 1 year ago

@hosseinfani Sure. However, I cannot do it for #bs=1 because the min, max, and avg are all the same. #bs = 3:

[figure: f2 test min-max-avg plot]

#bs = 5:

[figure: f2 test min-max-avg plot]

#bs = 10:

[figure: f2 test min-max-avg plot]

#bs = 20:

[figure: f2 test min-max-avg plot]

VaghehDashti commented 1 year ago

> Please explain the reason behind the sudden drop in loss after this epoch by mentioning the code line.

@hosseinfani, I believe the drop in loss is due to this line, where we decrease the learning rate when the validation loss has not changed significantly for 10 epochs: https://github.com/fani-lab/OpeNTF/blob/148c1c2defe1176563f162ad159b2ffe0af15ecc/src/mdl/bnn.py#L111
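For context, the generic PyTorch pattern behind such a line is a `ReduceLROnPlateau` scheduler stepped on the validation loss. The snippet below is a standalone sketch, not a copy of `bnn.py` (the stand-in model, optimizer, and constant loss are assumptions): once the validation loss has not improved for `patience` epochs, the learning rate is multiplied by `factor`, which shows up as the abrupt change in the loss curve.

```python
import torch

model = torch.nn.Linear(10, 1)  # stand-in for the bnn model
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
# Multiply the lr by `factor` once the monitored value (validation loss)
# has not improved for `patience` consecutive epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=10)

valid_losses = [1.0] * 20  # a validation loss that plateaus from the start
for epoch, valid_loss in enumerate(valid_losses):
    scheduler.step(valid_loss)
    # lr stays at 0.1 for the first `patience` flat epochs, then drops to 0.01
    print(epoch, optimizer.param_groups[0]['lr'])
```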

> Can you try running with patience=2 but the same 20 epochs on imdb, or any dataset that gives you results faster?

@hosseinfani, here is the train/valid loss for patience=2 on imdb with #bs=5:

[figure: f2 train/valid loss plot]

With patience=10 on imdb with #bs=5:

[figure: f2 train/valid loss plot]

VaghehDashti commented 1 year ago

@hosseinfani, here are the results of bnn_emb on imdb, dblp, and uspt.

imdb:

|  | P_2 | P_5 | P_10 | rec_2 | rec_5 | rec_10 | ndcg_2 | ndcg_5 | ndcg_10 | map_2 | map_5 | map_10 | aucroc |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| bnn_emb #bs=1 | 0.0043 | 0.0051 | 0.0064 | 0.0028 | 0.0085 | 0.0196 | 0.0033 | 0.0059 | 0.0114 | 0.0014 | 0.0028 | 0.0044 | 0.5182 |
| bnn_emb #bs=3 | 0.0021 | 0.0026 | 0.0038 | 0.0014 | 0.0043 | 0.0105 | 0.0026 | 0.0038 | 0.0069 | 0.0014 | 0.0022 | 0.0031 | 0.5264 |
| bnn_emb #bs=5 | 0.0000 | 0.0009 | 0.0030 | 0.0000 | 0.0009 | 0.0088 | 0.0000 | 0.0006 | 0.0041 | 0.0000 | 0.0002 | 0.0011 | 0.5256 |
| bnn_emb #bs=10 | 0.0021 | 0.0017 | 0.0043 | 0.0014 | 0.0023 | 0.0125 | 0.0026 | 0.0026 | 0.0072 | 0.0014 | 0.0016 | 0.0030 | 0.5371 |

dblp:

|  | P_2 | P_5 | P_10 | rec_2 | rec_5 | rec_10 | ndcg_2 | ndcg_5 | ndcg_10 | map_2 | map_5 | map_10 | aucroc |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| bnn_emb #bs=1 | 0.0011 | 0.0013 | 0.0013 | 0.0007 | 0.0019 | 0.0037 | 0.0011 | 0.0016 | 0.0024 | 0.0005 | 0.0008 | 0.0010 | 0.6681 |
| bnn_emb #bs=3 | 0.0020 | 0.0019 | 0.0018 | 0.0012 | 0.0027 | 0.0054 | 0.0021 | 0.0025 | 0.0037 | 0.0009 | 0.0013 | 0.0017 | 0.6656 |
| bnn_emb #bs=5 | 0.0014 | 0.0011 | 0.0011 | 0.0008 | 0.0016 | 0.0032 | 0.0013 | 0.0015 | 0.0022 | 0.0006 | 0.0008 | 0.0010 | 0.6078 |

uspt:

|  | P_2 | P_5 | P_10 | rec_2 | rec_5 | rec_10 | ndcg_2 | ndcg_5 | ndcg_10 | map_2 | map_5 | map_10 | aucroc |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| bnn_emb #bs=1 | 0.0011 | 0.0013 | 0.0013 | 0.0007 | 0.0019 | 0.0037 | 0.0011 | 0.0016 | 0.0024 | 0.0005 | 0.0008 | 0.0010 | 0.6681 |
| bnn_emb #bs=3 | 0.0054 | 0.0045 | 0.0039 | 0.0028 | 0.0056 | 0.0094 | 0.0054 | 0.0056 | 0.0074 | 0.0021 | 0.0029 | 0.0034 | 0.6844 |
| bnn_emb #bs=5 | 0.0039 | 0.0033 | 0.0028 | 0.0021 | 0.0042 | 0.0071 | 0.0039 | 0.0041 | 0.0055 | 0.0016 | 0.0022 | 0.0026 | 0.6479 |

We can see that for imdb, increasing #bs leads to lower performance, and the best performance is with #bs=1. For dblp and uspt, #bs=3 has the best performance, but the difference is not significant in my opinion. The model is currently running with #bs=10 on dblp and uspt; I will update here when those results are ready. My guess is that the model will also have lower performance with #bs=10 on dblp and uspt. Given that the model behaves differently on imdb than on dblp/uspt, how should I proceed to the next hyperparameter (embedding size)? Should I use #bs=1 for all datasets, since the performance with #bs=3 is not significantly different on dblp/uspt?
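For the significance question, one option is a paired test over per-fold metric values instead of eyeballing the tables. A minimal sketch with SciPy (the arrays below are placeholders, not numbers from the runs above):

```python
from scipy import stats

# Placeholder per-fold ndcg_10 values for two settings; swap in the real
# per-fold (or per-team) scores dumped by the evaluation step.
ndcg10_bs1 = [0.0110, 0.0118, 0.0105, 0.0121, 0.0109]
ndcg10_bs3 = [0.0065, 0.0072, 0.0070, 0.0061, 0.0068]

t, p = stats.ttest_rel(ndcg10_bs1, ndcg10_bs3)  # paired t-test across folds
print(f't={t:.3f}, p={p:.4f}')                  # small p: difference unlikely due to chance
```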

hosseinfani commented 1 year ago

@VaghehDashti Yes, go with #bs=1