Closed karan121bhukar closed 2 years ago
Hi, you may try model v7, learning rate = 4e-6 or 6e-6, batch size = 2, max utterance length = 50, max sequence length = 128, and 2/3/4 for the number of epochs. You can also contact me by email for a prompt response.
Dear authors, I have tried to reproduce the results reported in your paper (Structural Characterization for Dialogue Disentanglement), following the hyperparameters you provided in GitHub issue [LINK], and ran into the following issues:

There is no parameter called `max utterance length = 50`; I assumed you were referring to the argument `max_previous_utterance = 50` instead. I tried three learning rates {4e-6, 5e-6, 6e-6} with model v7, batch size = 2, max seq length = 128, and `max_previous_utterance = 50`, trained for 9 epochs, and evaluated the model on the test set after each epoch. The best results I was able to reproduce are:
| LR   | VI    | ARI   | P     | R     | F    |
|------|-------|-------|-------|-------|------|
| 4e-6 | 92.63 | 65.86 | 42.69 | 46.29 | 44.4 |
| 5e-6 | 93.43 | 71.25 | 43.83 | 47.31 | 45.5 |
| 6e-6 | 93.66 | 72.5  | 44.18 | 48.59 | 46.3 |
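As a sanity check on the table above, the F column is consistent with F being the standard F1 score, i.e. the harmonic mean of precision and recall (this is an assumption about how the metric was computed, not stated in the thread):

```python
def f1(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Rows from the table above: (P, R) -> reported F
for p, r, reported_f in [(42.69, 46.29, 44.4),
                         (43.83, 47.31, 45.5),
                         (44.18, 48.59, 46.3)]:
    print(round(f1(p, r), 1), reported_f)
```

Each computed value matches the reported F to one decimal place, so the P/R/F columns are at least internally consistent.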
As training progresses, I observed in `eval_results.txt` that the training loss keeps decreasing while the loss on the test set keeps increasing:
| Epoch | Train loss           | Test loss           |
|-------|----------------------|---------------------|
| 0     | 0.9122764717761299   | 0.8731647032839944  |
| 1     | 0.718476283301304    | 0.8133490042780177  |
| 2     | 0.6354380424937421   | 0.9265543161186303  |
| 3     | 0.5527335032271061   | 1.2161007103780916  |
| 4     | 0.4688017617632348   | 1.5410500765521857  |
| 5     | 0.38050194026046946  | 1.8817651102929587  |
| 6     | 0.300836969023385    | 2.3393361151842615  |
| 7     | 0.24707656121400562  | 2.4975450912934187  |
| 8     | 0.19891027118409488  | 2.6616663114180663  |
| 9     | 0.17213401655327798  | 2.891134002466313   |
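The pattern above (train loss falling, test loss rising from epoch 2 onward) looks like overfitting, so a natural workaround is to select the checkpoint with the lowest test loss rather than the final epoch. A minimal sketch over the numbers reported above:

```python
# Test losses per epoch, copied from the table above.
test_loss = [0.8732, 0.8133, 0.9266, 1.2161, 1.5411,
             1.8818, 2.3393, 2.4975, 2.6617, 2.8911]

# Pick the epoch whose checkpoint has the lowest held-out loss.
best_epoch = min(range(len(test_loss)), key=lambda e: test_loss[e])
print(best_epoch)  # → 1
```

This agrees with the authors' suggestion to train for only 2–4 epochs: by epoch 1 the model already achieves its best held-out loss.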
Please address these issues, and let me know if I'm doing something wrong in reproducing the results. I would be very grateful for any help.
Hi authors, I have been trying to reproduce the results but have not been able to. Could you provide the exact hyperparameters used for training and the number of epochs you trained for to produce the reported results?