glicerico closed this issue 3 years ago
Hi @glicerico,
Due to limited compute (and time), I have trained it on the Switchboard dataset for only 5 epochs, reaching an accuracy of 0.62. Also, the data I used has 53 classes, whereas the original paper has 43.
I have used this model as a baseline for my research (we have our own data), but if you're interested in training it on the Switchboard data, I have prepared a minimal Kaggle kernel: CASA-Dialogue-Act-Classifier. Feel free to reach out to me if you face any problems during training.
PS: Training is extremely slow because it does not parallelize well (the model uses the context of the dialogue history): it takes >2 h/epoch on my local machine and ~1 h/epoch on Kaggle compute. More compute (e.g. multiple GPUs) may not speed up training.
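As a rough back-of-the-envelope sketch, the total cost of a 5-epoch run can be estimated from the per-epoch figures above. The function name is hypothetical, and the calculation assumes the time per epoch stays constant, which may not hold in practice:

```python
def total_training_hours(hours_per_epoch: float, epochs: int) -> float:
    """Estimate wall-clock hours for a run, assuming a constant time per epoch."""
    return hours_per_epoch * epochs


EPOCHS = 5  # number of epochs mentioned in this comment

# Per-epoch figures quoted above: >2 h locally (so this is a lower bound)
# and roughly 1 h on Kaggle compute (an approximation).
local_hours = total_training_hours(2.0, EPOCHS)
kaggle_hours = total_training_hours(1.0, EPOCHS)

print(f"local: at least {local_hours:.0f} h, Kaggle: about {kaggle_hours:.0f} h")
```

So even a short 5-epoch run is on the order of half a day locally under these assumptions.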
Hi @macabdul9, thanks for the Kaggle kernel to run the model, it has been useful! And thanks for sharing the scores from your experiments. Running your model as is, I get a similar score (0.642), but I think there may be some problems, which I describe in new issues I created.
Just as a comment, running on GPUs on Colab is around 70 times faster for me than running on CPUs (either on my own computer or on Colab).
Thanks for the implementation! What scores did you achieve on the SwDA dataset? Do you reach the original paper's results?