graykode / nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers

Why not set batch_first = True in the LSTM model (PyTorch)? #13

Closed: jotline closed this issue 5 years ago

graykode commented 5 years ago

@jotline Hello. Could you point me to the specific lines of code? You mean this file: https://github.com/graykode/nlp-tutorial/blob/master/3-2.TextLSTM/TextLSTM-Torch.py, right? :D

jotline commented 5 years ago

@graykode Hello. Thanks for your reply. I am just curious why none of the LSTM models you use set the batch_first parameter, and instead use the permute and transpose functions to swap the tensor's dimensions.

Is it for performance reasons, or just habit? (After all, using the batch_first parameter would make the code shorter and more readable.)
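For reference, a minimal sketch of the two equivalent layouts in PyTorch (the tensor names and sizes here are illustrative, not taken from the tutorial code):

```python
import torch
import torch.nn as nn

batch_size, seq_len, emb_dim, n_hidden = 4, 7, 8, 16
inputs = torch.randn(batch_size, seq_len, emb_dim)  # [batch, seq, feature]

# Tutorial style: nn.LSTM defaults to sequence-first input, so the
# batch dimension is swapped to position 1 before the forward pass.
lstm_seq_first = nn.LSTM(input_size=emb_dim, hidden_size=n_hidden)
out1, _ = lstm_seq_first(inputs.transpose(0, 1))  # out1: [seq, batch, hidden]

# batch_first=True style: the batch-first input can be passed as-is.
lstm_batch_first = nn.LSTM(input_size=emb_dim, hidden_size=n_hidden, batch_first=True)
out2, _ = lstm_batch_first(inputs)                # out2: [batch, seq, hidden]
```

Either way the LSTM computes the same thing; batch_first only changes the expected layout of the input and output tensors, while the hidden and cell states stay [num_layers, batch, hidden] regardless.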

graykode commented 5 years ago

@jotline Hello again. That's how I wrote it when I first implemented it, so I left it that way; it's not a deliberate habit. Thanks for raising the issue. It may help others who read it.