Closed shahhaard47 closed 4 years ago
Also, I probably missed something in the paper, but why don't you include bias in any Linear layers? You do:
nn.Linear(..., bias=False)
If you read the paper closely, you'll see that we usually only need a weight matrix, not a linear layer per se. A linear layer acts as a plain weight matrix if you set bias=False. Alternatively, you could use nn.Parameter() to create the weight matrix so that it gets registered in the model's list of parameters. I think using nn.Linear reduces some boilerplate code.
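To make the comparison concrete, here is a minimal sketch of the two options. The class name RawWeight and the tensor sizes are made up for illustration and are not taken from the repository; the point is only that a bias-free nn.Linear and a hand-registered nn.Parameter compute the same matrix product.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for this illustration.
in_features, out_features = 6, 1

# Option 1: a linear layer without bias is just a learnable weight matrix.
linear = nn.Linear(in_features, out_features, bias=False)


# Option 2: register the weight matrix by hand (more boilerplate).
class RawWeight(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # nn.Parameter registers the tensor in the module's parameters.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, x):
        # Same computation nn.Linear performs when bias=False: x W^T
        return x @ self.weight.t()


raw = RawWeight(in_features, out_features)

# Copy the weights so both modules hold the same matrix.
with torch.no_grad():
    raw.weight.copy_(linear.weight)

x = torch.randn(4, 10, in_features)  # (batch, sequence, features)
same_output = torch.allclose(linear(x), raw(x), atol=1e-6)

# With bias=False, the only registered parameter is the weight.
linear_param_names = [name for name, _ in linear.named_parameters()]
```

Both modules expose a single parameter named weight, so the optimizer sees the same thing either way; nn.Linear just saves the manual initialization and the forward method.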
This is at the end of the BiDAF.forward
I don't use softmax in the model because I use nn.CrossEntropyLoss to calculate the loss, which applies log-softmax to the raw scores internally.
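A small sketch of why the explicit softmax can be left out, assuming the model returns raw scores (logits). The shapes and target indices below are invented for the example; nn.CrossEntropyLoss gives the same value as applying log-softmax followed by the negative log-likelihood loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical raw scores: (batch, num_positions). Not from the repository.
logits = torch.randn(3, 5)
targets = torch.tensor([1, 0, 4])  # made-up gold indices

# The loss consumes logits directly, with no softmax in the model.
loss_builtin = nn.CrossEntropyLoss()(logits, targets)

# Equivalent manual computation: log-softmax, then negative log-likelihood.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

losses_match = torch.allclose(loss_builtin, loss_manual)

# If probabilities are needed at inference time, apply softmax explicitly.
probs = F.softmax(logits, dim=1)
```

Adding a softmax inside forward and then passing the result to nn.CrossEntropyLoss would normalize the scores twice, which is why the model output is left as logits.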
Thank you so much for the response!