jopetty closed this issue 3 years ago
Okay, I think(?) I have a thought on this. I think there are 46 regular tokens in the dataset, with the possible addition of `<sos>` and `<eos>` tokens. I think in the old version, the source field had neither while the target field had both. Here, I think they both have the `<eos>` but not the `<sos>`. Adding an `<sos>` token to both raises the vocabulary size to 48 for both:
```
TransductionModel(
  (_encoder): SequenceEncoder(
    (_embedding): Embedding(48, 256)
    (_dropout): Dropout(p=0, inplace=False)
    (_unit): GRU(256, 256)
  )
  (_decoder): SequenceDecoder(
    (_embedding): Embedding(48, 256)
    (_dropout): Dropout(p=0, inplace=False)
    (_unit): GRU(256, 256)
    (_out): Linear(in_features=256, out_features=48, bias=True)
  )
)
```
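The token arithmetic above can be sketched in a few lines of plain Python. This is just an illustrative toy (the `tok_{i}` placeholder names and the list-based vocabulary are my assumptions, not the project's actual vocab code): 46 regular tokens plus the two special tokens give 48, matching the `Embedding(48, 256)` input size in the printout.

```python
# Hypothetical sketch of the vocabulary size calculation discussed above.
# 46 regular tokens in the dataset, plus <sos> and <eos>, yields 48.
NUM_REGULAR_TOKENS = 46
special_tokens = ["<sos>", "<eos>"]

# Toy vocabulary: placeholder names for the 46 regular tokens,
# with the special tokens appended at the end.
vocab = [f"tok_{i}" for i in range(NUM_REGULAR_TOKENS)] + special_tokens

print(len(vocab))  # 48 -- matches Embedding(48, 256) above
```

If only `<eos>` were added (as I believe both fields had before this change), the size would be 47 instead, which is one way to sanity-check which special tokens a field actually contains.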
The other part was the `transform` token, which is included in one field but not the other.
An old model:
versus a new one:
Why are the input embedding dimensions different?