clay-lab / transductions

A PyTorch framework for creating, running, and reproducing experiments on seq2seq models.

Why are the embedding dimensions different between the old and new versions? #28

Closed · jopetty closed this issue 3 years ago

jopetty commented 3 years ago

An old model:

Seq2Seq(
  (encoder): EncoderRNN(
    (dropout): Dropout(p=0.0, inplace=False)
    (embedding): Embedding(46, 256)
    (rnn): RNN(256, 256)
  )
  (decoder): DecoderRNN(
    (embedding): Embedding(48, 256)
    (dropout): Dropout(p=0.0, inplace=False)
    (out): Linear(in_features=256, out_features=48, bias=True)
    (rnn): RNN(256, 256)
  )
)

versus a new one:

TransductionModel(
  (_encoder): SequenceEncoder(
    (_embedding): Embedding(47, 256)
    (_dropout): Dropout(p=0, inplace=False)
    (unit): RNN(256, 256)
  )
  (_decoder): SequenceDecoder(
    (_embedding): Embedding(47, 256)
    (_dropout): Dropout(p=0, inplace=False)
    (out): Linear(in_features=256, out_features=47, bias=True)
    (unit): RNN(256, 256)
  )
)

Why are the input embedding dimensions different?
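
For reference, the first argument to PyTorch's nn.Embedding is the vocabulary size (num_embeddings), not the embedding dimension, so the difference above is a difference in vocabulary size; the 256-dimensional embedding space is the same in both versions. A minimal illustration:

import torch.nn as nn

# nn.Embedding(num_embeddings, embedding_dim): the first argument counts
# vocabulary entries, the second sets the embedding dimension. Only the
# vocabulary size differs between the old (46/48) and new (47) models.
old_encoder_embedding = nn.Embedding(46, 256)
new_encoder_embedding = nn.Embedding(47, 256)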

jopetty commented 3 years ago

Okay, I think I have an explanation. There are 46 regular tokens in the dataset, with the possible addition of the special <sos> and <eos> tokens. In the old version, the source field had neither special token (46) while the target field had both (48). In the new version, both fields have <eos> but not <sos>, giving 47 each. Adding an <sos> token to both fields raises the size to 48 for both:

TransductionModel(
  (_encoder): SequenceEncoder(
    (_embedding): Embedding(48, 256)
    (_dropout): Dropout(p=0, inplace=False)
    (_unit): GRU(256, 256)
  )
  (_decoder): SequenceDecoder(
    (_embedding): Embedding(48, 256)
    (_dropout): Dropout(p=0, inplace=False)
    (_unit): GRU(256, 256)
    (_out): Linear(in_features=256, out_features=48, bias=True)
  )
)
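
A minimal sketch of that token accounting, assuming exactly 46 regular tokens as described above:

regular_tokens = 46

old_source = regular_tokens      # neither <sos> nor <eos>     -> 46
old_target = regular_tokens + 2  # both <sos> and <eos>        -> 48
new_fields = regular_tokens + 1  # <eos> only, in both fields  -> 47
with_sos   = regular_tokens + 2  # <eos> plus the added <sos>  -> 48

assert (old_source, old_target, new_fields, with_sos) == (46, 48, 47, 48)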
jopetty commented 3 years ago

The other part of the discrepancy was the transform token, which is included in one field's vocabulary but not the other.
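
If it's useful to verify which special tokens each field's vocabulary actually ended up with, something like this should work; it's only a sketch, assuming a torchtext-legacy-style Field whose built vocabulary exposes a .stoi mapping:

# Sketch only: assumes a legacy-torchtext-style Field with a built vocab.
def describe_vocab(name, field):
    """Print the field's vocab size and which special tokens it contains."""
    specials = [t for t in ("<sos>", "<eos>", "<unk>", "<pad>")
                if t in field.vocab.stoi]
    print(name, len(field.vocab), specials)

Calling this on the source and target fields before and after adding init_token="<sos>" should show the 47 -> 48 jump directly.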