The current implementation requires specifying the output dimension of the transformer model, because memory for the batch embeddings is pre-allocated up front. This can be fixed by not pre-allocating that memory and instead using the embeddings directly as they are produced. With this approach the output dimension no longer needs to be specified — it is redundant information that is prone to errors.
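A minimal sketch of the two approaches, using NumPy as a stand-in for a real tensor library (the `embed_batch` helper and the toy model are hypothetical, not part of the original code; the same pattern applies with `torch.stack` or `tf.stack`):

```python
import numpy as np

def embed_batch(batch, model):
    """Hypothetical helper: collect per-item embeddings directly,
    so the model's output dimension never has to be declared."""
    # Pre-allocating would require knowing the output dimension up front:
    #   out = np.empty((len(batch), output_dim))  # redundant, error-prone
    # Instead, gather the embeddings as they are produced...
    embeddings = [model(item) for item in batch]
    # ...and stack them; the output dimension is inferred from the data.
    return np.stack(embeddings)

# Toy "model" producing 4-dimensional embeddings.
toy_model = lambda item: np.full(4, float(item))

result = embed_batch([1, 2, 3], toy_model)
print(result.shape)  # (3, 4) -- no output_dim argument was needed
```

The shape of the stacked result is inferred entirely from the model's outputs, so a mismatch between a declared dimension and the model's actual output can no longer occur.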
NOTE: PyTorch/TensorFlow allow Tensor + Constant arithmetic, i.e. you can add a Python primitive constant (`int` or `float`) to a tensor and vice versa; the result is always a Tensor.
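The promotion rule the note describes can be illustrated with NumPy, which follows the same convention as PyTorch/TensorFlow tensors (a sketch for illustration, not taken from the original code):

```python
import numpy as np

# array (tensor) + Python scalar -> array (tensor), in either order.
emb = np.ones(3)

total = 0            # plain Python int as the initial accumulator
total = total + emb  # int + array promotes to an array
total = total + 2.5  # array + float stays an array

print(type(total).__name__)  # ndarray
print(total)                 # [3.5 3.5 3.5]
```

This is what makes a plain `0` usable as the initial value of an accumulator that is then summed with tensors.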