Hi,

I'd like to feed short samples of various lengths to the model, and I would like to put them in a single batch. Hence my question: does the model support padding? If yes, what is the padding token?
No, you'd have to modify the code to accept a binary padding mask and use it to mask out the tokens you want to pad (by changing how the positional embeddings and the autoregressive attention mask are computed).