microsoft / TransformerCompression

For releasing code related to compression methods for transformers, accompanying our publications
MIT License

Update wikitext2 dataloader tokenizer call #94

Closed: pashminacameron closed this 7 months ago

pashminacameron commented 7 months ago

The current wikitext2 dataloader tokenizer call throws a warning: "Token indices sequence length is longer than the specified maximum sequence length."

pashminacameron commented 7 months ago

Padding in this fashion changes the result. For now, the warning is cosmetic: the existing code gives the right result, whereas adding the padding alters it. Abandoning this PR.
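A toy sketch of why padding could change the result. This is an assumption about the mechanism, not the code from this PR or from the repository. It supposes that the existing loader tokenizes the concatenated corpus once (hence the length warning) and then cuts it into fixed-length windows, whereas a padded call would tokenize each sample separately. The whitespace tokenizer and the function names `toy_tokenize`, `chunk_concatenated`, and `pad_per_sample` are hypothetical stand-ins.

```python
# Toy illustration only: a whitespace "tokenizer" stands in for a real one.
def toy_tokenize(text):
    return text.split()


def chunk_concatenated(texts, seqlen):
    """Tokenize the joined corpus once, then cut it into full seqlen windows."""
    ids = toy_tokenize("\n\n".join(texts))
    n_windows = len(ids) // seqlen  # a trailing partial window is dropped
    return [ids[i * seqlen:(i + 1) * seqlen] for i in range(n_windows)]


def pad_per_sample(texts, seqlen, pad="<pad>"):
    """Tokenize each sample separately, truncating or padding to seqlen."""
    rows = []
    for text in texts:
        ids = toy_tokenize(text)[:seqlen]
        rows.append(ids + [pad] * (seqlen - len(ids)))
    return rows


texts = ["a b c", "d e f g h", "i j"]
chunks = chunk_concatenated(texts, seqlen=4)
padded = pad_per_sample(texts, seqlen=4)

print(chunks)  # windows over the continuous token stream
print(padded)  # per-sample rows containing pad tokens
```

In this sketch the two approaches feed different token sequences to the model: the windowed version never contains pad tokens and crosses sample boundaries, while the padded version truncates long samples and fills short ones. Any metric computed over the tokens, such as perplexity, would therefore differ.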