@dirkgr this is just a friendly ping to make sure you haven't forgotten about this issue 😜
I’m using the T5 model implemented in allennlp and need to add extra special tokens to its vocabulary. I’ve added the extra tokens to the tokenizer in my reader with `self._tokenizer.tokenizer.add_tokens(additional_tokens)`.
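For context, with a plain Hugging Face model the whole flow would just be two calls (the `t5-small` checkpoint and token names here are only examples):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Register the new special tokens, then grow the embedding matrix to match.
tokenizer.add_tokens(["<new_token_1>", "<new_token_2>"])
model.resize_token_embeddings(len(tokenizer))
```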
But I also need to extend the model’s token embeddings to match the new vocabulary size. Usually, when the transformer is loaded with `PretrainedTransformerEmbedder`, this is taken care of automatically because of this line. I could do it by invoking HF’s `resize_token_embeddings` manually, but the T5 object here is allennlp’s own module, so it doesn’t have HF methods like `resize_token_embeddings`. My current workaround is to manually extend the T5 embeddings and `lm_head` (see the sketch below), but it’d be good to have native support for this in allennlp.
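For anyone hitting the same problem, here’s a minimal sketch of that workaround. It assumes the allennlp T5 module exposes its input embeddings as `token_embeddings` (an `nn.Embedding`) and its output projection as `lm_head` (a bias-free `nn.Linear`); those attribute names are my assumptions about the module’s internals, so adjust them to whatever the actual fields are called:

```python
import torch
from torch import nn


def resize_t5_vocab(model: nn.Module, new_vocab_size: int) -> None:
    """Manually grow a T5 module's token embeddings and lm_head.

    Hypothetical helper: `model.token_embeddings` and `model.lm_head`
    are assumed attribute names, not confirmed allennlp API.
    """
    old_embeddings = model.token_embeddings
    old_vocab_size, d_model = old_embeddings.weight.shape
    assert new_vocab_size >= old_vocab_size, "can only grow the vocabulary"

    # Build a larger embedding matrix and copy over the pretrained rows;
    # the new rows keep their default random initialization.
    new_embeddings = nn.Embedding(new_vocab_size, d_model)
    with torch.no_grad():
        new_embeddings.weight[:old_vocab_size] = old_embeddings.weight
    model.token_embeddings = new_embeddings

    # Do the same for the output projection.
    old_lm_head = model.lm_head
    new_lm_head = nn.Linear(d_model, new_vocab_size, bias=False)
    with torch.no_grad():
        new_lm_head.weight[:old_vocab_size] = old_lm_head.weight
    model.lm_head = new_lm_head
```

The newly added rows are randomly initialized, so the extra tokens only become meaningful after fine-tuning.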
@dirkgr Assigning you based on the discussion on Slack.