UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

Can I train a "gpt2" model using the SBERT code? #481

Open umairspn opened 3 years ago

umairspn commented 3 years ago

Hi, I am trying to use the "gpt2" Hugging Face model with the SBERT code, but it gives a NoneType error in the get_sentence_features function:

word_embedding_model = models.Transformer('gpt2')

I read about the GPT-2 architecture, which differs a bit from BERT: GPT-2 uses transformer decoder blocks, whereas BERT uses transformer encoder blocks. Also, GPT-2 outputs one token at a time, whereas BERT outputs a vector of size 512. So I just wanted to confirm whether it is possible to use "gpt2" with the provided SBERT code. Any help will be appreciated. Thank you.
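For anyone who wants to try this, below is a minimal sketch of how GPT-2 might be wired into sentence-transformers using the usual Transformer + Pooling stack. It also assigns a padding token (GPT-2's tokenizer does not define one), which is the most likely source of the NoneType error; treat this as a workaround sketch, not an officially supported configuration.

```python
from sentence_transformers import SentenceTransformer, models

# Minimal sketch, assuming the standard Transformer + Pooling stack.
# GPT-2's tokenizer ships without a padding token; reusing the EOS token
# as pad token is a common workaround, not an official recommendation.
word_embedding_model = models.Transformer('gpt2', max_seq_length=128)
word_embedding_model.tokenizer.pad_token = word_embedding_model.tokenizer.eos_token

# Mean pooling over token embeddings to get a fixed-size sentence vector.
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

embeddings = model.encode(["An example sentence."])
print(embeddings.shape)
```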

umairspn commented 3 years ago

To be exact, the error comes from this line:

return self.tokenizer.prepare_for_model(tokens, max_length=pad_seq_length, pad_to_max_length=True, return_tensors='pt')

RuntimeError: Could not infer dtype of NoneType

@nreimers
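For what it's worth, this error pattern usually traces back to GPT-2's tokenizer having no pad token: padding then fills positions with None ids, which torch cannot turn into a tensor. A minimal demonstration outside SBERT:

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
print(tokenizer.pad_token)     # None -- GPT-2 defines no padding token
print(tokenizer.pad_token_id)  # None -- so padded positions get no id

# Padding a batch to max_length therefore fails while building the tensor:
# RuntimeError: Could not infer dtype of NoneType
tokenizer.pad_token = tokenizer.eos_token  # common workaround
```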

thak123 commented 3 years ago

@umairspn Did you fix the error?

umairspn commented 3 years ago

@thak123 No, the error still occurs when I use the GPT model as a baseline. For other baselines, it works fine.
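In case it helps narrow this down: with the pad-token workaround above applied, batch padding on the GPT-2 tokenizer goes through, which suggests the failure sits in the padding step rather than in the model itself. A quick check, assuming that workaround:

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.eos_token  # assumed workaround from above

# With a pad token set, a padded batch builds without the dtype error.
batch = tokenizer(["short", "a slightly longer sentence"],
                  padding=True, return_tensors='pt')
print(batch['input_ids'].shape)
```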