Open hansharhoff opened 2 years ago
My own update on this: I am still not clear on what XXXLMHead is, but using the hint from the exception I went looking for a good pretrained multilingual model. I have chosen xlm-roberta-base and will run TSDAE on top of it.
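For anyone following along, here is a minimal sketch of what that setup could look like, following the sentence-transformers TSDAE example (CLS pooling, decoder built from the same checkpoint with tied weights). The function name and the `train_sentences` argument are placeholders; the batch size is just an example value.

```python
def build_tsdae_trainer(train_sentences, model_name="xlm-roberta-base"):
    """Sketch of a TSDAE setup on xlm-roberta-base.

    train_sentences is a plain list of raw (e.g. Danish) sentences.
    """
    from torch.utils.data import DataLoader
    from sentence_transformers import SentenceTransformer, models, losses
    from sentence_transformers.datasets import DenoisingAutoEncoderDataset

    # Encoder: plain transformer + CLS pooling, as in the TSDAE paper.
    word_embedding = models.Transformer(model_name)
    pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), "cls")
    model = SentenceTransformer(modules=[word_embedding, pooling])

    # The dataset applies the deletion noise to each sentence; the loss
    # builds a decoder from the same checkpoint and ties its weights
    # to the encoder, so no separate decoder model is needed here.
    dataset = DenoisingAutoEncoderDataset(train_sentences)
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    loss = losses.DenoisingAutoEncoderLoss(
        model, decoder_name_or_path=model_name, tie_encoder_decoder=True
    )
    return model, loader, loss
```

Training would then be `model.fit(train_objectives=[(loader, loss)], epochs=1)`, and only the encoder (`model`) is kept for producing embeddings afterwards.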
I ran into the same issue. My understanding is that TSDAE can use any compatible decoder, since the decoder is only used at training time and not during inference. To get around the issue, I simply set my loss as follows:
losses.DenoisingAutoEncoderLoss(model, decoder_name_or_path="sentence-transformers/paraphrase-distilroberta-base-v2", tie_encoder_decoder=False)
I used paraphrase-distilroberta-base-v2 because it has the same hidden size as all-mpnet-base-v2, and set tie_encoder_decoder to False because the two architectures are different.
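Since the untied-decoder route hinges on the hidden sizes matching, here is a small sanity check one could run. The hidden sizes below are taken from each checkpoint's config.json on the Hugging Face Hub (they all happen to be 768); in practice you would read them with transformers.AutoConfig rather than hardcoding them.

```python
# Hidden sizes from each checkpoint's config.json on the Hugging Face Hub
# (values current as of writing; verify with transformers.AutoConfig).
HIDDEN_SIZES = {
    "sentence-transformers/all-mpnet-base-v2": 768,
    "sentence-transformers/paraphrase-distilroberta-base-v2": 768,
    "bert-base-uncased": 768,
    "xlm-roberta-base": 768,
}

def decoder_hidden_size_matches(encoder_name, decoder_name):
    """With tie_encoder_decoder=False the decoder must at least have the
    same hidden size as the encoder, otherwise the cross-attention
    projections between the two models will not line up."""
    return HIDDEN_SIZES[encoder_name] == HIDDEN_SIZES[decoder_name]
```

This only checks dimensions, not how well the two models' training actually fits together.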
However, your original question of why all-mpnet-base-v2 does not work as a decoder is still a mystery to me.
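One way to probe that mystery: the XXXLMHead in the error message refers to the per-architecture LM-head class that transformers uses to build the decoder, and MPNet simply does not ship a causal-LM head. A rough check could look like the sketch below; note that the import path of the mapping is a transformers implementation detail that may differ between versions, and fetching a config needs network access.

```python
def supports_lm_decoder(model_name):
    """Rough check of whether transformers can attach a causal-LM head
    (the XXXLMHeadModel from the error message) to an architecture.

    The mapping's import path is a transformers implementation detail
    and may change between versions; fetching the config needs network
    access.
    """
    from transformers import AutoConfig
    from transformers.models.auto.modeling_auto import (
        MODEL_FOR_CAUSAL_LM_MAPPING_NAMES,
    )

    config = AutoConfig.from_pretrained(model_name)
    # "mpnet" is absent from this mapping, while "bert", "roberta"
    # and "xlm-roberta" are present - which matches what fails and
    # what works as a TSDAE decoder.
    return config.model_type in MODEL_FOR_CAUSAL_LM_MAPPING_NAMES
```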
Just a few thoughts about the comments:
all-mpnet "AND is trained on many languages." > This is not true; you should use xlm-* models as a base for that. all-mpnet is English-only, both for MPNet itself and for the (semi-)supervised contrastive training.
The Hugging Face implementation of all-mpnet has no decoder implemented; that is why it is failing.
tie_encoder_decoder=False together with a decoder of the same hidden size should work, though this can be a bit risky depending on how far apart the two models are in terms of their training.
In any case, you should add another supervised training stage after your TSDAE domain adaptation / pre-training for your source domain.
I am attempting to do unsupervised sentence-embedding learning using TSDAE on a corpus of Danish sentences. I have been running tests with the example code, which uses bert-base-uncased, but as I understand the model card, this has only been trained on English.
My intention was to retrain on all-mpnet-base-v2, as it is listed highest for sentence embeddings AND is trained on many languages. However, I get the following error:
I am not sure I understand why MPNet is not "allowed", as I do not understand what XXXLMHead is or how to determine whether it is present for a given model. Further down, in what I expect to be the real issue, it lists the allowed model types.
It is unclear to me, however, how these model types map to e.g. the list here:
https://www.sbert.net/docs/pretrained_models.html
and thus it is not clear to me which of the highly ranked pretrained models are compatible with TSDAE (and secondarily multilingual ;) ).
Any advice on how to find a suitable model candidate for TSDAE unsupervised learning with Danish support?
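For context on why the decoder choice only matters at training time: TSDAE corrupts each input sentence by deleting tokens and trains the decoder to reconstruct the original; at inference only the encoder is kept. A minimal sketch of that deletion noise is below (the real implementation lives in sentence_transformers.datasets.DenoisingAutoEncoderDataset; 0.6 is the deletion ratio recommended in the TSDAE paper, and the function name here is just illustrative).

```python
import random

def delete_noise(words, del_ratio=0.6, seed=None):
    """Minimal sketch of TSDAE's input corruption: independently delete
    each token with probability del_ratio, keeping at least one token
    so the corrupted input is never empty."""
    rng = random.Random(seed)
    kept = [w for w in words if rng.random() >= del_ratio]
    # Guarantee a non-empty input for the encoder.
    return kept if kept else [rng.choice(words)]
```

During training the decoder must reconstruct the full sentence from the embedding of this corrupted input, which is what forces the encoder to produce meaningful sentence vectors.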