Hello!
First off, thanks for sharing the code. The paper says that MAUVE works with other embedding models, so I wanted to try models such as Microsoft's DialoGPT. However, the code restricts the model and tokenizer names to the "gpt2" family. I think it would be better to remove this restriction, since others might also want to try other models.
If you want, I can make a PR to change this.
https://github.com/krishnap25/mauve/blob/b3c01d5b0f3be85a997b1171b3f2efa3ba16280b/src/mauve/utils.py#L25-L39
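To make the proposal concrete, here is a rough sketch of what the relaxed helpers could look like. It is not the current code in `utils.py`: the function names, signatures, and the pad-token fallback are my assumptions, and it only relies on the `AutoTokenizer.from_pretrained` / `AutoModel.from_pretrained` calls from `transformers`.

```python
def get_tokenizer(model_name="gpt2"):
    """Load the tokenizer for any Hugging Face checkpoint name (hypothetical helper)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoTokenizer

    # No check on the name: whatever the user passes is handed to transformers,
    # which raises its own error for an unknown checkpoint.
    return AutoTokenizer.from_pretrained(model_name)


def get_model(model_name, tokenizer, device="cpu"):
    """Load the embedding model for any Hugging Face checkpoint name (hypothetical helper)."""
    from transformers import AutoModel

    kwargs = {}
    # Assumption: GPT-2-style tokenizers define no pad token, so fall back to EOS
    # only when the tokenizer does not already provide a pad token.
    if tokenizer.pad_token_id is None and tokenizer.eos_token_id is not None:
        kwargs["pad_token_id"] = tokenizer.eos_token_id

    model = AutoModel.from_pretrained(model_name, **kwargs)
    # Move to the requested device and switch to inference mode.
    return model.to(device).eval()
```

With something like this, `get_tokenizer("microsoft/DialoGPT-medium")` followed by `get_model("microsoft/DialoGPT-medium", tokenizer)` would load DialoGPT instead of raising an error. Whether the rest of the feature-extraction code needs changes for non-GPT-2 architectures is something I have not checked.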