umiron opened this issue 2 months ago
Is it possible to add support for xlm-roberta? It's the same architecture as roberta, except with a larger vocabulary, since it is multilingual.

Hey @umiron, I believe there isn't anything within mergekit that acts as a barrier to merges between xlm-roberta models, since the architecture handling is oblivious to tensor sizes. If this really matches up with the xlm-roberta weight names and architecture, add the architecture name (XLMRobertaForMaskedLM) here locally and test to see if it works.
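If it helps to verify the premise before touching mergekit, here is a minimal sanity check of my own (a sketch, not anything from mergekit itself) that compares the state dicts of the two stock Hugging Face checkpoints, `roberta-base` and `xlm-roberta-base`, to confirm the weight names match and only vocab-sized tensors differ:

```python
# Sanity check (my own sketch, not part of mergekit): verify that the
# xlm-roberta parameter names match roberta's, with only vocabulary-sized
# tensors differing in shape.
from transformers import AutoModelForMaskedLM

roberta = AutoModelForMaskedLM.from_pretrained("roberta-base").state_dict()
xlmr = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base").state_dict()

# Both of these should print empty sets if the weight names really line up.
print("only in roberta:", set(roberta) - set(xlmr))
print("only in xlm-roberta:", set(xlmr) - set(roberta))

# Shared names whose shapes differ -- expected to be just the word
# embeddings and the LM head, since xlm-roberta has a larger vocabulary.
for name in sorted(set(roberta) & set(xlmr)):
    if roberta[name].shape != xlmr[name].shape:
        print(name, tuple(roberta[name].shape), tuple(xlmr[name].shape))
```

If the name sets come back empty and only the embedding/LM-head shapes differ, that's consistent with the comment above: the local edit adding XLMRobertaForMaskedLM should be all that's needed.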