It is this model: https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384
Thanks, @davidberenstein1957. It seems to be a multilingual model. So should I use only the English version (vocabulary and other settings) of it to reduce the language model size?
Yes, although the cross-lingual MiniLM is already smaller and quicker than the XLM-RoBERTa or SpanBERT alternatives. However, it is less accurate for English.
Okay, thanks for the info. For now, I am focusing on reducing the model size and inference time rather than on accuracy. So how should I modify the code
predictor = Predictor(language="en_core_web_sm", device=-1, model_name="minilm")
to use only the English vocabulary to reduce the model size?
Yes, then the minilm model should be the best in your case.
Yes. So how should I modify the code
predictor = Predictor(language="en_core_web_sm", device=-1, model_name="minilm")
or even the model itself (https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384) to use only the English vocabulary, in order to reduce the model size?
predictor = Predictor(language="en_core_web_sm", device=-1, model_name="minilm") is correct; it already uses the smallest model available. Alternatively, you could fine-tune your own model, but that introduces some overhead and effort.
I am using the following code snippet for coreference resolution
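(The snippet itself did not carry over here; below is a minimal sketch of the typical crosslingual-coreference usage based on the package's README-style API. The example text and the assumption that predict() returns a dict with a "resolved_text" key are mine, not from the thread.)

```python
# A minimal sketch, assuming the crosslingual-coreference package API:
# Predictor(...) loads a spaCy pipeline plus the "minilm" coreference model,
# and predict() is assumed to return a dict with a "resolved_text" key.
from crosslingual_coreference import Predictor

text = (
    "Do not forget about Momofuku Ando! He created instant noodles in Osaka. "
    "At that location, Nissin was founded."
)

# device=-1 runs on CPU; model_name="minilm" selects the smallest model.
predictor = Predictor(language="en_core_web_sm", device=-1, model_name="minilm")

# Coreferences are replaced with their antecedents in the resolved text,
# e.g. "He" becomes "Momofuku Ando" and "that location" becomes "Osaka".
print(predictor.predict(text)["resolved_text"])
```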
While checking the package's source code, it seems that the language model used here is
https://storage.googleapis.com/pandora-intelligence/models/crosslingual-coreference/minilm/model.tar.gz
Is this the same model that I can see on https://huggingface.co/models, e.g. https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384/tree/main, or some other Hugging Face model?
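(One way to check this yourself, a sketch under the assumption that the archive is a standard AllenNLP-style model.tar.gz containing a config.json that names the underlying transformer; the archive layout is an assumption, not confirmed in the thread:)

```python
# Download the archive and print any bundled config.json, which in
# AllenNLP-style archives typically names the underlying transformer,
# e.g. "microsoft/Multilingual-MiniLM-L12-H384". Standard library only.
import json
import tarfile
import urllib.request

URL = (
    "https://storage.googleapis.com/pandora-intelligence/models/"
    "crosslingual-coreference/minilm/model.tar.gz"
)

urllib.request.urlretrieve(URL, "model.tar.gz")  # large download

with tarfile.open("model.tar.gz", "r:gz") as tar:
    for member in tar.getmembers():
        if member.name.endswith("config.json"):
            config = json.load(tar.extractfile(member))
            print(json.dumps(config, indent=2))
```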