flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
https://flairnlp.github.io/flair/

[Question]: Is a config.json file available? #3244

Open katwegner opened 1 year ago

katwegner commented 1 year ago

Question

Dear developers,

I am a keen user of your FLAIR models and am looking for a configuration file to enable model deployment on a cluster. I'm in awe of the FLAIR project, as it does an outstanding job of recognising named entities in German like no other available model. To make full use of the model on our data, we would like to deploy it to a cluster. Unfortunately, the cluster requires that we provide a config.json file describing the model details. Given the open-source and free nature of the library, I assume it is your goal that the user community and the range of use cases continue to grow. I would therefore like to ask whether you would consider making such a configuration file publicly available so that FLAIR models can also be used on the cluster. Specifically, I am referring to the ner-german-large model. Thank you very much for considering this request, and I am looking forward to hearing back from you. Best, Katharina

helpmefindaname commented 1 year ago

Hello @katwegner,

I suppose that by config.json you are referring to the setup of a huggingface/transformers model? This is not possible for flair, as flair builds on several kinds of embeddings - possibly combining multiple - and adds additional layers.
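For illustration, here is a minimal sketch of what such a composition can look like; the specific embeddings chosen here are only examples, not the exact setup of ner-german-large:

from flair.embeddings import FlairEmbeddings, StackedEmbeddings, TransformerWordEmbeddings

# A Flair model can stack several embedding types into one representation,
# which is why a single transformers-style config.json cannot describe it.
embeddings = StackedEmbeddings([
    TransformerWordEmbeddings("xlm-roberta-large"),  # transformer with its own config.json
    FlairEmbeddings("de-forward"),                   # contextual string embeddings, no config.json
])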

Regarding the deployment: what are you trying to use for deployment? I can assure you that a Docker container running a simple REST service can be deployed on any k8s cluster or cloud service. There may be specific frameworks that make it easier to deploy certain models, but then it might make more sense to ask them to add support for Flair models.
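To make that concrete, here is a minimal sketch of such a REST service, assuming FastAPI and uvicorn as the serving stack; the endpoint name and request schema are just examples:

# Minimal REST wrapper around a Flair tagger (FastAPI assumed; run with uvicorn).
from fastapi import FastAPI
from pydantic import BaseModel

from flair.data import Sentence
from flair.models import SequenceTagger

app = FastAPI()
tagger = SequenceTagger.load("flair/ner-german-large")  # load once at startup


class NerRequest(BaseModel):
    text: str


@app.post("/ner")
def tag(request: NerRequest):
    sentence = Sentence(request.text)
    tagger.predict(sentence)
    # Return the recognized entity spans with their labels and confidence scores.
    return [
        {
            "text": span.text,
            "label": span.get_label("ner").value,
            "score": span.get_label("ner").score,
        }
        for span in sentence.get_spans("ner")
    ]

Such a service can then be packaged into a Docker image and deployed like any other container.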

davidgxue commented 7 months ago

@helpmefindaname Are you sure a config.json is not possible? When I run something like tagger = SequenceTagger.load("flair/ner-english-large"), I see this:

pytorch_model.bin: 100%|██████████| 2.24G/2.24G [02:27<00:00, 15.2MB/s]
tokenizer_config.json: 100%|██████████| 25.0/25.0 [00:00<00:00, 367kB/s]
config.json: 100%|██████████| 616/616 [00:00<00:00, 13.1MB/s]
sentencepiece.bpe.model: 100%|██████████| 5.07M/5.07M [00:00<00:00, 11.7MB/s]
tokenizer.json: 100%|██████████| 9.10M/9.10M [00:00<00:00, 14.2MB/s]

If a config.json is not possible, then where are these files coming from? Or are they for something else?

helpmefindaname commented 7 months ago

Hi @davidgxue

I think that

This is not possible for flair, as flair builds on several kinds of embeddings - possibly combining multiple - and adds additional layers.

already answers your question. Flair models can use those Hugging Face models as embeddings, but they are composed of more than just that. The files you see being downloaded (config.json, tokenizer files) belong to the underlying transformer embedding, not to the Flair model as a whole.
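A small sketch of how to verify that, only for illustration (the exact class name printed depends on the Flair version):

from flair.models import SequenceTagger

tagger = SequenceTagger.load("flair/ner-english-large")

# The transformer embedding (which ships its own config.json / tokenizer files)
# is only one component of the model.
print(type(tagger.embeddings))

# Printing the tagger shows the full architecture:
# transformer embeddings plus the projection / CRF layers Flair adds on top.
print(tagger)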

SichangHe commented 4 months ago

Uh oh. I was trying to load these high-performance Flair models into Bumblebee, so it seems that is impossible.

helpmefindaname commented 3 months ago

@SichangHe looking at the Bumblebee docs, I don't see anything suggesting that they support Flair models.

I suppose you could open a feature request on their side, but I am not sure if they would want to.