inception-project / inception-external-recommender

Get annotation suggestions for the INCEpTION text annotation platform from spaCy, Sentence BERT, scikit-learn and more. Runs as a web-service compatible with the external recommender API of INCEpTION.
Apache License 2.0

Error with running the Adapters via INCEpTION #34

Open mobashgr opened 2 years ago

mobashgr commented 2 years ago

Hi, I have been trying to run the code for the adapters classifier. Here is my wsgi.py:

```python
from ariadne.server import Server
from ariadne.util import setup_logging
from ariadne.contrib.spacy import SpacyNerClassifier
from ariadne.contrib.adapters import AdapterSequenceTagger

setup_logging()

server = Server()

server.add_classifier(
    "adapter_pos",
    AdapterSequenceTagger(
        base_model_name="bert-base-uncased",
        adapter_name="pos/ldc2012t13@vblagoje",
        labels=[
            "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
            "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
        ],
    ),
)

app = server._app

if __name__ == "__main__":
    server.start(debug=True, port=40022)
```

Here is a screenshot of the error:

(screenshot attached)

I have a follow-up question: what should I do if I want to automatically load a pre-trained model without training? I tried to use the token-classification pipeline on the documents in INCEpTION, but it takes ages and never proceeds.

(screenshot: INCEpTION_Error)

Any help? Best, Ghadeer

jcklie commented 2 years ago

Hi, I honestly do not want to support adapters anymore because there are some issues with using them in an external recommender, mainly that things like the tag set are not easily discoverable. This looks like an issue with adapters again, as it crashes in adapter code. I will most likely remove adapters from this repo when I have time.

For your use case, you do not want adapters but normal transformers. You need to predict, but using INCEpTION's tokenization, not the BERT tokenizer's. Then you create spans like all the other contrib models do. Some links for that:

- https://github.com/huggingface/transformers/issues/14305
- https://huggingface.co/docs/transformers/custom_datasets?highlight=offset_mapping#token-classification-with-wnut-emerging-entities
- https://discuss.huggingface.co/t/predicting-with-token-classifier-on-data-with-no-gold-labels/9373

I would recommend writing an external recommender from scratch; the repo should have enough util functions to help you. I sadly do not have the time right now to write the code for you, and I do not use the external recommenders from this repository much anymore.
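The core of the alignment step can be sketched without the model: given the character offsets from the tokenizer's `offset_mapping` and one predicted label per subword, collapse them back onto INCEpTION's token spans. This is a minimal illustration; `align_subword_labels` and the label values are made up here, not part of ariadne:

```python
def align_subword_labels(token_spans, subword_offsets, subword_labels):
    """Map per-subword labels back onto document tokens by character overlap.

    token_spans:     [(begin, end)] character spans of INCEpTION's tokens
    subword_offsets: [(begin, end)] from tokenizer(..., return_offsets_mapping=True)
    subword_labels:  one predicted label per subword
    Returns one label per document token (label of the first overlapping subword).
    """
    labels = []
    for t_begin, t_end in token_spans:
        label = "O"
        for (s_begin, s_end), s_label in zip(subword_offsets, subword_labels):
            # Skip special tokens like [CLS]/[SEP], which have empty (0, 0) offsets.
            if s_begin == s_end:
                continue
            # First subword whose character span overlaps this token wins.
            if s_begin < t_end and s_end > t_begin:
                label = s_label
                break
        labels.append(label)
    return labels
```

From each token's label you can then build spans with the same helpers the other contrib models use.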

jcklie commented 2 years ago

If using an external recommender does not work for you, then you could also use preannotation, similar to what we describe in https://colab.research.google.com/github/inception-project/inception/blob/master/notebooks/using_pretokenized_and_preannotated_text.ipynb . Your annotators would then need to delete wrong annotations, though. We also have no good way to change annotation boundaries, so that would also be done by deleting and creating a new annotation.