Open mobashgr opened 2 years ago
Hi, I honestly do not want to support adapters anymore because there are some issues with using them in an external recommender, mainly that things like the tagset are not easily discoverable. This looks like an issue with adapters again, as it crashes in adapter code. I will most likely delete adapters from this repo when I have time.
For your use case, you do not want adapters but plain transformers. You need to predict using INCEpTION's tokenization, not the BERT tokenizer's, and then create spans like all the other contrib models do. Some links for that:
- https://github.com/huggingface/transformers/issues/14305
- https://huggingface.co/docs/transformers/custom_datasets?highlight=offset_mapping#token-classification-with-wnut-emerging-entities
- https://discuss.huggingface.co/t/predicting-with-token-classifier-on-data-with-no-gold-labels/9373
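The offset-mapping approach from the links above can be sketched roughly like this. This is a minimal sketch, not the repo's actual code: model inference is elided (the `offsets`/`labels` lists stand in for `tokenizer(..., return_offsets_mapping=True)` output and argmax'd model logits), and `spans_from_offsets` is a hypothetical helper name.

```python
def spans_from_offsets(offsets, labels, ignore_label="O"):
    """Merge per-subword (start, end) character offsets and predicted labels
    into character-level spans.

    Special tokens like [CLS]/[SEP] get (0, 0) offsets and are skipped.
    Contiguous subwords (previous end == next start) with the same label are
    merged, so one word split into several subwords yields a single span.
    """
    spans = []
    for (start, end), label in zip(offsets, labels):
        if start == end:          # special token, no text behind it
            continue
        if label == ignore_label:
            continue
        if spans and spans[-1][2] == label and spans[-1][1] == start:
            # continuation subword of the same word: extend the last span
            spans[-1] = (spans[-1][0], end, label)
        else:
            spans.append((start, end, label))
    return spans


text = "Hello world!"
# offsets as returned by a fast tokenizer with return_offsets_mapping=True
offsets = [(0, 0), (0, 5), (6, 11), (11, 12), (0, 0)]
# labels as mapped from the model's argmax predictions via id2label
labels = ["O", "INTJ", "NOUN", "PUNCT", "O"]

for start, end, label in spans_from_offsets(offsets, labels):
    print(text[start:end], label)
# prints:
# Hello INTJ
# world NOUN
# ! PUNCT
```

From those `(start, end, label)` triples you would then create annotations on the CAS the same way the other contrib models do, so the spans line up with INCEpTION's own tokenization.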
I would recommend writing an external recommender from scratch; the repo should have enough util functions to help you. I sadly do not have the time right now to write the code for you, and I do not use external recommenders from this repository much anymore.
If using an external recommender does not work for you, then you could also use preannotation, similar to what we describe in https://colab.research.google.com/github/inception-project/inception/blob/master/notebooks/using_pretokenized_and_preannotated_text.ipynb . Your annotators would then need to delete wrong annotations, though. We also have no good way to change annotation boundaries, so that would also be done by deleting and creating a new annotation.
Hi, I have been trying to run the code for the adapters classifier. Here is the wsgi.py:

```python
from ariadne.server import Server
from ariadne.util import setup_logging
from ariadne.contrib.spacy import SpacyNerClassifier
from ariadne.contrib.adapters import AdapterSequenceTagger

setup_logging()

server = Server()

server.add_classifier(
    "adapter_pos",
    AdapterSequenceTagger(
        base_model_name="bert-base-uncased",
        adapter_name="pos/ldc2012t13@vblagoje",
        labels=[
            "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
            "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
        ],
    ),
)

app = server._app

if __name__ == "__main__":
    server.start(debug=True, port=40022)
```
Here is a screenshot of the error:
I have a follow-up question: what should I do if I need to automatically load a pre-trained model without training? I tried to use the token classification pipeline on the documents in INCEpTION, but it takes ages and doesn't proceed.
Any help? Best, Ghadeer