Closed Danysan1 closed 1 year ago
Hi @Danysan1, thanks for reporting this. Before I conduct a thorough check, may I ask if every class of your ontologies has at least one label available? I suspect this issue is caused by an empty list of class names.
Also, the default BERTMap configuration might not include necessary annotation properties available in your ontologies.
Yes, some classes had no label. I have fixed them and the pipeline completed successfully.
It would be ideal to check for this condition and raise an explicit error before calling the tokenizer.
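A check like that could look roughly like the sketch below. It does not use the library's own API: it assumes the class labels have already been collected into a plain dict mapping each class IRI to its list of label strings, and the names `MissingLabelError`, `find_unlabelled_classes` and `check_labels_before_tokenizing` are hypothetical.

```python
from typing import Dict, List


class MissingLabelError(ValueError):
    """Hypothetical error: some ontology classes have no usable label."""


def find_unlabelled_classes(class_labels: Dict[str, List[str]]) -> List[str]:
    """Return the IRIs whose label list is empty or contains only blank strings."""
    return sorted(
        iri
        for iri, labels in class_labels.items()
        if not any(label.strip() for label in labels)
    )


def check_labels_before_tokenizing(class_labels: Dict[str, List[str]]) -> None:
    """Fail early with a readable message instead of an IndexError in the tokenizer."""
    missing = find_unlabelled_classes(class_labels)
    if missing:
        preview = ", ".join(missing[:5])
        raise MissingLabelError(
            f"{len(missing)} class(es) have no label for the configured "
            f"annotation properties, e.g. {preview}"
        )
```

Calling `check_labels_before_tokenizing(...)` right before the tokenizer call would turn the opaque `IndexError` into a message that names the offending classes.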
Similarly, an unclear error is thrown in text_semantics.py line 232 if the passed ontology has no subClassOf relationships. It would be ideal to receive an explicit error.
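For the subClassOf case, a similar guard might be enough. This is only a sketch under the assumption that the subsumption pairs are available as an iterable of (child IRI, parent IRI) tuples; `NoSubsumptionError` and `check_has_subsumptions` are hypothetical names, not part of the library.

```python
from typing import Iterable, Tuple


class NoSubsumptionError(ValueError):
    """Hypothetical error: the ontology has no subClassOf relationships."""


def check_has_subsumptions(
    subclass_pairs: Iterable[Tuple[str, str]],
    ontology_name: str = "ontology",
) -> int:
    """Count (child, parent) pairs and raise a clear error if there are none."""
    count = sum(1 for _ in subclass_pairs)
    if count == 0:
        raise NoSubsumptionError(
            f"{ontology_name} contains no subClassOf relationships; "
            "they are needed to build the class hierarchy."
        )
    return count
```

Returning the count keeps the function usable for logging, while the exception gives the explicit error requested above.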
Sure, I will update this in the next release. Thanks for your feedback.
Describe the bug
Under some circumstances during the mapping extension stage, the tokenizer throws the error
IndexError: list index out of range
The error originates at bert_classifier.py line 185. This is the same error, at the same location inside the tokenizer, as in https://github.com/huggingface/tokenizers/issues/993 , which was caused by the data passed to the tokenizer.
To Reproduce
I have reproduced this error with these settings:
max_length_for_input
batch_size_for_training
Expected behavior
The stage and the pipeline should complete successfully.
Platform: