chaoyi-wu / RadFM

The official code for "Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data".

Can't find factory for 'abbreviation_detector' for language English (en) #32

Open qiansehu opened 4 months ago

qiansehu commented 4 months ago

Hello, while trying to run train.py in src, I got this error:

/root/miniconda3/envs/radfm/lib/python3.9/site-packages/spacy/language.py:2195: FutureWarning: Possible set union at position 6328
  deserializers["tokenizer"] = lambda p: self.tokenizer.from_disk(  # type: ignore[union-attr]
Traceback (most recent call last):
  File "/home/test/RadFM/src/train.py", line 129, in <module>
    main()
  File "/home/test/RadFM/src/train.py", line 109, in main
    Train_dataset = multi_dataset(text_tokenizer = model_args.tokenizer_path)
  File "/home/test/RadFM/src/Dataset/multi_dataset.py", line 88, in __init__
    self.words_extract = umls_extractor()
  File "/home/test/RadFM/src/Dataset/multi_dataset.py", line 25, in __init__
    nlp.add_pipe("abbreviation_detector")
  File "/root/miniconda3/envs/radfm/lib/python3.9/site-packages/spacy/language.py", line 821, in add_pipe
    pipe_component = self.create_pipe(
  File "/root/miniconda3/envs/radfm/lib/python3.9/site-packages/spacy/language.py", line 690, in create_pipe
    raise ValueError(err)
ValueError: [E002] Can't find factory for 'abbreviation_detector' for language English (en). This usually happens when spaCy calls nlp.create_pipe with a custom component name that's not registered on the current language class. If you're using a custom component, make sure you've added the decorator @Language.component (for function components) or @Language.factory (for class components).

Available factories: attribute_ruler, tok2vec, merge_noun_chunks, merge_entities, merge_subtokens, token_splitter, doc_cleaner, parser, beam_parser, lemmatizer, trainable_lemmatizer, entity_linker, entity_ruler, tagger, morphologizer, ner, beam_ner, senter, sentencizer, spancat, spancat_singlelabel, span_finder, future_entity_ruler, span_ruler, textcat, textcat_multilabel, en.lemmatizer

I searched and found some issues saying that building a new env can solve this, but I tried that and it did not work.


RobsoUganda commented 4 months ago

Installing everything in a new environment solved the issue for me, as mentioned here: https://github.com/allenai/scispacy/issues/347
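
If a fresh environment still gives E002, it may help to check whether spaCy can see the factory at all before running train.py. The snippet below is only a rough diagnostic sketch, not part of RadFM. It assumes spaCy v3 (where `Language.has_factory` is available) and relies on the fact that scispacy registers `abbreviation_detector` when `scispacy.abbreviation` is imported. The function name `check_abbreviation_factory` is made up for illustration.

```python
import importlib
import importlib.util


def check_abbreviation_factory():
    """Report whether spaCy can see scispacy's 'abbreviation_detector' factory.

    Hypothetical helper for debugging; not part of RadFM, spaCy, or scispacy.
    """
    report = {"spacy": False, "scispacy": False, "factory_registered": False}

    # find_spec returns None for a missing top-level package instead of raising.
    report["spacy"] = importlib.util.find_spec("spacy") is not None
    report["scispacy"] = importlib.util.find_spec("scispacy") is not None
    if not (report["spacy"] and report["scispacy"]):
        return report

    try:
        # Importing this module is what registers the factory with spaCy.
        importlib.import_module("scispacy.abbreviation")
        from spacy.language import Language

        report["factory_registered"] = bool(
            Language.has_factory("abbreviation_detector")
        )
    except Exception as exc:  # e.g. incompatible spacy / scispacy versions
        report["error"] = repr(exc)
    return report


if __name__ == "__main__":
    print(check_abbreviation_factory())
```

If `scispacy` shows up as `False`, the package is missing from the active environment. If it is installed but `factory_registered` stays `False` or an `error` entry appears, a version mismatch between spacy and scispacy is a likely cause, which would be consistent with a clean reinstall fixing it for some people.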