CogStack / MedCATtrainer

A simple interface to inspect, improve and add concepts to biomedical NER+L -> MedCAT.

Import concepts from concept DB does nothing #120

Open gitrach opened 1 year ago

gitrach commented 1 year ago

With image: cogstacksystems/medcat-trainer:v2.5.3 I cannot import the concepts from a concept database. As per the docs, I tick the box next to the relevant CDB, select Import concepts, and click Go, and the page simply refreshes. The concepts are not imported. On the home page, this action is not listed under My recent actions; however I can see the attempts in the Completed tasks.

[Screenshots: Recent actions; Completed tasks]

Has anyone else come across this?

gitrach commented 1 year ago

Update: I have also rolled back to version 2.3.8 and the same issue arises. This is most noticeable if you are trying to use MedCATtrainer as an annotation tool (namely, a blank CDB in the project and the populated CDB is imported globally).

gitrach commented 1 year ago

I have tried again in version 2.3.7, and this time I identified an error relating to the medcat.config module: AttributeError: Can't get attribute '_DefPartial' on <module 'medcat.config' from '/usr/local/lib/python3.7/site-packages/medcat/config.py'>

Full error:

medcattrainer | [pid: 143|app: 0|req: 32/32] 172.18.0.4 () {54 vars in 1134 bytes} [Wed Feb 15 11:58:08 2023] GET /favicon.ico => generated 962 bytes in 1 msecs (HTTP/1.0 200) 3 headers in 109 bytes (1 switches on core 0)
medcattrainer | INFO 2023-02-15 11:58:16,550 tasks.py l:257:Running api.admin.import_concepts_from_cdb
medcattrainer | ERROR 2023-02-15 11:58:16,575 tasks.py l:57:Rescheduling api.admin.import_concepts_from_cdb
medcattrainer | Traceback (most recent call last):
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/background_task/tasks.py", line 43, in bg_runner
medcattrainer |     func(args, kwargs)
medcattrainer |   File "/home/api/api/admin.py", line 695, in import_concepts_from_cdb
medcattrainer |     cdb = CDB.load(cdb_model.cdb_file.path)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/medcat/cdb.py", line 411, in load
medcattrainer |     data = dill.load(f)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 313, in load
medcattrainer |     return Unpickler(file, ignore=ignore, kwds).load()
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 525, in load
medcattrainer |     obj = StockUnpickler.load(self)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 515, in find_class
medcattrainer |     return StockUnpickler.find_class(self, module, name)
medcattrainer | AttributeError: Can't get attribute '_DefPartial' on <module 'medcat.config' from '/usr/local/lib/python3.7/site-packages/medcat/config.py'>
medcattrainer | WARNING 2023-02-15 11:58:16,584 models.py l:260:Rescheduling task api.admin.import_concepts_from_cdb for 0:00:06 later at 2023-02-15 11:58:22.584139+00:00
medcattrainer | INFO 2023-02-15 11:58:26,622 tasks.py l:257:Running api.admin.import_concepts_from_cdb
medcattrainer | ERROR 2023-02-15 11:58:26,623 tasks.py l:57:Rescheduling api.admin.import_concepts_from_cdb
medcattrainer | Traceback (most recent call last):
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/background_task/tasks.py", line 43, in bg_runner
medcattrainer |     func(args, kwargs)
medcattrainer |   File "/home/api/api/admin.py", line 695, in import_concepts_from_cdb
medcattrainer |     cdb = CDB.load(cdb_model.cdb_file.path)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/medcat/cdb.py", line 411, in load
medcattrainer |     data = dill.load(f)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 313, in load
medcattrainer |     return Unpickler(file, ignore=ignore, kwds).load()
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 525, in load
medcattrainer |     obj = StockUnpickler.load(self)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 515, in find_class
medcattrainer |     return StockUnpickler.find_class(self, module, name)
medcattrainer | AttributeError: Can't get attribute '_DefPartial' on <module 'medcat.config' from '/usr/local/lib/python3.7/site-packages/medcat/config.py'>
medcattrainer | WARNING 2023-02-15 11:58:26,628 models.py l:260:Rescheduling task api.admin.import_concepts_from_cdb for 0:00:21 later at 2023-02-15 11:58:47.628195+00:00
medcattrainer | INFO 2023-02-15 11:58:51,680 tasks.py l:257:Running api.admin.import_concepts_from_cdb
medcattrainer | ERROR 2023-02-15 11:58:51,681 tasks.py l:57:Rescheduling api.admin.import_concepts_from_cdb
medcattrainer | Traceback (most recent call last):
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/background_task/tasks.py", line 43, in bg_runner
medcattrainer |     func(args, kwargs)
medcattrainer |   File "/home/api/api/admin.py", line 695, in import_concepts_from_cdb
medcattrainer |     cdb = CDB.load(cdb_model.cdb_file.path)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/medcat/cdb.py", line 411, in load
medcattrainer |     data = dill.load(f)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 313, in load
medcattrainer |     return Unpickler(file, ignore=ignore, kwds).load()
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 525, in load
medcattrainer |     obj = StockUnpickler.load(self)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 515, in find_class
medcattrainer |     return StockUnpickler.find_class(self, module, name)
medcattrainer | AttributeError: Can't get attribute '_DefPartial' on <module 'medcat.config' from '/usr/local/lib/python3.7/site-packages/medcat/config.py'>
medcattrainer | WARNING 2023-02-15 11:58:51,687 models.py l:260:Rescheduling task api.admin.import_concepts_from_cdb for 0:01:26 later at 2023-02-15 12:00:17.687335+00:00
medcattrainer | INFO 2023-02-15 12:00:21,975 tasks.py l:257:Running api.admin.import_concepts_from_cdb
medcattrainer | ERROR 2023-02-15 12:00:21,976 tasks.py l:57:Rescheduling api.admin.import_concepts_from_cdb
medcattrainer | Traceback (most recent call last):
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/background_task/tasks.py", line 43, in bg_runner
medcattrainer |     func(args, kwargs)
medcattrainer |   File "/home/api/api/admin.py", line 695, in import_concepts_from_cdb
medcattrainer |     cdb = CDB.load(cdb_model.cdb_file.path)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/medcat/cdb.py", line 411, in load
medcattrainer |     data = dill.load(f)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 313, in load
medcattrainer |     return Unpickler(file, ignore=ignore, kwds).load()
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 525, in load
medcattrainer |     obj = StockUnpickler.load(self)
medcattrainer |   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 515, in find_class
medcattrainer |     return StockUnpickler.find_class(self, module, name)
medcattrainer | AttributeError: Can't get attribute '_DefPartial' on <module 'medcat.config' from '/usr/local/lib/python3.7/site-packages/medcat/config.py'>
medcattrainer | WARNING 2023-02-15 12:00:21,982 models.py l:260:Rescheduling task api.admin.import_concepts_from_cdb for 0:04:21 later at 2023-02-15 12:04:42.982246+00:00
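
Note on the retry intervals: the four "Rescheduling task ... for ..." delays in this log (0:00:06, 0:00:21, 0:01:26, 0:04:21) are consistent with a delay of attempt**4 + 5 seconds, so the same failing task is simply retried with a growing back-off. The sketch below only reproduces that arithmetic; the function name retry_delay is hypothetical, and the formula is inferred from the log rather than taken from the task runner's code.

```python
from datetime import timedelta


def retry_delay(attempt: int) -> timedelta:
    """Delay before the next retry, assuming attempt**4 + 5 seconds.

    The formula is an assumption inferred from the delays in the log;
    it is not copied from the background task library.
    """
    return timedelta(seconds=attempt ** 4 + 5)


# Delays after the 1st, 2nd, 3rd and 4th failed attempt.
delays = [str(retry_delay(n)) for n in range(1, 5)]
print(delays)  # ['0:00:06', '0:00:21', '0:01:26', '0:04:21']
```

If this pattern holds, retrying will never succeed on its own: every attempt hits the same AttributeError when loading the CDB.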

tomolopolis commented 1 year ago

Hey - I'd recommend you upgrade to image v2.5.5.

This version has multiple fixes and improvements related to the concept search and importing capability.

In summary: importing concepts kicks off a background task, which is picked up by a polling event-loop process. That process iterates through all concepts in the provided MedCAT model's CDB and indexes the names, synonyms and CUIs in a Solr collection.

Importing a CDB removes and replaces the collection if it already exists.

The main project homepage shows whether the concept DB is correctly linked and imported, indicated by a green tick under the "concepts imported" column.
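
As a rough illustration of the import described above, here is a minimal sketch that turns a CDB-like mapping into one search document per concept and replaces any existing collection. Everything in it is an assumption for illustration: the plain dicts stand in for a loaded CDB, the field names (cui, pretty_name, synonyms) are hypothetical, and an in-memory dict stands in for Solr. It is not MedCATtrainer's actual implementation.

```python
from typing import Dict, List, Set


def build_concept_docs(
    cui2names: Dict[str, Set[str]],
    cui2preferred_name: Dict[str, str],
) -> List[dict]:
    """Build one document per concept with its CUI, preferred name and synonyms."""
    docs = []
    for cui in sorted(cui2names):
        names = sorted(cui2names[cui])
        # Fall back to the first name (or the CUI itself) if no preferred name is set.
        pretty = cui2preferred_name.get(cui, names[0] if names else cui)
        docs.append({
            "cui": cui,
            "pretty_name": pretty,
            "synonyms": [n for n in names if n != pretty],
        })
    return docs


def import_concepts(index: Dict[str, List[dict]], collection: str, docs: List[dict]) -> None:
    """Replace the whole collection, mirroring the remove-and-replace behaviour."""
    index[collection] = list(docs)


# Hypothetical data: an in-memory "index" that already holds a stale collection.
index = {"my_cdb": [{"cui": "OLD", "pretty_name": "stale", "synonyms": []}]}
docs = build_concept_docs(
    {"C1": {"diabetes mellitus", "dm"}, "C2": {"hypertension"}},
    {"C1": "diabetes mellitus"},
)
import_concepts(index, "my_cdb", docs)
print(index["my_cdb"])
```

The point of the sketch is the order of operations: the CDB has to load successfully before any document can be built, so a failure at load time leaves the collection untouched and the "concepts imported" tick absent.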

Hope that helps, Tom


tomolopolis commented 1 year ago

Actually, looking at this error, it looks like an issue with the serialisation of your model:

medcattrainer | File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 515, in find_class
medcattrainer | return StockUnpickler.find_class(self, module, name)
medcattrainer | AttributeError: Can't get attribute '_DefPartial' on <module 'medcat.config' from '/usr/local/lib/python3.7/site-packages/medcat/config.py'>

What version of MedCAT did you use to save this model?
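
To show why the saving version matters, here is a small self-contained sketch of how this kind of AttributeError can arise. Pickle-style serialisers (dill builds on pickle) store a class by module and name, not by its code, so loading fails if the installed module has no attribute with that name. The module fake_config is a hypothetical stand-in, and the stand-in class merely borrows the name _DefPartial from the traceback; this does not reproduce MedCAT's actual classes.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a library module such as medcat.config.
fake_config = types.ModuleType("fake_config")
sys.modules["fake_config"] = fake_config


class _DefPartial:
    """Stand-in for a helper class that exists only in some library versions."""


# Make the class look as though it was defined at the top level of fake_config.
_DefPartial.__module__ = "fake_config"
_DefPartial.__qualname__ = "_DefPartial"
fake_config._DefPartial = _DefPartial

# "Save the model" in an environment where the class exists.
payload = pickle.dumps(_DefPartial())

# "Load the model" in an environment where the class is missing.
del fake_config._DefPartial


def try_load(data: bytes) -> str:
    """Return 'loaded' on success, or the AttributeError text on failure."""
    try:
        pickle.loads(data)
        return "loaded"
    except AttributeError as exc:
        return f"AttributeError: {exc}"


result = try_load(payload)
print(result)
```

Under that assumption, the usual remedy is to load the model with a library version that matches the one used to save it, or to re-save the model with the version installed in the trainer image.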

LWserenic commented 2 months ago

Hello, sorry for reopening an old topic, but I have the same issue. I don't know what should happen: after pressing Import concepts, the page just refreshed and nothing happened, although a task did appear.