NLPatVCU / medaCy

:hospital: Medical Text Mining and Information Extraction with spaCy
GNU General Public License v3.0

Request for information about models and annotation formats #203

Open balachander1964 opened 3 years ago

balachander1964 commented 3 years ago

Hi, I would appreciate it if you could share the links to download the models, along with details of the data annotation format.

swfarnsworth commented 3 years ago

MedaCy reads files in the BRAT format.
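For readers unfamiliar with BRAT, each document is a plain-text `.txt` file paired with a standoff `.ann` file of tab-separated annotation lines. The following is a minimal sketch of reading such a file, not medaCy's own loader: `parse_brat`, `Entity`, `Relation`, the sample sentence, and the labels (`Drug`, `Dosage`, `Frequency`, `Dosage-Drug`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    ann_id: str   # e.g. "T1"
    label: str    # entity type, e.g. "Drug" (hypothetical label)
    spans: list   # list of (start, end) character offsets into the .txt file
    text: str     # the covered text as recorded in the .ann file


@dataclass
class Relation:
    ann_id: str   # e.g. "R1"
    label: str    # relation type (hypothetical label)
    arg1: str     # ID of the first entity
    arg2: str     # ID of the second entity


def parse_brat(ann_text):
    """Parse entity (T) and relation (R) lines of a BRAT standoff file."""
    entities, relations = [], []
    for line in ann_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T"):
            # "Label start end" or, for discontinuous spans, "Label s1 e1;s2 e2"
            label, _, offsets = fields[1].partition(" ")
            spans = [tuple(int(n) for n in frag.split())
                     for frag in offsets.split(";")]
            entities.append(Entity(ann_id, label, spans, fields[2]))
        elif ann_id.startswith("R"):
            # "Label Arg1:T2 Arg2:T1"
            label, arg1, arg2 = fields[1].split()
            relations.append(Relation(ann_id, label,
                                      arg1.split(":")[1], arg2.split(":")[1]))
        # Other annotation kinds (events, attributes, notes) are skipped here.
    return entities, relations


# Made-up example document and its annotations.
SAMPLE_TXT = "Patient was given aspirin 81 mg daily."
SAMPLE_ANN = "\n".join([
    "T1\tDrug 18 25\taspirin",
    "T2\tDosage 26 31\t81 mg",
    "T3\tFrequency 32 37\tdaily",
    "R1\tDosage-Drug Arg1:T2 Arg2:T1",
])

if __name__ == "__main__":
    entities, relations = parse_brat(SAMPLE_ANN)
    for ent in entities:
        start, end = ent.spans[0]
        print(ent.ann_id, ent.label, repr(SAMPLE_TXT[start:end]))
    for rel in relations:
        print(rel.ann_id, rel.label, rel.arg1, "->", rel.arg2)
```

The offsets are character positions in the paired `.txt` file, so slicing the text with each span should reproduce the annotated string; that is a quick sanity check before handing a dataset to any tool.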

Only one model is currently available online, for clinical notes. However, we do have other datasets available with different entity types. Which types of entities do you want to be able to identify?

balachander1964 commented 3 years ago

Hi Steele,

Thank you for responding to my query. First, I wish you and your team a very happy and prosperous New Year. We are trying to extract different types of entities and the relationships among them. Those entities include:

- Drug, dosage, duration of therapy, frequency, and mode of administration
- Diseases, signs and symptoms, conditions, etc.
- Severity and grading information
- Affected part of the body
- Date and time information: when the patient first saw the condition and when he or she was treated

It would be great if the model could help us extract at least one or two items from this list. I look forward to your reply. Thank you.

Bala


From: Steele Farnsworth Sent: Friday, December 25, 2020 11:32 PM To: NLPatVCU/medaCy Cc: balachander1964; Author Subject: Re: [NLPatVCU/medaCy] Request for information about models and annotation formats (#203)


swfarnsworth commented 3 years ago

Bala,

Thank you for the well wishes. I wish you and those close to you the same.

We have a dataset with most of the entity types you specified; however, I am not sure that we have one that can identify severity and affected areas. A model trained on it is given here, though its API was designed before we transitioned medaCy to a primarily command-line application. This file is the actual model; if you download it, you should be able to use it with the predict functionality of medaCy's command-line interface, using ClinicalPipeline as the pipeline option.

This particular model uses a conditional random field. We have had more success with BiLSTM and BERT models, though these have better performance when a GPU is available.

Please let us know if we can be of further assistance.

Steele

balachander1964 commented 3 years ago

Thank you Steele.


From: Steele Farnsworth Sent: Monday, December 28, 2020 10:43 AM To: NLPatVCU/medaCy Cc: balachander1964; Author Subject: Re: [NLPatVCU/medaCy] Request for information about models and annotation formats (#203)
