Context-based medical concepts extraction from the given text in python spacy

Raghu17s commented 4 years ago

Hi,

Thanks for this awesome negation tool, it's amazing.

Motivated by negspacy, I have an idea for a similar task that I would like to implement, so I would appreciate your suggestions.

I have a medical text that contains a diagnosis, past history, negations (such as no headache or fever), and the cases in which the patient should consult a doctor in the future or in an emergency.

Example of text

The patient admitted to the hospital with hypertension and chronic kidney disease. The patient had a past history of diabetes mellitus and coronary artery disease. When the patient admitted to the hospital, no symptoms of fever, giddiness, and headache found. The patient is asked to consult a doctor in case of vomiting and nausea.

The text above has a present illness (sentence 1), past history (sentence 2), negations (sentence 3), and future consultation (sentence 4). I have been using scispacy for medical concept extraction and negspacy for negations, and both are working fine.

My next task: how do I separate present illness, past history, and future consultations using NLP techniques?

I am thinking of adding "past history of", "in case of emergency", and "history of" to chunk_prefix. Is that a good move?

Can I create a copy of negspacy, add my own terms, and add it to spaCy as a separate pipeline component?

jenojp commented 4 years ago

That's a good question. If you didn't care whether they were "typical negations" or history and just wanted to negate them all, you could use your own term sets when initializing negspacy, like here: https://github.com/jenojp/negspacy#use-own-patterns-or-view-patterns-in-use

If you wanted to distinguish between the two, you could add a 2nd negspacy pipeline function. The only issue I can see is that the package is built to register a spacy extension with the name "negex". See lines 52-53 here: https://github.com/jenojp/negspacy/blob/master/negspacy/negation.py

if not Span.has_extension("negex"):
    Span.set_extension("negex", default=False, force=True)

I don't see a reason why we couldn't make that name a variable with the default of "negex" in the init starting at line 34.
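As a rough illustration of that idea, here is a minimal sketch of registering the extension under a configurable name. It is not negspacy's actual code: `register_flag` and the `"historical"` name are hypothetical, and a blank English pipeline stands in for a real model.

```python
import spacy
from spacy.tokens import Span


def register_flag(name: str = "negex") -> str:
    """Register a boolean Span extension under a configurable name.

    Hypothetical helper, not part of negspacy.
    """
    if not Span.has_extension(name):
        Span.set_extension(name, default=False)
    return name


# The default name plus a hypothetical second flag for history-type context.
negex_attr = register_flag()
history_attr = register_flag("historical")

nlp = spacy.blank("en")  # blank pipeline, no model download needed
doc = nlp("past history of diabetes mellitus")
span = doc[3:5]  # the tokens "diabetes mellitus"

# Each flag is stored independently on the same span.
span._.set(history_attr, True)
print(span.text, "---", span._.get(negex_attr), "---", span._.get(history_attr))
```

With the name as a parameter, two components could in principle write to different attributes on the same entity instead of both writing to `negex`.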

Let me know if you'd like to give that edit a try and see if it works for your use case. If so, it'd be a great new feature / pull request.

PS I should note that the different term sets are located in this file https://github.com/jenojp/negspacy/blob/master/negspacy/termsets.py

Raghu17s commented 4 years ago

Hi @jenojp

It works for "family history" and "past history" when the language is set to "en_clinical_sensitive", as those terms are already present.

As you suggested, I set termination = ['adviced to'], but it didn't work.

Here is my code.

import spacy
from negspacy.negation import Negex

# Load the scispacy model used for medical entity extraction
nlp = spacy.load("en_core_sci_sm")
# Custom term passed as a termination pattern (this is the setting that did not work)
negex = Negex(nlp, language = "en_clinical_sensitive", chunk_prefix = ["no"], termination = ['adviced to'])
nlp.add_pipe(negex)
doc = nlp('''The patient admitted to  hospital with hypertension, chronic kidney disease.  
          The patient had a past history of diabetes mellitus, coronary artery disease. He had a family history of HIV.
          When the patient admitted to the hospital, no symptoms of fever, giddiness,  headache found. 
          The patient advised to come back to the hospital in case of vomiting, nausea. Further, no signs of cataract left eye''')
# Print each entity together with its negation flag
for e in doc.ents:
    print(e.text, '---', e._.negex)

The result is:

patient --- False
hypertension --- False
chronic kidney disease --- False
patient --- False
history --- False
diabetes mellitus --- True
coronary artery disease --- True
family history --- False
HIV --- True
patient --- False
hospital --- False
symptoms --- True
fever --- True
giddiness --- True
headache --- True
patient --- False
hospital --- False
case --- False
vomiting --- False
nausea --- False
no signs --- True
cataract left eye --- True

I expected vomiting and nausea to be True here.

Could you please help me with this?

jenojp commented 4 years ago

I think I see 2 issues here. 1) You want to update the 'preceding_negations' term set, not 'termination'. See this for definitions: https://github.com/jenojp/negspacy#negex-patterns 2) I think you meant "advised to" instead of "adviced to".

negex = Negex(nlp, language = "en_clinical_sensitive", chunk_prefix = ["no"], preceding_negations = ['advised to'])

Keep in mind, though, that if you pass preceding_negations = ['advised to'] you'll overwrite the default preceding_negations. So you'll want to expand that list to include everything else you want from here: https://github.com/jenojp/negspacy/blob/master/negspacy/termsets.py
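One way to build such an expanded list is sketched below in plain Python. `DEFAULT_PRECEDING` is only a short placeholder for the real defaults in termsets.py, and `extend_terms` is a hypothetical helper, not a negspacy function.

```python
from typing import Iterable, List

# ASSUMPTION: placeholder subset; the real defaults live in negspacy/termsets.py
# and are much longer.
DEFAULT_PRECEDING = ["no", "denies", "without", "history of"]


def extend_terms(defaults: Iterable[str], extra: Iterable[str]) -> List[str]:
    """Return defaults plus extra terms, lowercased and de-duplicated, order kept."""
    merged: List[str] = []
    seen = set()
    for term in list(defaults) + list(extra):
        key = term.strip().lower()
        if key and key not in seen:
            seen.add(key)
            merged.append(key)
    return merged


# "No" duplicates an existing default once lowercased, so it is dropped.
preceding = extend_terms(DEFAULT_PRECEDING, ["advised to", "in case of", "No"])
print(preceding)
```

The merged list could then be passed as preceding_negations in the Negex(...) call shown above, so the custom terms are added to the defaults rather than replacing them.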

Raghu17s commented 4 years ago

Yes, you are correct @jenojp, new patterns will overwrite preceding_negations. Is there any way to append new patterns to the existing patterns in code, without overwriting them? Currently, I have taken a copy of preceding_negations and appended new patterns (which may not be a wise move).

This is turning out to be an interesting pipeline in spacy.

jenojp commented 4 years ago

That's a good new feature that I've wanted to implement. Basically, an option to add terms vs. replace the full set of terms.

Raghu17s commented 4 years ago

Hi @jenojp

Continuing from the same question, could you please look at this too? My example text was: The patient admitted to the hospital with hypertension and chronic kidney disease. The patient had a past history of diabetes mellitus and coronary artery disease. He had a family history of HIV. When the patient admitted to the hospital, no symptoms of fever, giddiness, and headache found.

Is it possible to print which pattern was used to negate each entity? The output would look something like this:

**Entity** --- **Status** --- **Pattern**
patient --- False ---
hypertension --- False ---
chronic kidney disease --- False ---
patient --- False ---
history --- False ---
diabetes mellitus --- True --- history of
coronary artery disease --- True --- history of
family history --- False ---
HIV --- True --- history of
patient --- False ---
hospital --- False ---
symptoms --- True --- No
fever --- True --- No
giddiness --- True --- No
headache --- True --- No

Note: I have left the pattern blank for False because those entities are not negated.

jenojp commented 4 years ago

Apologies for not responding sooner! Right now it's set up to see if any negations are present for a particular entity and once any are found, it stops evaluating and registers the spacy extension negex = True. So I think to do this, two things would have to be added: 1) register another spacy extension with negation details or something like that, 2) optionally allow all relevant negations to be saved.

I'm a touch hesitant as this would likely decrease performance so it would definitely have to be an optional feature.
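The two additions described above can be sketched in plain Python, independent of negspacy's internals. Everything here is an assumption made for illustration: `PRECEDING` is a tiny hypothetical term list, `find_triggers` is a hypothetical helper, and the matching is a simple word-level scan that ignores termination terms and pseudo-negations.

```python
from typing import List, Tuple

# Hypothetical trigger phrases; a real term set would be much larger.
PRECEDING = ["no", "history of", "advised to"]


def find_triggers(sentence: str, entity: str, collect_all: bool = False) -> Tuple[bool, List[str]]:
    """Report whether a trigger phrase precedes `entity`, and which one(s).

    With collect_all=False the scan stops at the first hit (the cheap path);
    with collect_all=True every matching phrase is recorded.
    """
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    target = entity.lower().split()
    # Find the position of the entity's first word in the sentence.
    start = next(
        (i for i in range(len(words)) if words[i:i + len(target)] == target),
        None,
    )
    if start is None:
        return False, []
    before = words[:start]
    hits: List[str] = []
    for phrase in PRECEDING:
        p = phrase.split()
        if any(before[i:i + len(p)] == p for i in range(len(before) - len(p) + 1)):
            hits.append(phrase)
            if not collect_all:
                break  # first match is enough to set the flag
    return bool(hits), hits


text = "The patient had a past history of diabetes mellitus."
for ent in ["patient", "diabetes mellitus"]:
    negated, patterns = find_triggers(text, ent)
    print(ent, "---", negated, "---", ", ".join(patterns))
```

In a spaCy component, the returned list could be stored in a second Span extension next to the boolean flag. The `collect_all` switch mirrors the performance concern: the default path still stops at the first match.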

Raghu17s commented 4 years ago

Hey, I didn't understand your approach.

Could you please explain it in detail, or share some pseudocode?

jenojp / negspacy

Context-based medical concepts extraction from the given text in python spacy #17