stanfordmlgroup / chexpert-labeler

CheXpert NLP tool to extract observations from radiology reports.
MIT License

Inexplicable negation mistake #45

Open CPBridge opened 1 year ago

CPBridge commented 1 year ago

Thanks for your work on this and for making it available! However, I have found a really bizarre edge case that I'd love to understand.

If the input CSV contains just this:

No rib fracture

Then the output CSV looks like this, as I would expect:

Reports,No Finding,Enlarged Cardiomediastinum,Cardiomegaly,Lung Lesion,Lung Opacity,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,Pleural Effusion,Pleural Other,Fracture,Support Devices
No rib fracture,1.0,,,,,,,,,,,,0.0,

If, however, I add a full stop at the end (and make no other change), the output switches to positive for fracture!

No rib fracture.

The output is then:

Reports,No Finding,Enlarged Cardiomediastinum,Cardiomegaly,Lung Lesion,Lung Opacity,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,Pleural Effusion,Pleural Other,Fracture,Support Devices
No rib fracture.,,,,,,,,,,,,,1.0,
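To make the difference between the two runs explicit, here is a minimal sketch that parses the two output CSVs shown above and lists the labels whose values changed. It only uses Python's standard `csv` module and does not call the labeler itself; the helper names `read_labels` and `diff_labels` are hypothetical and not part of chexpert-labeler.

```python
import csv
import io

# Output CSVs copied from the two runs described above.
HEADER = (
    "Reports,No Finding,Enlarged Cardiomediastinum,Cardiomegaly,Lung Lesion,"
    "Lung Opacity,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,"
    "Pleural Effusion,Pleural Other,Fracture,Support Devices"
)
WITHOUT_STOP = HEADER + "\nNo rib fracture,1.0,,,,,,,,,,,,0.0,\n"
WITH_STOP = HEADER + "\nNo rib fracture.,,,,,,,,,,,,,1.0,\n"


def read_labels(csv_text):
    """Return {label: value} for the single report row, skipping blank cells."""
    row = next(csv.DictReader(io.StringIO(csv_text)))
    return {
        label: float(value)
        for label, value in row.items()
        if label != "Reports" and value != ""
    }


def diff_labels(a, b):
    """Return {label: (value_in_a, value_in_b)} for labels whose values differ.

    A label that is blank in one output is reported as None on that side.
    """
    return {
        label: (a.get(label), b.get(label))
        for label in sorted(set(a) | set(b))
        if a.get(label) != b.get(label)
    }


changed = diff_labels(read_labels(WITHOUT_STOP), read_labels(WITH_STOP))
for label, (before, after) in changed.items():
    print(f"{label}: {before} -> {after}")
# Fracture: 0.0 -> 1.0
# No Finding: 1.0 -> None
```

So the trailing full stop changes two columns: Fracture flips from 0.0 (negative) to 1.0 (positive), and No Finding goes from 1.0 to blank.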

Any insight would be much appreciated!