Closed farrell236 closed 4 years ago
I tried this example and got the following output:
1.0,0.0,,,0.0,,,,,0.0,0.0,,,,
My guess is that the negation rules are not matching, so there may be an issue with the versions of your packages. I'd make sure the versions match those specified in https://github.com/stanfordmlgroup/chexpert-labeler/blob/master/environment.yml .
If you need to dive deeper, the negation rules are matched here:
https://github.com/stanfordmlgroup/chexpert-labeler/blob/master/stages/classify.py#L53
Many thanks for the fast response @jirvin16! That did indeed fix the issue.
Hi Jeremy, I have encountered a few more oddities. The following reports produce no annotations at all, not even "No Finding":
$ cat sample.csv
"Both lungs remain clear and expanded. Heart and pulmonary XXXX are normal. No change in the large hiatus hernia."
"Hyperlucent hyperinflated lungs with flattened diaphragms. Granulomas. Small sized heart. Minimal apical capping slightly greater at the left. XXXX unremarkable."
"Normal heart. Clear lungs. Trachea midline. Scoliosis of lower thoracic spine. Degenerative changes of thoracic spine."
$ python label.py --reports_path sample.csv --output_path sample_labelled.csv
$ cat sample_labelled.csv
Reports,No Finding,Enlarged Cardiomediastinum,Cardiomegaly,Lung Lesion,Lung Opacity,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,Pleural Effusion,Pleural Other,Fracture,Support Devices
Both lungs remain clear and expanded. Heart and pulmonary XXXX are normal. No change in the large hiatus hernia.,,,,,,,,,,,,,,
Hyperlucent hyperinflated lungs with flattened diaphragms. Granulomas. Small sized heart. Minimal apical capping slightly greater at the left. XXXX unremarkable.,,,,,,,,,,,,,,
Normal heart. Clear lungs. Trachea midline. Scoliosis of lower thoracic spine. Degenerative changes of thoracic spine.,,,,,,,,,,,,,,
Are these corner cases?
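To find such rows in a larger batch, here is a minimal sketch using only the standard library. It assumes the output CSV has a header row and the report text in the first column, as in the output above. The function name `unlabeled_reports` is made up for illustration and is not part of the labeler.

```python
import csv
import io

def unlabeled_reports(csv_file):
    """Return the report texts whose label columns are all empty."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row ("Reports" plus the label names)
    empty = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        report, labels = row[0], row[1:]
        if all(cell.strip() == "" for cell in labels):
            empty.append(report)
    return empty

# In-memory example with an abbreviated, hypothetical header.
sample = io.StringIO(
    "Reports,No Finding,Cardiomegaly\n"
    "Normal heart. Clear lungs.,,\n"
    "Heart is enlarged.,,1.0\n"
)
print(unlabeled_reports(sample))
```

For a real file, pass `open("sample_labelled.csv", newline="")` instead of the `StringIO` object.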
@farrell236 Which package did you have to change for it to work? I've got exactly the same problem with it only giving 1's and no uncertainties or negatives. Is it a negbio thing?
^Have you been able to resolve this @farrell236? If not, I can take a look.
It was jpype that wasn't installed. All good now, thanks for the help!
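A quick way to catch a missing dependency like this before running the labeler is to check whether the modules can be found at all. This is only a sketch: `jpype` is the import name of JPype, and any other names should be taken from environment.yml rather than from here. The helper `missing_modules` is hypothetical.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Add further module names from environment.yml as needed.
print(missing_modules(["jpype"]))
```

An empty list means every listed module is importable. It says nothing about whether the installed versions match environment.yml.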
@jirvin16, regarding the issue in the comment that reopened this thread, I wasn't able to solve it. As it affected only 3 samples out of ~1K, I decided to omit them and assume they were corner cases. The initial issue was resolved by pinning the package versions to those in environment.yml.
Sorry, I actually think the output of the labeler is expected on those cases. No Finding is intended to capture the absence of all findings (except support devices), not just the ones in the 12 categories. See https://github.com/stanfordmlgroup/chexpert-labeler/blob/master/phrases/mention/no_finding.txt for a list of the findings it looks for.
Thanks for clarifying this @jirvin16, I guess this issue can be closed.
Hello! I could not quite understand this document: https://github.com/stanfordmlgroup/chexpert-labeler/blob/master/phrases/mention/no_finding.txt What information does it contain? Are there more than 12 categories? Thank you in advance!
These are the phrases that the labeler searches for when determining "No Finding." If any of the main 12 observations or any of the phrases in the no_finding.txt list are found (without being negated), "No Finding" is 0. Otherwise, "No Finding" is 1. So this category was intended to capture the absence of any finding, rather than just the absence of the 12 observations.
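The rule described above can be sketched roughly as follows. This is an illustration of the logic as stated in this comment, not the labeler's actual code. The function name, the `(phrase, negated)` representation, and the example phrases are all assumptions.

```python
def no_finding(observation_mentions, other_finding_mentions):
    """Return 1 if no finding is mentioned without negation, else 0.

    Each argument is an iterable of (phrase, negated) pairs:
    - observation_mentions: mentions of the 12 main observations
    - other_finding_mentions: mentions of phrases from no_finding.txt
    """
    mentions = list(observation_mentions) + list(other_finding_mentions)
    any_unnegated = any(not negated for _phrase, negated in mentions)
    return 0 if any_unnegated else 1

# Placeholder phrases, chosen only for illustration.
print(no_finding([("pneumothorax", True)], []))               # only a negated mention
print(no_finding([], [("some other finding", False)]))        # one un-negated mention
```

Under this rule, a report that mentions any un-negated phrase from the list is not "No Finding", even if none of the 12 observations apply.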
Thanks for open-sourcing the CXR labelling tool! I tried it on other CXR clinical reports and got a very bizarre result:
From here:
https://openi.nlm.nih.gov/detailedresult?img=CXR2016_IM-0665-1001
The text to be parsed is:
"The lungs are clear without evidence of focal airspace disease. There is no evidence of pneumothorax or large pleural effusion. The cardiac and mediastinal contours are within normal limits."
The output of the NLP labeler is:
The lungs are clear without evidence of focal airspace disease. There is no evidence of pneumothorax or large pleural effusion. The cardiac and mediastinal contours are within normal limits.,,1.0,,,1.0,,,,,1.0,1.0,,,
which has marked the following as positive: Enlarged Cardiomediastinum, Lung Opacity, Pneumothorax, Pleural Effusion.
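The column-to-label mapping can be double-checked by pairing the output row with the header the labeler writes. This is a small sketch using the standard library; the helper `positive_labels` is hypothetical, and the header is assumed to be the same one shown in the earlier output in this thread.

```python
import csv
import io

HEADER = ("Reports,No Finding,Enlarged Cardiomediastinum,Cardiomegaly,Lung Lesion,"
          "Lung Opacity,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,"
          "Pleural Effusion,Pleural Other,Fracture,Support Devices")

ROW = ("The lungs are clear without evidence of focal airspace disease. "
       "There is no evidence of pneumothorax or large pleural effusion. "
       "The cardiac and mediastinal contours are within normal limits."
       ",,1.0,,,1.0,,,,,1.0,1.0,,,")

def positive_labels(header_line, row_line):
    """Return the label names whose value in the row is exactly '1.0'."""
    header = next(csv.reader(io.StringIO(header_line)))
    row = next(csv.reader(io.StringIO(row_line)))
    # Skip the first column, which holds the report text.
    return [name for name, value in zip(header[1:], row[1:]) if value == "1.0"]

print(positive_labels(HEADER, ROW))
```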
I'm not from an NLP background, so I'm not sure what is causing these classes to be flagged as positive when they should be negative.
Thanks!