clulab / reach

Reach Biomedical Information Extraction

Globally label protein families #183

Closed: MihaiSurdeanu closed this issue 8 years ago

MihaiSurdeanu commented 8 years ago

If a protein mention is flipped from Protein to Family, make sure all instances THROUGHOUT the paper are labeled in the same way.

MihaiSurdeanu commented 8 years ago

This has to be done in Reach, after the entity engine runs for ALL sections of the paper.
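As a rough illustration of that global pass, here is a minimal Python sketch. It is not Reach code: the `Mention` record and the `propagate_family_labels` function are hypothetical names invented for this example, and it assumes mentions can be matched across sections by their exact surface text.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Mention:
    """Hypothetical stand-in for an entity mention (not the Reach class)."""
    text: str     # surface text of the mention
    label: str    # "Protein" or "Family"
    section: str  # section of the paper the mention came from


def propagate_family_labels(mentions: List[Mention]) -> List[Mention]:
    """Run once, after entities from ALL sections have been collected.

    Assumption: two mentions refer to the same entity when their surface
    text is identical.
    """
    # Surface texts that were flipped to Family anywhere in the paper.
    family_texts = {m.text for m in mentions if m.label == "Family"}
    # Relabel every remaining Protein mention that shares one of those texts.
    return [
        replace(m, label="Family")
        if m.label == "Protein" and m.text in family_texts
        else m
        for m in mentions
    ]


mentions = [
    Mention("ERK", "Family", "abstract"),
    Mention("ERK", "Protein", "results"),
    Mention("MEK1", "Protein", "results"),
]
relabeled = propagate_family_labels(mentions)
```

The pass needs the full mention list up front, which is why it cannot run per section: a flip in the results section may have to reach back to the abstract.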

hickst commented 8 years ago

I will update the overlap spreadsheet for the 3 recent KB changes very soon and attach it here.

MihaiSurdeanu commented 8 years ago

This is based on a config file with 4 pipe-separated columns, for example: Family|Protein|family|protein

That is: NewLabel|OldLabel|context for the new label|context in which to keep the old label
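One way such a row might be read and applied, sketched in Python. Everything here is an assumption made for illustration: the names `RelabelRule`, `parse_rules`, and `choose_label` are hypothetical, skipping blank and `#` lines is a guess about the file format, and treating each context column as a single word to look for near the mention is only one possible reading of the columns.

```python
from typing import Iterable, List, NamedTuple


class RelabelRule(NamedTuple):
    """One config row: NewLabel|OldLabel|new-label context|keep-old context."""
    new_label: str
    old_label: str
    new_context: str
    keep_context: str


def parse_rules(lines: Iterable[str]) -> List[RelabelRule]:
    """Parse pipe-separated rows into rules."""
    rules = []
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {raw!r}")
        rules.append(RelabelRule(*fields))
    return rules


def choose_label(rule: RelabelRule, context: str) -> str:
    """Pick a label from the words around a mention (assumed semantics)."""
    words = context.lower().split()
    # The keep-old context takes priority in this sketch.
    if rule.keep_context.lower() in words:
        return rule.old_label
    if rule.new_context.lower() in words:
        return rule.new_label
    # No context evidence either way: leave the old label in place.
    return rule.old_label


rules = parse_rules(["Family|Protein|family|protein"])
label_a = choose_label(rules[0], "the ERK family of kinases")
label_b = choose_label(rules[0], "the ERK protein was phosphorylated")
```

Giving the keep-old context priority is a design choice of this sketch, not something stated in the thread.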

hickst commented 8 years ago

Picture of the latest KB overlap map; updated after the final replacement of homemade KBs:

[Image: overlap2]

hickst commented 8 years ago

Sent to Hans and incorporated non-conflicting families into master override KB.