LHNCBC / metamaplite

A near real-time named-entity recognizer
https://metamap.nlm.nih.gov/MetaMapLite.shtml

Improper concept index in MMI output #4

Open kaushikacharya opened 5 years ago

kaushikacharya commented 5 years ago

For the input file: 00000086.txt

In MetaMap, the output looked like this:

'00000086-0'|MMI|17.80|Mediastinum|C0025066|[blor]|["MEDIASTINUM"-tx-2-"mediastinum"-noun-0]|TX|50/11|A01.923.761.800.500
'00000086-73'|MMI|8.34|Lung|C0024109|[bpoc]|["LUNGS"-tx-1-"lungs"-noun-0]|TX|1/5|A04.411

But the output from MetaMapLite looks like this:

00000000.tx|MMI|2.37|Mediastinum|C0025066|[blor]|"Mediastinum"-text-0-"mediastinum"-NN-0|63/11|A01.923.761.800.500
00000000.tx|MMI|0.98|Lung|C0024109|[bpoc]|"Lungs"-text-0-"lungs"-NNS-0|103/5|A04.411

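The first field in the MetaMap output gives the filename and the character start position of the sentence (for example, '00000086-73'). In the MetaMapLite output that information is missing: the first field is 00000000.tx for both records shown. Programs that rely on this field therefore fail.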
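
To illustrate the failure, here is a minimal Python sketch of the kind of parsing a downstream program might do. The function name `parse_doc_field` is hypothetical, and the "filename-offset" reading of the first field is my interpretation of the MetaMap output, not a documented format.

```python
from typing import Optional, Tuple


def parse_doc_field(line: str) -> Optional[Tuple[str, int]]:
    """Split the first MMI field into (filename, sentence start offset).

    Assumes the first field looks like 'name-offset' (optionally quoted).
    Returns None when the field does not have that shape.
    """
    # Take the first pipe-delimited field and drop surrounding quotes.
    first = line.split("|", 1)[0].strip().strip("'")
    # Split on the last hyphen so hyphens inside the filename survive.
    name, sep, offset = first.rpartition("-")
    if not sep or not offset.isdigit():
        return None
    return name, int(offset)


# Sample lines copied from the outputs quoted in this issue.
metamap_line = (
    "'00000086-73'|MMI|8.34|Lung|C0024109|[bpoc]|"
    "[\"LUNGS\"-tx-1-\"lungs\"-noun-0]|TX|1/5|A04.411"
)
metamaplite_line = (
    "00000000.tx|MMI|0.98|Lung|C0024109|[bpoc]|"
    "\"Lungs\"-text-0-\"lungs\"-NNS-0|103/5|A04.411"
)

print(parse_doc_field(metamap_line))      # filename and offset recovered
print(parse_doc_field(metamaplite_line))  # None: no offset in the field
```

With the MetaMap line the filename and offset are recovered, while the MetaMapLite line yields nothing usable, which is the breakage described here.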

I tested with the MetaMapLite 3.6.1p1 release.