lasigeBioTM / MER

Minimal Named-Entity Recognizer (MER)
http://labs.fc.ul.pt/mer/

Unicode Character 'EM SPACE' causes problems on annotations indexes #15

Closed by LLCampos 7 years ago

LLCampos commented 7 years ago

When running:

bash get_entities.sh 1 T " water" ChEBI

We get:

1 T 3 8 0.378665 water unknown 1

The whitespace before "water" is not a regular space: it is the Unicode character U+2003 (EM SPACE).

We should get instead:

1 T 1 6 0.378665 water unknown 1
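
The difference between the two outputs is consistent with offsets being counted in UTF-8 bytes rather than in characters: U+2003 takes three bytes in UTF-8, so "water" starts at byte 3 but at character 1. The Python sketch below only illustrates that arithmetic. It is not MER's code, and the helper names `char_span` and `byte_span` are hypothetical.

```python
# Illustration only: not MER's implementation. Function names are hypothetical.
EM_SPACE = "\u2003"  # Unicode EM SPACE, encoded as 3 bytes (E2 80 83) in UTF-8


def char_span(text, term):
    """Return (start, end) of term in text, counted in Unicode code points."""
    start = text.find(term)
    if start < 0:
        return None
    return start, start + len(term)


def byte_span(text, term):
    """Return (start, end) of term in text, counted in UTF-8 bytes."""
    data = text.encode("utf-8")
    needle = term.encode("utf-8")
    start = data.find(needle)
    if start < 0:
        return None
    return start, start + len(needle)


text = EM_SPACE + "water"
print(char_span(text, "water"))  # (1, 6), the expected annotation indexes
print(byte_span(text, "water"))  # (3, 8), the indexes reported above
```

With a regular ASCII space in place of the EM SPACE, both functions return (1, 6), which is why the problem only shows up with multi-byte characters.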

LLCampos commented 7 years ago

Superseded by issue #18, which explains the problem better.