renaud / neuroNER

named entity recognizer for neuronal cells, based on UIMA Ruta rules
GNU Lesser General Public License v3.0

check that layers and capitalization are correct #22

Closed stripathy closed 9 years ago

stripathy commented 9 years ago

I think I saw that Layer 5A was not being found but Layer 5a was found. Same for Layer 5 B.

Similarly with Layer V A or Layer VIa, etc.

renaud commented 9 years ago

ok, will look into this. if i recall correctly, words shorter than 4 letters are case sensitive, while longer words are not (the number can be configured...)
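
To illustrate why that rule would explain the report, here is a toy Python sketch. It is not the actual neuroNER or Ruta matching code; the function names and the `min_len` parameter are made up for illustration, and the threshold of 4 is just the value recalled above.

```python
# Toy model of the recalled rule: tokens shorter than `min_len` characters
# are compared case-sensitively, longer tokens case-insensitively.
# Hypothetical helper names; not taken from neuroNER.

def token_matches(text_token: str, dict_token: str, min_len: int = 4) -> bool:
    """Compare one token from the text with one token from the dictionary."""
    if len(dict_token) < min_len:
        return text_token == dict_token  # short token: exact case required
    return text_token.lower() == dict_token.lower()  # long token: ignore case


def phrase_matches(text: str, entry: str, min_len: int = 4) -> bool:
    """Compare whitespace-separated phrases token by token."""
    text_tokens, entry_tokens = text.split(), entry.split()
    if len(text_tokens) != len(entry_tokens):
        return False
    return all(
        token_matches(t, e, min_len) for t, e in zip(text_tokens, entry_tokens)
    )


if __name__ == "__main__":
    print(phrase_matches("Layer 5a", "layer 5a"))  # "Layer" is long, "5a" is exact
    print(phrase_matches("Layer 5A", "layer 5a"))  # "5A" is short and differs in case
```

Under this assumption, "Layer 5a" matches a dictionary entry "layer 5a" while "Layer 5A" does not, which is the behaviour described in the report.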

renaud commented 9 years ago

a quick fix is to add the corresponding cases to the OBO file or better: turn hbp_layer_ontology.obo into a robo file and add the case variants as regular expressions

this obo term

[Term]
id: HBP_LAYER:0000001
name: layer 1
synonym: "layer1"  EXACT ALTERNATE_SPELLING []
synonym: "layer i" EXACT ALTERNATE_SPELLING []
synonym: "layer I" EXACT ALTERNATE_SPELLING []

would become like this with robo

[Term]
id: HBP_LAYER:0000001
name: layer 1
rsynonym: "layer ?[1Ii]" EXACT ALTERNATE_SPELLING [] 
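
As a quick sanity check of that pattern, here is a small Python sketch. It assumes the robo matcher requires the whole mention to match the expression and that the regex syntax behaves like Python's `re` for this simple pattern; neither is confirmed in this thread. The `LAYER_5A` pattern is a hypothetical example for the reported cases, not something taken from the ontology file.

```python
import re

# Pattern copied from the rsynonym line above.
LAYER_1 = re.compile(r"layer ?[1Ii]")

# Hypothetical pattern for the cases in the report (Layer 5A, Layer 5 a, Layer V A, ...).
LAYER_5A = re.compile(r"[Ll]ayer ?(5|V) ?[aA]")


def matches(pattern: re.Pattern, mention: str) -> bool:
    """True if the whole mention matches the pattern (assumed matcher behaviour)."""
    return pattern.fullmatch(mention) is not None


if __name__ == "__main__":
    for mention in ["layer 1", "layer1", "layer i", "layer I", "layer 2"]:
        print(mention, matches(LAYER_1, mention))
    for mention in ["Layer 5A", "Layer 5a", "Layer 5 A", "Layer V A", "Layer 5B"]:
        print(mention, matches(LAYER_5A, mention))
```

Note that in plain Python the first pattern is case sensitive on the word "layer", so "Layer 1" would not match it as written; whether the robo matcher applies extra case handling is not stated here.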

renaud commented 9 years ago

so here's one proposal:

  1. copy hbp_layer_ontology.obo to hbp_layer_ontology.robo
  2. edit it to express layers with regular expressions
  3. edit your neuroner.json file from Sherlok to use that new robo file (it should be as simple as adding an r character)
  4. refresh Sherlok server accordingly

works for you?