renaud / neuroNER

named entity recognizer for neuronal cells, based on UIMA Ruta rules
GNU Lesser General Public License v3.0

Difficult example of listed elements with trigger words which come at end of list #17

Open stripathy opened 9 years ago

stripathy commented 9 years ago

Example where the trigger words appear only once, after the last element of a list: [protein], [protein], [protein] [protein trigger] [neuron trigger]

[Screenshot from 2015-06-25 15:56:09]

The full sentence: Immunoreactivity was present in calretinin-, neuronal nitric oxide synthase-, and reelin-expressing cells, as well as in subsets of cholecystokinin- or calbindin-expressing or radiatum-retrohippocampally projecting GABAergic cells, but not in parvalbumin- and/or somatostatin-expressing interneurons.
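To make the pattern concrete, here is a minimal Python sketch of one possible heuristic: every dangling-hyphen token ("calretinin-") is paired with the trigger of the next hyphenated trigger token ("reelin-expressing"). This is not how neuroNER's Ruta rules work; the function name `distribute_trigger` and the `TRIGGERS` set are hypothetical, and the sketch keeps only the last word of multi-word names (so "neuronal nitric oxide synthase-" yields just "synthase").

```python
import re

# Hypothetical trigger words, for illustration only.
TRIGGERS = {"expressing", "positive", "containing", "immunoreactive"}

SENTENCE = (
    "Immunoreactivity was present in calretinin-, neuronal nitric oxide "
    "synthase-, and reelin-expressing cells, as well as in subsets of "
    "cholecystokinin- or calbindin-expressing or radiatum-retrohippocampally "
    "projecting GABAergic cells, but not in parvalbumin- and/or "
    "somatostatin-expressing interneurons."
)

def distribute_trigger(sentence, triggers=TRIGGERS):
    """Pair each list element with the trigger that closes its list."""
    tokens = re.findall(r"[^\s,]+", sentence)  # split on whitespace and commas
    pairs = []
    pending = []  # dangling-hyphen elements still waiting for a trigger
    for tok in tokens:
        tok = tok.rstrip(".;:")
        if tok.endswith("-"):
            # e.g. "calretinin-": remember it until a trigger shows up
            pending.append(tok[:-1])
            continue
        head, sep, tail = tok.rpartition("-")
        if sep and tail.lower() in triggers:
            # e.g. "reelin-expressing": closes the list opened by pending items
            pairs.extend((element, tail) for element in pending + [head])
            pending = []
    return pairs

for element, trigger in distribute_trigger(SENTENCE):
    print(f"{element}-{trigger}")
```

Hyphenated tokens whose second part is not a trigger (such as "radiatum-retrohippocampally") are ignored, and recovering full multi-word protein names would still need a dictionary lookup or a parser.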

stripathy commented 9 years ago

We discussed this earlier: the only way to do this would be with sentence-level parsing...

stripathy commented 9 years ago

related to #35

renaud commented 9 years ago

It mostly works, except for the individual proteins:

{
  "ProteinProp": [
    {
      "begin": 5,
      "end": 49,
      "properties": {
        "ontologyId": "NCBI_GENE:12308"
      }
    },
    {
      "begin": 18,
      "end": 72,
      "properties": {
        "ontologyId": "NCBI_GENE:18125"
      }
    },
    {
      "begin": 18,
      "end": 72,
      "properties": {
        "ontologyId": "NCBI_GENE:18126"
      }
    },
    {
      "begin": 55,
      "end": 72,
      "properties": {
        "ontologyId": "NCBI_GENE:19699"
      }
    }
  ],
  "ProteinTrigger": [
    {
      "begin": 62,
      "end": 72
    }
  ],
  "Function": [
    {
      "begin": 18,
      "end": 26
    }
  ],
  "Neuron": [
    {
      "begin": 5,
      "end": 78
    }
  ]
}
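
One way to inspect output like this is to look for ProteinProp annotations whose spans overlap each other, which may indicate that individual proteins were not delimited separately. The following is a rough Python sketch, not part of neuroNER; the helper name `overlapping_proteins` is hypothetical, and it assumes `begin`/`end` are character offsets with `end` exclusive.

```python
from itertools import combinations

# ProteinProp entries copied from the output above.
ANNOTATIONS = {
    "ProteinProp": [
        {"begin": 5, "end": 49, "properties": {"ontologyId": "NCBI_GENE:12308"}},
        {"begin": 18, "end": 72, "properties": {"ontologyId": "NCBI_GENE:18125"}},
        {"begin": 18, "end": 72, "properties": {"ontologyId": "NCBI_GENE:18126"}},
        {"begin": 55, "end": 72, "properties": {"ontologyId": "NCBI_GENE:19699"}},
    ]
}

def overlapping_proteins(annotations):
    """Return pairs of ontology IDs whose ProteinProp spans overlap."""
    proteins = annotations.get("ProteinProp", [])
    overlaps = []
    for a, b in combinations(proteins, 2):
        # Half-open intervals [begin, end) overlap when each starts before the other ends.
        if a["begin"] < b["end"] and b["begin"] < a["end"]:
            overlaps.append(
                (a["properties"]["ontologyId"], b["properties"]["ontologyId"])
            )
    return overlaps

for first, second in overlapping_proteins(ANNOTATIONS):
    print(first, "overlaps", second)
```

With the spans above, every pair overlaps except 5-49 and 55-72.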