renaud / neuroNER

named entity recognizer for neuronal cells, based on UIMA Ruta rules
GNU Lesser General Public License v3.0

switch ontology resource backend to .ttl from .obo #56

Open stripathy opened 8 years ago

stripathy commented 8 years ago

@renaud mentioned that he would discuss this with Catherine, following the recommendation from @tgbugs.

Since @renaud's put together a really nice infrastructure for working with .obo files for NER, my vote is that we keep using it for our backend and use translation files as needed.

tgbugs commented 8 years ago

Having read a bit more on what @renaud has built using obo as a base, I think this is probably the best idea, for a few reasons. 1) Not all the synonyms that are needed for text mining are appropriate for ontologies; over time, when you find good synonyms in the literature, they can be migrated. 2) Much of what these ontology files are being used for is building synonym lists, and if you are parsing them yourself then who cares. The place you probably do care is in mapping to identifiers, and it is essentially trivial to parse a ttl file to get the identifier (curie), label, and synonyms for a term, write them out to obo, and then add to them in the obo. 3) Don't redo the pipeline, just migrate identifiers where needs be.
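
For illustration, the ttl-to-obo step described in point 2 could be sketched roughly as below. This is only a sketch under assumptions, not anything that exists in neuroNER: it assumes the Turtle file has already been parsed into plain (subject, predicate, object) string triples by an RDF library, that labels use `rdfs:label`, and that synonyms use the oboInOwl synonym properties. The function names and the prefix map are hypothetical.

```python
from collections import defaultdict

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OBO_IN_OWL = "http://www.geneontology.org/formats/oboInOwl#"

# Assumption: synonyms are stored with the oboInOwl properties.
SYNONYM_SCOPES = {
    OBO_IN_OWL + "hasExactSynonym": "EXACT",
    OBO_IN_OWL + "hasRelatedSynonym": "RELATED",
    OBO_IN_OWL + "hasBroadSynonym": "BROAD",
    OBO_IN_OWL + "hasNarrowSynonym": "NARROW",
}


def to_curie(iri, prefixes):
    """Shorten an IRI to prefix:local using a {prefix: namespace} map."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return prefix + ":" + iri[len(namespace):]
    return iri  # no known namespace: leave the IRI untouched


def triples_to_obo(triples, prefixes):
    """Turn (subject, predicate, object) string triples into OBO [Term] stanzas."""
    labels = {}
    synonyms = defaultdict(list)
    for subject, predicate, value in triples:
        if predicate == RDFS_LABEL:
            labels[subject] = value
        elif predicate in SYNONYM_SCOPES:
            synonyms[subject].append((value, SYNONYM_SCOPES[predicate]))

    stanzas = []
    for subject in sorted(labels):  # only terms that have a label are written
        lines = [
            "[Term]",
            "id: " + to_curie(subject, prefixes),
            "name: " + labels[subject],
        ]
        for text, scope in synonyms[subject]:
            # Escape backslashes and double quotes inside the quoted synonym.
            escaped = text.replace("\\", "\\\\").replace('"', '\\"')
            lines.append('synonym: "%s" %s []' % (escaped, scope))
        stanzas.append("\n".join(lines))
    return "\n\n".join(stanzas) + "\n"


if __name__ == "__main__":
    # Made-up example data; the namespace and identifier are placeholders.
    example_prefixes = {"EX": "http://example.org/cell/"}
    example_triples = [
        ("http://example.org/cell/0001", RDFS_LABEL, "pyramidal cell"),
        ("http://example.org/cell/0001", OBO_IN_OWL + "hasExactSynonym", "pyramidal neuron"),
    ]
    print(triples_to_obo(example_triples, example_prefixes))
```

Extra text mining synonyms could then be appended to the resulting stanzas by hand or by the existing pipeline, leaving the identifiers as the only thing that has to stay in sync with the ttl source.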

On the other hand, if we go forward using ttl we can add an explicit 'hasTextminingSynonym' annotation property or something like that, which can enable an easier transition to other synonym types if the data from the literature suggests it. (I will probably do this either way and pull the text mining synonyms in from the obo if needs be.)
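
Going the other direction, pulling synonyms out of an obo file and attaching them under a dedicated annotation property might look something like the following. Again this is a hedged sketch: the `ex:` namespace, the function names, and the choice to treat every obo synonym as a text mining synonym are assumptions made for the example, and the synonym parsing only handles the basic quoted form (obo-specific escapes other than `\"` and `\\` are not translated).

```python
import re

# Matches the quoted text at the start of an OBO synonym line.
SYNONYM_LINE = re.compile(r'^synonym:\s*"((?:[^"\\]|\\.)*)"')


def obo_synonyms(obo_text):
    """Return {term id: [synonym text, ...]} for the [Term] stanzas in obo_text."""
    synonyms = {}
    term_id = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            term_id = None
        elif in_term and line.startswith("id:"):
            term_id = line[len("id:"):].strip()
            synonyms.setdefault(term_id, [])
        elif in_term and term_id is not None:
            match = SYNONYM_LINE.match(line)
            if match:
                synonyms[term_id].append(match.group(1))
    return synonyms


def to_turtle(synonyms, prefixes, prop="ex:hasTextminingSynonym"):
    """Emit Turtle that declares the property and annotates each term with it.

    prefixes is a {prefix: namespace} map and must cover the prefix used in
    prop as well as the prefixes of the term ids.
    """
    lines = ["@prefix owl: <http://www.w3.org/2002/07/owl#> ."]
    for prefix in sorted(prefixes):
        lines.append("@prefix %s: <%s> ." % (prefix, prefixes[prefix]))
    lines.append("")
    lines.append(prop + " a owl:AnnotationProperty .")
    for term_id in sorted(synonyms):
        for text in synonyms[term_id]:
            lines.append('%s %s "%s" .' % (term_id, prop, text))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up example stanza; identifiers and namespaces are placeholders.
    example_obo = '[Term]\nid: EX:0001\nname: pyramidal cell\nsynonym: "pyr cell" RELATED []\n'
    example_prefixes = {"EX": "http://example.org/cell/", "ex": "http://example.org/props/"}
    print(to_turtle(obo_synonyms(example_obo), example_prefixes))
```

Keeping the text mining synonyms under their own property means they can later be promoted to exact or related synonyms without having to guess which ones originally came from the literature.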