OpenBEL / bel.rb

Process BEL (Biological Expression Language) with ruby.
Apache License 2.0

Configure annotation/namespace definitions #92

Closed abargnesi closed 8 years ago

abargnesi commented 8 years ago

Background

Annotations and namespaces are catalogs of biological entities. They are generally modeled one-to-one with life science databases (Entrez Gene, HGNC, GO, NCBI Taxonomy, etc.). When you express knowledge in BEL you will want to use standard names to increase connectedness within your BKN (BEL Knowledge Network). Annotations and namespaces are defined in the header of BEL script (or XBEL) files. Each definition assigns a document-local keyword that is used to refer to the annotation or namespace within BEL terms. The values can be retrieved by downloading the URL given in the definition. Here is an example definition:

DEFINE ANNOTATION Anatomy AS URL "http://host/anatomy.belanno"
DEFINE NAMESPACE EGID AS URL "http://host/entrez-gene-ids.belns"

Or for a full example, see the small corpus.
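To make the header definitions concrete, here is a minimal Ruby sketch that pulls the keyword and URL out of DEFINE lines like the ones above. The `parse_definitions` helper and its regular expression are hypothetical and written for this illustration only; they are not bel.rb's parser and only cover the `AS URL` form shown in the example.

```ruby
# Hypothetical helper, not part of bel.rb: extract annotation/namespace
# definitions of the "AS URL" form from BEL script header text.
DEFINE_PATTERN = /\ADEFINE\s+(ANNOTATION|NAMESPACE)\s+(\w+)\s+AS\s+URL\s+"([^"]+)"\s*\z/

def parse_definitions(text)
  text.each_line.map do |line|
    match = DEFINE_PATTERN.match(line.strip)
    next unless match # skip lines that are not URL definitions

    { kind: match[1].downcase.to_sym, keyword: match[2], url: match[3] }
  end.compact
end

header = <<~BEL
  DEFINE ANNOTATION Anatomy AS URL "http://host/anatomy.belanno"
  DEFINE NAMESPACE EGID AS URL "http://host/entrez-gene-ids.belns"
BEL

definitions = parse_definitions(header)
definitions.each { |d| puts "#{d[:kind]} #{d[:keyword]} -> #{d[:url]}" }
```

The keyword (`Anatomy`, `EGID`) is the document-local name; the URL is the only thing that identifies the resource itself.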

OpenBEL is moving to RDF representations for biological entities and BEL nanopubs. With RDF we are encouraged to use well-known URIs, and for biological entities those are likely to come from identifiers.org. See #65 for more information on identifiers.org. Because annotations and namespaces are currently defined by resource URLs rather than by such URIs, this transition is difficult.

Proposal

In a discussion (notes) with @juliakozlovsky and @sanea we arrived at an intermediate solution that allows annotations and namespaces to be configured with both a URL, for OpenBEL framework compatibility, and an RDF URI.

This is a general solution to #65.

We proposed the following:

abargnesi commented 8 years ago

We also need to map source annotation/namespace references (i.e. in the source document) to chosen ones in the output (e.g. identifiers.org). This will allow the translator to rewrite the references from old to new as it translates to RDF.

I'm dealing with this problem as well in #111. I should be able to use your file format directly if it contained this mapping of old to new annotation/namespace references.

Thoughts, @sanea, @juliakozlovsky?

abargnesi commented 8 years ago

Regarding the resource mapping file format I think we need the following:

I suggest using the YAML format since it's human readable/writable and supported in the Ruby standard library.

Example remapping OpenBEL namespaces to include identifiers.org RDF URIs:

namespaces:
  - remap:
      from:
        prefix:  "HGNC"
        url:     "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
      to:
        prefix:  "HGNC"
        url:     "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
        rdf_uri: "http://identifiers.org/hgnc/"
  - remap:
      from:
        prefix:  "EGID"
        url:     "http://resource.belframework.org/belframework/20150611/namespace/entrez-gene-ids.belns"
      to:
        prefix:  "EGID"
        url:     "http://resource.belframework.org/belframework/20150611/namespace/entrez-gene-ids.belns"
        rdf_uri: "http://identifiers.org/ncbigene/"
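As a rough sketch of how a translator might consume this, the Ruby below loads a namespace remap with the standard library's YAML support and indexes each entry by its source prefix and URL. The `namespace_remap_index` helper is hypothetical, and the key layout (`remap`, `from`, `to`, `rdf_uri`) is assumed to be exactly the one proposed above.

```ruby
require 'yaml'

# One entry copied from the proposed format above.
REMAP_YAML = <<~YAML
  namespaces:
    - remap:
        from:
          prefix:  "HGNC"
          url:     "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
        to:
          prefix:  "HGNC"
          url:     "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
          rdf_uri: "http://identifiers.org/hgnc/"
YAML

# Hypothetical helper: index remap targets by their source [prefix, url] pair
# so a reference found in a source document can be looked up directly.
def namespace_remap_index(config)
  (config['namespaces'] || []).each_with_object({}) do |entry, index|
    remap = entry['remap']
    from  = remap['from']
    index[[from['prefix'], from['url']]] = remap['to']
  end
end

index  = namespace_remap_index(YAML.safe_load(REMAP_YAML))
source = ['HGNC', 'http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns']
target = index.fetch(source)
puts target['rdf_uri']
```

Keying on both prefix and URL is a design choice for the sketch: the prefix is document-local, so the URL is what actually disambiguates two documents that reuse the same prefix for different resources.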

Example remapping annotations:

annotations:
  - remap:
      from:
        keyword:  "Species"
        type:     "url"
        domain:   "http://resource.belframework.org/belframework/20150611/annotation/species-taxonomy-id.belanno"
      to:
        keyword:  "Species"
        type:     "url"
        domain:   "http://resource.belframework.org/belframework/20150611/annotation/species-taxonomy-id.belanno"
        rdf_uri:  "http://identifiers.org/taxonomy/"
  - remap:
      from:
        keyword:  "TextLocation"
        type:     "list"
        domain:
          - Value1
          - Value2
          - Value3
      to:
        keyword: "TextLocation"
        type:    "pattern"
        domain:  "Value[0-9]+"
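The second entry changes the annotation type from a list to a pattern, so it is worth checking that the new domain still accepts every value the old one did. A minimal sketch, assuming a pattern domain is a regular expression that must match the whole value (that interpretation is an assumption, not something settled in this issue); `annotation_value_allowed?` is a hypothetical helper.

```ruby
# Hypothetical helper: decide whether a value is allowed by a "from"/"to"
# annotation definition in the proposed format.
def annotation_value_allowed?(definition, value)
  case definition['type']
  when 'list'
    definition['domain'].include?(value)
  when 'pattern'
    # Assumption: the pattern has to match the entire value.
    Regexp.new("\\A(?:#{definition['domain']})\\z").match?(value)
  else
    # "url" domains need the resource to be downloaded; not covered here.
    raise ArgumentError, "unsupported annotation type: #{definition['type']}"
  end
end

from = { 'keyword' => 'TextLocation', 'type' => 'list',    'domain' => %w[Value1 Value2 Value3] }
to   = { 'keyword' => 'TextLocation', 'type' => 'pattern', 'domain' => 'Value[0-9]+' }

from['domain'].each do |value|
  puts "#{value}: list=#{annotation_value_allowed?(from, value)} pattern=#{annotation_value_allowed?(to, value)}"
end
```

Note that the pattern is broader than the list: a value such as Value42 passes the new definition but was not valid under the old one.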

Both annotations and namespaces can be combined in one file, e.g.:

annotations:
  # Annotations to remap.
namespaces:
  # Namespaces to remap.
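One detail when reading such a file from Ruby: a section key that holds only comments loads as nil rather than as an empty list, so a loader should normalize both sections. A minimal sketch with a hypothetical `remap_sections` helper:

```ruby
require 'yaml'

# The combined skeleton from above, with both sections still empty.
SKELETON = <<~YAML
  annotations:
    # Annotations to remap.
  namespaces:
    # Namespaces to remap.
YAML

# Hypothetical helper: return the remap entries of both sections, treating an
# empty or missing section as an empty list.
def remap_sections(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  {
    annotations: Array(config['annotations']),
    namespaces:  Array(config['namespaces'])
  }
end

sections = remap_sections(SKELETON)
puts sections.inspect
```

With this normalization a file that defines only namespaces, only annotations, or neither can be handled by the same code path.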

Thoughts, @sanea, @rumilbaybikov, @juliakozlovsky?

abargnesi commented 8 years ago

Work completed in #118.