Closed by abargnesi 8 years ago
We also need to map source annotation/namespace references (i.e. in the source document) to chosen ones in the output (e.g. identifiers.org). This will allow the translator to rewrite the references from old to new as it translates to RDF.
I'm dealing with this problem as well in #111. I should be able to use your file format directly if it contained this mapping of old to new annotation/namespace references.
Thoughts, @sanea, @juliakozlovsky?
Regarding the resource mapping file format I think we need the following:
I suggest using the YAML format since it's human readable/writable and supported in the Ruby standard library.
Example remapping OpenBEL namespaces to include identifiers.org RDF URIs:
    namespaces:
      - remap:
          from:
            prefix: "HGNC"
            url: "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
          to:
            prefix: "HGNC"
            url: "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
            rdf_uri: "http://identifiers.org/hgnc/"
      - remap:
          from:
            prefix: "EGID"
            url: "http://resource.belframework.org/belframework/20150611/namespace/entrez-gene-ids.belns"
          to:
            prefix: "EGID"
            url: "http://resource.belframework.org/belframework/20150611/namespace/entrez-gene-ids.belns"
            rdf_uri: "http://identifiers.org/ncbigene/"
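To illustrate how a translator might consume the namespace section, here is a minimal Ruby sketch that builds a lookup from each source namespace URL to its remapped target. The helper name `namespace_remaps` and the nesting of `remap`/`from`/`to` are assumptions made for illustration, not the actual bel.rb implementation.

```ruby
require 'yaml'

# Hypothetical helper (not part of bel.rb): maps each source namespace URL
# to the "to" entry it should be rewritten to.
def namespace_remaps(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  (config['namespaces'] || []).each_with_object({}) do |entry, lookup|
    remap = entry['remap']
    lookup[remap['from']['url']] = remap['to']
  end
end

# Assumed layout of the mapping file, trimmed to one namespace.
NAMESPACE_EXAMPLE = <<~YAML
  namespaces:
    - remap:
        from:
          prefix: "HGNC"
          url: "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
        to:
          prefix: "HGNC"
          url: "http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns"
          rdf_uri: "http://identifiers.org/hgnc/"
YAML

remaps = namespace_remaps(NAMESPACE_EXAMPLE)
source_url = 'http://resource.belframework.org/belframework/20150611/namespace/hgnc-human-genes.belns'
puts remaps[source_url]['rdf_uri'] # prints "http://identifiers.org/hgnc/"
```

Keying the lookup on the source URL rather than the prefix seems safer, since prefixes are document-local keywords and may differ between documents.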
Example remapping annotations:
    annotations:
      - remap:
          from:
            keyword: "Species"
            type: "url"
            domain: "http://resource.belframework.org/belframework/20150611/annotation/species-taxonomy-id.belanno"
          to:
            keyword: "Species"
            type: "url"
            domain: "http://resource.belframework.org/belframework/20150611/annotation/species-taxonomy-id.belanno"
            rdf_uri: "http://identifiers.org/taxonomy/"
      - remap:
          from:
            keyword: "TextLocation"
            type: "list"
            domain:
              - Value1
              - Value2
              - Value3
          to:
            keyword: "TextLocation"
            type: "pattern"
            domain: "Value[0-9]+"
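The second remap changes the annotation type from a list to a pattern, so it may be worth checking that every value allowed by the source is still allowed by the target. The sketch below is one hypothetical way to do that in Ruby; `annotation_value_allowed?` is an illustrative name, and anchoring the pattern so that the whole value must match is an assumption about how pattern annotations are interpreted.

```ruby
# Hypothetical check (not part of bel.rb): is +value+ accepted by the
# annotation definition hash taken from a "from" or "to" entry?
def annotation_value_allowed?(definition, value)
  case definition['type']
  when 'list'
    definition['domain'].include?(value)
  when 'pattern'
    # Assumption: the pattern must match the entire value.
    Regexp.new("\\A(?:#{definition['domain']})\\z").match?(value)
  else
    # "url" domains would require downloading the resource; not covered here.
    nil
  end
end

from = { 'keyword' => 'TextLocation', 'type' => 'list',
         'domain' => %w[Value1 Value2 Value3] }
to   = { 'keyword' => 'TextLocation', 'type' => 'pattern',
         'domain' => 'Value[0-9]+' }

# Every value in the source list should still be valid after the remap.
still_valid = from['domain'].all? { |v| annotation_value_allowed?(to, v) }
puts still_valid
```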
Both annotations and namespaces can be combined in one file, e.g.:

    annotations:
      # Annotations to remap.
    namespaces:
      # Namespaces to remap.
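A combined file could be loaded in one pass. The sketch below assumes the two top-level keys shown above; `load_resource_overrides` is a hypothetical name. Note that a section containing only comments parses as `nil` in YAML, so the sketch normalizes missing or empty sections to empty arrays.

```ruby
require 'yaml'

# Hypothetical loader (not part of bel.rb): returns both sections of a
# combined mapping file, treating absent or comment-only sections as empty.
def load_resource_overrides(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  {
    annotations: Array(config['annotations']),
    namespaces:  Array(config['namespaces'])
  }
end

SKELETON = <<~YAML
  annotations:
    # Annotations to remap.
  namespaces:
    # Namespaces to remap.
YAML

overrides = load_resource_overrides(SKELETON)
puts overrides[:annotations].length
puts overrides[:namespaces].length
```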
Thoughts, @sanea, @rumilbaybikov, @juliakozlovsky?
Work completed in #118.
Background
Annotations and namespaces are catalogs for biological entities. They are generally modeled one-to-one with life science databases (Entrez Gene, HGNC, GO, NCBI Taxonomy, etc.). When you express knowledge in BEL you will want to use standard names to increase connectedness within your BKN (BEL Knowledge Network). Annotations and namespaces are defined in the header of BEL script (or XBEL) files. A document-local keyword is provided to refer to the namespace within BEL terms. The namespace values can be retrieved by downloading the provided URL. Here is an example definition:
Or for a full example, see the small corpus.
OpenBEL is moving to RDF representations for biological entities and BEL nanopubs. With RDF we are encouraged to use well-known URIs, and for biological entities those are likely to come from identifiers.org. See #65 for more information on identifiers.org. Because annotations and namespaces are currently defined by URL, this transition is difficult.
Proposal
During a discussion (notes) with @juliakozlovsky and @sanea, we considered an intermediate solution that allows annotations and namespaces to be configured with both a URL, for OpenBEL framework compatibility, and an RDF URI.
This is a general solution to #65.
We proposed the following:
[...] http://identifiers.org/hgnc/ for HGNC).
[...] bin/bel2rdf.rb command that expects to receive a file with this format. I propose -r and --resource-override for the name.
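As a rough illustration of the proposed `-r` / `--resource-override` option, here is a sketch using Ruby's standard `optparse` library. It is not the actual `bin/bel2rdf.rb` code; the `parse_options` helper, the banner text, and the `:resource_override` key are assumptions.

```ruby
require 'optparse'

# Hypothetical option parsing; the real bin/bel2rdf.rb may differ.
def parse_options(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: bel2rdf.rb [options]'
    opts.on('-r', '--resource-override FILE',
            'YAML file that remaps annotations and namespaces') do |file|
      options[:resource_override] = file
    end
  end
  # Parse a copy so the caller's argument array is left untouched.
  parser.parse!(argv.dup)
  options
end

options = parse_options(%w[-r overrides.yml])
puts options[:resource_override] # prints "overrides.yml"
```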