Rothamsted / knetbuilder

KnetBuilder data integration platform for building knowledge graphs. Previously known as ondex.
https://knetminer.com
MIT License

Remove GO: prefix from accessions #7

Closed KeywanHP closed 7 years ago

KeywanHP commented 7 years ago

GO accessions need to be modeled in a more consistent way. The new OWL ontology parser creates accessions without the GO or TO prefix. I suggest implementing the same behavior in the UniProt, GAF and other parsers that create GO annotations.

AjitPS commented 7 years ago

@KeywanHP Is this still a pending issue, or has the OWL parser been modified to match this?

marco-brandizi commented 7 years ago

I still haven't changed the OWL parser. I'm going to add an option to decide which prefix to add to the parsed accessions.

marco-brandizi commented 7 years ago

I've added the feature mentioned above. See the examples at https://goo.gl/MoiVMf (look for property name = "accessionsMappers").

You can add a custom prefix to the accession mapper this way:

...
  <bean id="idChangedPrefxAccMapper" class="net.sourceforge.ondex.parser.owl.OWLAccessionsMapper">
    <property name="propertyIri" value="#{ns.iri ( 'oboInOwl:id' )}" />
    <property name="dataSourcePrototype" ref="goDataSource" />
    <property name="dataSourcePrefix" value="GO:" />
    <!-- After removing 'GO:' and keeping the numerical part of the accession, add 'GenOnt_' -->
    <property name="addedPrefix" value="GenOnt_" />
  </bean>
...

So, in this example, GO:0000003 becomes GenOnt_0000003. If you don't set addedPrefix (or set it to ''), the final result will be the bare 0000003.

If you set addedPrefix to the same value as dataSourcePrefix (eg, GO: in both cases), that prefix is used to identify the source value of interest and is then retained in the ONDEX accession (ie, GO:0000003 in go.owl stays GO:0000003 in the ONDEX accession).
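The three cases described above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the actual Java code in OWLAccessionsMapper: the function name `map_accession` and its parameters are hypothetical, and returning `None` for values that lack the source prefix is an assumption.

```python
from typing import Optional


def map_accession(raw: str, data_source_prefix: str, added_prefix: str = "") -> Optional[str]:
    """Illustrative stand-in for the prefix handling, not the real mapper."""
    # Assumption: values without the source prefix are not of interest and are skipped.
    if not raw.startswith(data_source_prefix):
        return None
    # Drop the source prefix, keep the rest of the accession.
    local_part = raw[len(data_source_prefix):]
    # Prepend the replacement prefix (empty by default).
    return added_prefix + local_part


print(map_accession("GO:0000003", "GO:", "GenOnt_"))  # custom prefix
print(map_accession("GO:0000003", "GO:"))             # no added prefix
print(map_accession("GO:0000003", "GO:", "GO:"))      # same prefix on both sides
```

The three calls correspond to the custom prefix, the unset addedPrefix, and the case where addedPrefix equals dataSourcePrefix.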

AjitPS commented 7 years ago

@KeywanHP I have changed the UniProt parser to omit the "GO:" prefix and can do the same for the GAF and GenericOBO parsers. However, the GO URL then breaks: without the prefix, http://www.ebi.ac.uk/QuickGO/GTerm?id=0022625 doesn't work, whereas http://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0022625 does. So we'd need to change the code that builds these URLs in the Ondex config, KnetMiner Gene View, Evidence View, KnetMaps, etc.
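If the prefix were dropped from the accessions, the code that builds these URLs would have to put it back. A minimal Python sketch of that idea follows, using the QuickGO URL pattern quoted above. The helper `quickgo_url` and the constant `QUICKGO_TERM_URL` are hypothetical names for illustration only and do not exist in Ondex or KnetMiner.

```python
# URL pattern taken from the example above; the names below are hypothetical.
QUICKGO_TERM_URL = "http://www.ebi.ac.uk/QuickGO/GTerm?id="


def quickgo_url(accession: str, prefix: str = "GO:") -> str:
    """Build a QuickGO term URL, restoring the prefix if the accession lacks it."""
    # Accept both "0022625" and "GO:0022625" so callers need not care
    # which form the parser stored.
    term_id = accession if accession.startswith(prefix) else prefix + accession
    return QUICKGO_TERM_URL + term_id


print(quickgo_url("0022625"))
print(quickgo_url("GO:0022625"))
```

Both calls yield the same prefixed URL, so a helper like this would tolerate either accession style.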

AjitPS commented 7 years ago

@marco-brandizi as you now have the prefix option in the OWL parser, should I leave the other parsers (UniProt, GAF, GenericOBO) with the prefix, as they are, to avoid making wholesale changes in Ondex and KnetMiner? @KeywanHP, it would be good to know what you think as well.

marco-brandizi commented 7 years ago

@AjitPS my impression is that it's quicker to use the option in the OWL parser, since there are fewer changes to make. Technically, both possibilities are available now.

AjitPS commented 7 years ago

Thanks @marco-brandizi, it makes sense not to change the other parsers then. @Monika-Mistry

marco-brandizi commented 7 years ago

@AjitPS, can we close this ticket? Please feel free to close it if so.

AjitPS commented 7 years ago

Yes, it's fixed now.