SynBioDex / sboljs3

A library for the Synthetic Biology Open Language (SBOL) written in TypeScript, for JavaScript/TypeScript applications in the browser or node.js

SBOL 2->3 conversion errors #14

Open jakebeal opened 2 years ago

jakebeal commented 2 years ago

Conversion from SBOL2 to SBOL3 creates files with validation errors. For example, here are the errors from conversion of this short J23101 XML file:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:sbol="http://sbols.org/v2#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#dateTime/" xmlns:om="http://www.ontology-of-units-of-measure.org/resource/om-2/" xmlns:synbiohub="http://synbiohub.org#" xmlns:sbh="http://wiki.synbiohub.org/wiki/Terms/synbiohub#" xmlns:sybio="http://www.sybio.ncl.ac.uk#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:ncbi="http://www.ncbi.nlm.nih.gov#" xmlns:igem="http://wiki.synbiohub.org/wiki/Terms/igem#" xmlns:genbank="http://www.ncbi.nlm.nih.gov/genbank#" xmlns:gbconv="http://sbols.org/genBankConversion#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:obo="http://purl.obolibrary.org/obo/">
  <sbol:ComponentDefinition rdf:about="https://synbiohub.org/public/igem/BBa_J23101/1">
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/BBa_J23101"/>
    <sbol:displayId>BBa_J23101</sbol:displayId>
    <sbol:version>1</sbol:version>
    <prov:wasDerivedFrom rdf:resource="http://parts.igem.org/Part:BBa_J23101"/>
    <prov:wasGeneratedBy rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <dcterms:title>BBa_J23101</dcterms:title>
    <dcterms:description>constitutive promoter family member</dcterms:description>
    <dcterms:created>2006-08-03T11:00:00Z</dcterms:created>
    <dcterms:modified>2015-08-31T04:08:40Z</dcterms:modified>
    <sbh:mutableProvenance>later</sbh:mutableProvenance>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/BBa_J23101/1"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <igem:partStatus>Released HQ 2013</igem:partStatus>
    <sbh:mutableDescription>later</sbh:mutableDescription>
    <igem:discontinued>false</igem:discontinued>
    <igem:dominant>true</igem:dominant>
    <igem:experience rdf:resource="http://wiki.synbiohub.org/wiki/Terms/igem#experience/Works"/>
    <igem:group_u_list>_52_</igem:group_u_list>
    <igem:m_user_id>0</igem:m_user_id>
    <igem:owner_id>483</igem:owner_id>
    <igem:owning_group_id>95</igem:owning_group_id>
    <igem:sampleStatus>In stock</igem:sampleStatus>
    <igem:status rdf:resource="http://wiki.synbiohub.org/wiki/Terms/igem#status/Available"/>
    <sbh:bookmark>true</sbh:bookmark>
    <sbh:mutableNotes>N/A</sbh:mutableNotes>
    <sbh:star>true</sbh:star>
    <dc:creator>John Anderson</dc:creator>
    <sbol:type rdf:resource="http://www.biopax.org/release/biopax-level3.owl#DnaRegion"/>
    <sbol:role rdf:resource="http://wiki.synbiohub.org/wiki/Terms/igem#partType/Regulatory"/>
    <sbol:role rdf:resource="http://identifiers.org/so/SO:0000167"/>
    <sbol:sequence rdf:resource="https://synbiohub.org/public/igem/BBa_J23101_sequence/1"/>
  </sbol:ComponentDefinition>
  <sbol:Sequence rdf:about="https://synbiohub.org/public/igem/BBa_J23101_sequence/1">
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/BBa_J23101_sequence"/>
    <sbol:displayId>BBa_J23101_sequence</sbol:displayId>
    <sbol:version>1</sbol:version>
    <prov:wasDerivedFrom rdf:resource="http://parts.igem.org/Part:BBa_J23101"/>
    <prov:wasGeneratedBy rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/BBa_J23101_sequence/1"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <sbol:elements>tttacagctagctcagtcctaggtattatgctagc</sbol:elements>
    <sbol:encoding rdf:resource="http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html"/>
  </sbol:Sequence>
  <prov:Activity rdf:about="https://synbiohub.org/public/igem/igem2sbol/1">
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/igem2sbol"/>
    <sbol:displayId>igem2sbol</sbol:displayId>
    <sbol:version>1</sbol:version>
    <dcterms:title>iGEM to SBOL conversion</dcterms:title>
    <dcterms:description>Conversion of the iGEM parts registry to SBOL2.1</dcterms:description>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <dc:creator>Chris J. Myers</dc:creator>
    <dc:creator>James Alastair McLaughlin</dc:creator>
    <prov:endedAtTime rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2017-03-06T15:00:00.000Z</prov:endedAtTime>
  </prov:Activity>
</rdf:RDF>

https://synbiohub.org/public/igem/BBa_J23101_sequence: Too few values for property namespace. Expected 1, found 0
https://synbiohub.org/public/igem/BBa_J23101: Too few values for property namespace. Expected 1, found 0
https://synbiohub.org/public/igem/BBa_J23101_sequence: Less than 1 values on <https://synbiohub.org/public/igem/BBa_J23101_sequence>->:hasNamespace
https://synbiohub.org/public/igem/igem2sbol/1: Value does not have class :Identified
https://synbiohub.org/public/igem/BBa_J23101: Less than 1 values on <https://synbiohub.org/public/igem/BBa_J23101>->:hasNamespace
https://synbiohub.org/public/igem/BBa_J23101: Less than 1 values on <https://synbiohub.org/public/igem/BBa_J23101>->:hasNamespace
https://synbiohub.org/public/igem/BBa_J23101_sequence: Less than 1 values on <https://synbiohub.org/public/igem/BBa_J23101_sequence>->:hasNamespace
https://synbiohub.org/public/igem/BBa_J23101_sequence sbol3-10505: Sequence encoding is not in the recommended set
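
The "Too few values for property namespace" and `hasNamespace` errors above suggest that the converted top-level objects carry no namespace. One possible way to fill it in, sketched here under the assumption that the SBOL2 `persistentIdentity` ends in `/<displayId>`, is to strip that suffix. `guessNamespace` is a hypothetical helper, not an sboljs3 or sbolgraph function, and whether the namespace should be the full prefix or a shorter one is a design choice this sketch does not settle.

```typescript
// Hypothetical helper, not part of sboljs3 or sbolgraph.
// Derives a candidate SBOL3 namespace from an SBOL2 persistentIdentity by
// removing the trailing "/<displayId>" segment.
function guessNamespace(persistentIdentity: string, displayId: string): string | undefined {
  const suffix = "/" + displayId;
  if (persistentIdentity.slice(-suffix.length) !== suffix) {
    // The URI does not follow the assumed pattern, so no namespace is guessed.
    return undefined;
  }
  return persistentIdentity.slice(0, persistentIdentity.length - suffix.length);
}

// Values taken from the J23101 example above.
const componentNamespace = guessNamespace(
  "https://synbiohub.org/public/igem/BBa_J23101",
  "BBa_J23101"
);
const sequenceNamespace = guessNamespace(
  "https://synbiohub.org/public/igem/BBa_J23101_sequence",
  "BBa_J23101_sequence"
);
```

Under this assumption both objects in the example would get the same namespace, `https://synbiohub.org/public/igem`.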

jakebeal commented 2 years ago

Note: the sequence encoding issues go the other way as well, when converting from 3->2.
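
A minimal sketch of a two-way encoding remap, assuming one shared table serves both directions. The SBOL2 URI for nucleic acids is the one in the file above. The other SBOL2 URIs and the EDAM-based SBOL3 URIs are written from memory of the two specifications and should be checked against them. `ENCODING_PAIRS` and `convertEncoding` are hypothetical names, not sboljs3 API.

```typescript
// Each pair is [SBOL2 encoding URI, SBOL3 encoding URI].
// Assumed values: verify against the SBOL 2.x and SBOL 3.x specifications.
const ENCODING_PAIRS: ReadonlyArray<readonly [string, string]> = [
  // IUPAC DNA/RNA
  ["http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html", "https://identifiers.org/edam:format_1207"],
  // IUPAC protein
  ["http://www.chem.qmul.ac.uk/iupac/AminoAcid/", "https://identifiers.org/edam:format_1208"],
  // SMILES
  ["http://www.opensmiles.org/opensmiles.html", "https://identifiers.org/edam:format_1196"],
];

// Hypothetical helper: converts an encoding URI in either direction.
// Unknown URIs are returned unchanged so that nothing is silently dropped.
function convertEncoding(uri: string, direction: "2to3" | "3to2"): string {
  for (const [sbol2Uri, sbol3Uri] of ENCODING_PAIRS) {
    if (direction === "2to3" && uri === sbol2Uri) return sbol3Uri;
    if (direction === "3to2" && uri === sbol3Uri) return sbol2Uri;
  }
  return uri;
}
```

Keeping a single list of pairs means the 2->3 and 3->2 conversions cannot drift apart.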

jamesamcl commented 2 years ago

Thanks Jake! I'll check it out.

jakebeal commented 2 years ago

Another remapping that needs to be done: the ComponentDefinition BioPAX types need to be remapped to Component SBO types.
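
A rough sketch of that type remapping, assuming a plain lookup table is enough. The SBO identifiers below are recalled from the SBOL3 specification's table of Component types and should be confirmed there. `BIOPAX_TO_SBO` and `remapComponentType` are hypothetical names, not part of sboljs3.

```typescript
const BIOPAX_PREFIX = "http://www.biopax.org/release/biopax-level3.owl#";
const SBO_PREFIX = "https://identifiers.org/SBO:";

// Each pair is [SBOL2 BioPAX type, SBOL3 SBO type].
// Assumed values: verify against the SBOL3 specification.
const BIOPAX_TO_SBO: ReadonlyArray<readonly [string, string]> = [
  [BIOPAX_PREFIX + "DnaRegion", SBO_PREFIX + "0000251"],     // DNA
  [BIOPAX_PREFIX + "RnaRegion", SBO_PREFIX + "0000250"],     // RNA
  [BIOPAX_PREFIX + "Protein", SBO_PREFIX + "0000252"],       // protein
  [BIOPAX_PREFIX + "SmallMolecule", SBO_PREFIX + "0000247"], // simple chemical
  [BIOPAX_PREFIX + "Complex", SBO_PREFIX + "0000253"],       // non-covalent complex
];

// Hypothetical helper: maps an SBOL2 ComponentDefinition type to an SBOL3
// Component type. Types that are not in the table pass through unchanged.
function remapComponentType(sbol2Type: string): string {
  for (const [biopaxType, sboType] of BIOPAX_TO_SBO) {
    if (sbol2Type === biopaxType) return sboType;
  }
  return sbol2Type;
}
```

For the J23101 example, the `DnaRegion` type on the ComponentDefinition would become `https://identifiers.org/SBO:0000251` under these assumptions.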

jamesamcl commented 2 years ago

How did you validate? I wrote this conversion before there was an SBOL3 validator!

jakebeal commented 2 years ago

We have a validator in pySBOL3, much of which is bootstrapped from the sbol-shacl shapes (https://github.com/SynBioDex/sbol-shacl) generated from the SBOL3 ontology. You can likely pull in the SHACL for validation in sbolgraph as well. That takes care of all the "syntactic" issues, leaving the more complex rules still to implement.