monarch-initiative / vertebrate-breed-ontology

https://monarch-initiative.github.io/vertebrate-breed-ontology/

Invalid URI with a pipe symbol causes Rust RDF parsers to fail on VBO #51

Closed cmungall closed 1 year ago

cmungall commented 2 years ago

The class http://purl.obolibrary.org/obo/VBO_0100150 has a terms:source with an invalid URI: two URLs joined by a pipe symbol in a single rdf:resource value.

curl -L -s http://purl.obolibrary.org/obo/vbo.owl | grep -A5 -B5 -n '\|'
1171651-        <terms:contributor rdf:resource="https://orcid.org/0000-0002-1628-7726"/>
1171652-        <terms:contributor rdf:resource="https://orcid.org/0000-0002-4142-7153"/>
1171653-        <terms:contributor rdf:resource="https://orcid.org/0000-0002-5002-8648"/>
1171654-        <terms:contributor rdf:resource="https://orcid.org/0000-0002-5520-6597"/>
1171655-        <terms:contributor rdf:resource="https://orcid.org/0000-0002-9178-3965"/>
1171656:        <terms:source rdf:resource="https://cfa.org/breeds/|https://wcf.de/en/wcf-ems-code/"/>
1171657-        <terms:source rdf:resource="https://en.wikipedia.org/wiki/List_of_cat_breeds#Breeds"/>
1171658-        <terms:source rdf:resource="https://www.gccfcats.org/getting-a-cat/choosing/cat-breeds/"/>
1171659-        <terms:source rdf:resource="https://www.rareexoticfelineregistry.com/breed-recognition/"/>
1171660-        <terms:source rdf:resource="https://www.tica.org/breeds/browse-all-breeds"/>
1171661-        <terms:source rdf:resource="https://www.worldcatcongress.org/wp/cat_breed_comp_laper.php"/>

This causes rdftab to panic, because its Rust RDF/XML parser rejects the pipe as an invalid IRI code point:

✗ rdftab db/vbo.db.tmp < db/vbo.owl
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: RdfXmlError { kind: InvalidIri(IriParseError { kind: InvalidIriCodePoint('|') }) }', src/main.rs:57:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Splitting the value into two separate terms:source annotations, one per URL, resolves the issue.
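
For anyone who wants to catch this before it reaches a strict parser, here is a minimal sketch of a pre-check in Python. It is not part of rdftab or the VBO build; the function names (find_invalid_resources, split_piped_source) and the forbidden-character set are assumptions made for illustration, and the regex only looks at double-quoted rdf:resource attributes, so it is not a full IRI validator. The sample line is the one from the grep output above.

```python
import re

# Printable ASCII characters that may not appear unescaped in an IRI
# (assumed subset for this sketch; control characters are not checked).
FORBIDDEN = set('<>{}|\\^` ')

# Hypothetical helper pattern: captures the value of a double-quoted rdf:resource.
RESOURCE_ATTR = re.compile(r'rdf:resource="([^"]*)"')


def find_invalid_resources(rdfxml_text):
    """Return (line_number, iri, bad_chars) for each suspicious rdf:resource value."""
    hits = []
    for lineno, line in enumerate(rdfxml_text.splitlines(), start=1):
        for iri in RESOURCE_ATTR.findall(line):
            bad = sorted(set(iri) & FORBIDDEN)
            if bad:
                hits.append((lineno, iri, bad))
    return hits


def split_piped_source(line):
    """Turn one element with pipe-joined IRIs into one element per IRI."""
    match = RESOURCE_ATTR.search(line)
    if match is None or "|" not in match.group(1):
        return [line]
    prefix, suffix = line[:match.start(1)], line[match.end(1):]
    return [prefix + part + suffix for part in match.group(1).split("|") if part]


# The offending annotation quoted in this issue, plus one valid neighbour.
sample = (
    '        <terms:contributor rdf:resource="https://orcid.org/0000-0002-9178-3965"/>\n'
    '        <terms:source rdf:resource="https://cfa.org/breeds/|https://wcf.de/en/wcf-ems-code/"/>\n'
)

hits = find_invalid_resources(sample)
fixed = split_piped_source(sample.splitlines()[1])

for lineno, iri, bad in hits:
    print(f"line {lineno}: {iri!r} contains {bad}")
for line in fixed:
    print(line)
```

Running something like this over vbo.owl should flag the concatenated terms:source value, and the two lines it prints are the split annotations described above. A check like this could sit in CI ahead of rdftab, but that is a suggestion only.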

sabrinatoro commented 1 year ago

One source was concatenated instead of being entered in two individual fields.