tdwg / abcd

Access to Biological Collection Data (ABCD)
https://abcd.tdwg.org

use of spaces in local names results in invalid URIs #11

Open baskaufs opened 5 years ago

baskaufs commented 5 years ago

In the abcd_concepts.owl file, values for vann:termGroup have local names that contain spaces. For example, line 20 has:

<vann:termGroup rdf:resource="Specimen Unit"/>

When resolved against the base namespace for the ontology (http://rs.tdwg.org/abcd/terms/), this makes the object of the triple a URI with a space in it: http://rs.tdwg.org/abcd/terms/Specimen Unit. This can be seen when the OWL file is loaded into a triplestore and queried. Here's an example:

        <result>
            <binding name='term'>
                <uri>http://rs.tdwg.org/abcd/terms/Accession</uri>
            </binding>
            <binding name='group'>
                <uri>http://rs.tdwg.org/abcd/terms/Specimen Unit</uri>
            </binding>
        </result>

Since URIs with unescaped spaces are invalid, it would be safer to replace the local names having spaces with ones using CamelCase.
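
To illustrate the suggested renaming, here is a minimal Python sketch that turns a local name containing spaces into a CamelCase one. The helper name `to_camel_case` is made up for this example, and the resulting names (e.g. `SpecimenUnit`) are only what this simple rule produces, not names the ABCD maintainers have agreed on.

```python
# Base namespace quoted in the issue.
BASE = "http://rs.tdwg.org/abcd/terms/"


def to_camel_case(local_name: str) -> str:
    """Join whitespace-separated words, upper-casing the first letter of each.

    Hypothetical helper for illustration; the rest of each word is left as-is.
    """
    words = local_name.split()
    return "".join(word[:1].upper() + word[1:] for word in words)


if __name__ == "__main__":
    for name in ["Specimen Unit", "Measurement Or Fact"]:
        # Prints the space-free URI this rule would yield for each name.
        print(BASE + to_camel_case(name))
```

Names without spaces, such as `Accession`, pass through unchanged, so the rule could be applied to every local name without special-casing.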

I didn't check whether this happens elsewhere, so there may be affected values other than those of vann:termGroup. That's just where I noticed it.
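
One way to look for other occurrences is to scan the RDF/XML for `rdf:resource` and `rdf:about` values that contain whitespace. The sketch below uses only the Python standard library. The function name `find_spaced_references` and the `SAMPLE` fragment are invented for illustration (the fragment is not copied from abcd_concepts.owl), and the check covers whitespace only, not every way a URI can be invalid.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Attributes whose values end up as URIs in the parsed graph.
URI_ATTRS = [f"{{{RDF_NS}}}resource", f"{{{RDF_NS}}}about"]


def find_spaced_references(rdf_xml: str):
    """Return (element tag, attribute, value) for each URI attribute containing whitespace."""
    hits = []
    root = ET.fromstring(rdf_xml)
    for elem in root.iter():
        for attr in URI_ATTRS:
            value = elem.get(attr)
            if value is not None and any(ch.isspace() for ch in value):
                hits.append((elem.tag, attr, value))
    return hits


# Made-up fragment shaped like the line quoted in the issue (assumption).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:vann="http://purl.org/vocab/vann/"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="Accession">
    <vann:termGroup rdf:resource="Specimen Unit"/>
  </owl:Class>
</rdf:RDF>"""

if __name__ == "__main__":
    for tag, attr, value in find_spaced_references(SAMPLE):
        print(f"{tag} {attr} = {value!r}")
```

To run it against the real file, the text of abcd_concepts.owl could be read and passed to the function in place of `SAMPLE`.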

SArndt-TIB commented 1 week ago

rdflib 7.0.0 reports these, including others besides Specimen Unit:

WARNING:rdflib.term:http://rs.tdwg.org/abcd/terms/Data Set does not look like a valid URI, trying to serialize this will break.
WARNING:rdflib.term:http://rs.tdwg.org/abcd/terms/Measurement Or Fact does not look like a valid URI, trying to serialize this will break.
WARNING:rdflib.term:http://rs.tdwg.org/abcd/terms/Multimedia Object does not look like a valid URI, trying to serialize this will break.
WARNING:rdflib.term:http://rs.tdwg.org/abcd/terms/Specimen Unit does not look like a valid URI, trying to serialize this will break.
WARNING:rdflib.term:http://rs.tdwg.org/abcd/terms/Unit Type Classes does not look like a valid URI, trying to serialize this will break.