Mayil-AI-Sandbox / kuzudb_jan15

MIT License

xsd prefix is not recognized in Turtle file when parsing RDF literals (#2790) #26

Open vikramsubramanian opened 4 months ago

vikramsubramanian commented 4 months ago

I was debugging to double-check that the data types of RDF literals are parsed correctly. I have the following file:

@prefix kz: < .
@prefix xsd: < .

kz:Waterloo a kz:City ;
            kz:name "Waterloo" ;
            kz:population 10000.0 ;
            kz:population2 "20000.0"^^xsd:decimal ;
            kz:foundedIn "30000"^^< .

Copying this into an RDFGraph will result in the following set of triples:

---------------------------------------------------------------------------------------------------
| a.iri                          | p.iri                                           | o.val        |
---------------------------------------------------------------------------------------------------
|  |                       | Waterloo     |
---------------------------------------------------------------------------------------------------
|  |                 | 10000.000000 |
---------------------------------------------------------------------------------------------------
|  |                | 20000.0      |
---------------------------------------------------------------------------------------------------
|  |                  | 30000.000000 |
---------------------------------------------------------------------------------------------------
|  |  |              |
---------------------------------------------------------------------------------------------------

So the behavior is this:

First, we should recognize prefix namespaces in literal datatype tags as well. Second, I think we should recognize xsd in datatype tags even if the xsd prefix declaration is completely missing. I tested this in GraphDB, and it does recognize xsd even when the declaration is missing (it recognizes rdf, rdfs, and owl too).

Alternatively, if you prefer, do not recognize xsd without the prefix declaration for now, and instead open an issue for a configuration option that supports common namespaces, including this one. If you choose this, make a note so that the xsd prefix is not forgotten, and add tests for it when parsing literal data types.
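
The two options differ only in whether a table of well-known namespaces is consulted when a prefix has not been declared in the file. A minimal Python sketch of that lookup follows. It is an illustration, not Kuzu's implementation: `resolve_datatype`, `WELL_KNOWN_NAMESPACES`, and the `allow_well_known` flag are hypothetical names, and the flag stands in for the proposed configuration option.

```python
# Standard namespace IRIs for the prefixes mentioned in this issue.
WELL_KNOWN_NAMESPACES = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def resolve_datatype(tag, declared_prefixes, allow_well_known=False):
    """Expand the datatype tag that follows ^^ into a full IRI.

    tag is either an absolute IRI in angle brackets or a prefixed name
    such as xsd:decimal. Returns None when the prefix cannot be resolved.
    """
    # Absolute IRI: strip the angle brackets and use it directly.
    if tag.startswith("<") and tag.endswith(">"):
        return tag[1:-1]

    # Prefixed name: split on the first colon.
    prefix, sep, local = tag.partition(":")
    if not sep:
        return None

    # Prefixes declared in the file always take precedence.
    namespace = declared_prefixes.get(prefix)
    if namespace is None and allow_well_known:
        # Hypothetical opt-in fallback to common namespaces.
        namespace = WELL_KNOWN_NAMESPACES.get(prefix)
    return None if namespace is None else namespace + local
```

With the flag off, an undeclared `xsd:decimal` stays unresolved, so the caller can report it or keep the literal as a string; with the flag on, it resolves the same way GraphDB is described as doing above.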

mayil-ai[bot] commented 4 months ago

Summary: The Turtle file parser does not recognize the xsd prefix for RDF literals, leading to incorrect parsing of data types.

Possible Solution

Based on the provided information, the issue is the incorrect parsing of the xsd:decimal datatype in RDF literals. The relevant code is in rdf_utils.cpp, where the addRdfLiteral function is defined. The function checks the datatype of the literal and attempts to cast it to the appropriate C++ type.
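
The kind of datatype dispatch described here can be sketched in Python as follows. This is an assumption-laden illustration, not the actual addRdfLiteral code from rdf_utils.cpp: `cast_literal`, `CASTERS`, and the choice of target types are hypothetical.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical mapping from a full datatype IRI to a conversion function.
CASTERS = {
    XSD + "integer": int,
    XSD + "double": float,
    XSD + "decimal": float,
    XSD + "boolean": lambda s: {"true": True, "1": True,
                                "false": False, "0": False}[s],
}


def cast_literal(lexical, datatype_iri):
    """Convert a literal's lexical form according to its datatype IRI.

    Unknown datatypes and malformed values fall back to the original string.
    """
    caster = CASTERS.get(datatype_iri)
    if caster is None:
        return lexical
    try:
        return caster(lexical)
    except (ValueError, KeyError):
        return lexical
```

In this sketch the lookup key is the full IRI, so a tag that reaches the function still in its prefixed form (for example `xsd:decimal`) matches nothing and the value is kept as a string, which is consistent with the `20000.0` row in the reported output.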
