kg-construct / rml-core

RML-Core: Main features for RDF generation with RML
https://w3id.org/rml/core/spec
Creative Commons Attribution 4.0 International

Section 6.6.1 Automatically deriving datatypes is underspecified #91

Open chrdebru opened 6 months ago

chrdebru commented 6 months ago

Section "6.6.1 Automatically deriving datatypes" is underspecified. The test cases assume that all values are string literals. The spec does mention automatically deriving datatypes for SQL, but without the "conversion tables" that R2RML specifies.

Proposal 1:

Proposal 2:

Proposal 2 would then allow 6.6.1 to be rewritten as:

"rml-core does not support the automatic derivation of data types and mappings should explicitly include data type mappings if one wishes to generate literals other than xsd:string. The generation of derived data types is supported and specified by the rml-io specification."
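To make the proposed wording concrete, here is a minimal sketch of what "no automatic derivation" could look like when a literal is generated. The function name `make_literal` and its signature are hypothetical and are not taken from rml-core or any RML processor; the only facts relied on are the N-Triples literal syntax and that a literal without a datatype tag is an xsd:string in RDF 1.1.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def make_literal(value, datatype=None):
    """Serialize a value as an N-Triples literal (hypothetical helper).

    Nothing is derived from the Python type of `value`: unless the mapping
    supplies an explicit datatype IRI, the result is a plain literal, which
    RDF 1.1 treats as xsd:string.
    """
    # Escape the characters that would break the quoted lexical form.
    lexical = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    if datatype is None or datatype == XSD + "string":
        return f'"{lexical}"'
    return f'"{lexical}"^^<{datatype}>'


# The source value is the integer 42 in both calls; only the second call
# carries an explicit datatype, so only that one yields a typed literal.
untyped = make_literal(42)
typed = make_literal(42, XSD + "integer")
```

Under this reading, a mapping that wants an xsd:integer has to say so itself; the integer in the source data is not enough.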

pmaria commented 6 months ago

Strong preference for proposal 2. My preference would be to introduce separate notes for each reference formulation wherein these details can be described.

chrdebru commented 6 months ago

Then that means that some test cases need to be adapted (e.g., some JSON values have integer values that should be transformed as such). And somebody creating those notes, of course.
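For illustration, a per-reference-formulation note for JSON might describe a derivation along these lines. This is only a sketch under assumptions: the function `derive_json_datatype` is hypothetical, and the particular choice of XSD datatypes (integer, double, boolean) is an assumption made for this example, not something the thread or the spec has settled.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"


def derive_json_datatype(value):
    """Return an XSD datatype IRI for a parsed JSON value, or None.

    The mapping below is an assumed one, chosen for illustration only.
    """
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return XSD + "boolean"
    if isinstance(value, int):
        return XSD + "integer"
    if isinstance(value, float):
        return XSD + "double"
    if isinstance(value, str):
        return XSD + "string"
    # null, objects and arrays get no literal datatype in this sketch.
    return None


record = json.loads('{"name": "Alice", "age": 30, "height": 1.68, "active": true}')
derived = {key: derive_json_datatype(val) for key, val in record.items()}
```

With such a derivation in place, a test case whose JSON input contains `30` would expect an integer literal rather than the string "30", which is why the existing test cases would need to be adapted.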

dachafra commented 6 months ago

"Strong preference for proposal 2. My preference would be to introduce separate notes for each reference formulation wherein these details can be described."

+1

DylanVanAssche commented 6 months ago

+1 for proposal 2

bjdmeest commented 6 months ago

+1 for proposal 2, but in line with the respective specs, JSON has the following primitive types:

For XML: why not take over the datatype as specified in the XML (sure, it'll be XSD in most cases, but all other cases should also be covered, no?)

chrdebru commented 6 months ago

Without a schema, the datatype for an XML value would be a string; with a schema that can be looked up, it would be one of the schema's datatypes. XML data types only "exist" in the schema. XML DTDs, another XML schema language, only have character data, which are strings. The problem, however, is that you then have two cases:

I believe core should support the basic case, and IO could handle both.
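A rough sketch of the distinction drawn above between XML with and without a schema, assuming a hypothetical function `derive_xml_datatype` and a hypothetical, already-resolved `schema_types` dictionary that stands in for a real schema lookup (no actual XSD parsing is done here):

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema#"


def derive_xml_datatype(element, schema_types=None):
    """Return a datatype IRI for an XML element (hypothetical helper).

    schema_types is an assumed mapping from element name to datatype IRI,
    standing in for whatever a real schema lookup would produce.
    """
    if schema_types is None:
        # No schema available: the content is character data, so a string.
        return XSD + "string"
    # Schema available: use its declared type, else fall back to string.
    return schema_types.get(element.tag, XSD + "string")


doc = ET.fromstring("<person><name>Alice</name><age>30</age></person>")
age = doc.find("age")

without_schema = derive_xml_datatype(age)
with_schema = derive_xml_datatype(age, {"age": XSD + "integer"})
```

The same `<age>30</age>` element yields xsd:string in the first call and xsd:integer in the second, which is the kind of split that would let core cover only the schema-less case.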

dachafra commented 1 month ago

I feel this issue is already being addressed in the rml-io-registry repository, right? Or do we want a default/basic behaviour in the rml-core? @pmaria