awslabs / SPARQL-CDTs

Specification of an extension to SPARQL for handling literals that capture composite values (lists, maps, etc.).
https://w3id.org/awslabs/neptune/SPARQL-CDTs/latest.html
Apache License 2.0

Performance compared to other approaches #11

Open rat10 opened 3 weeks ago

rat10 commented 3 weeks ago

A paper about this proposal, published at ESWC 2024 [0], mentions work by Daga et al. [1] that evaluates the performance of five different approaches to representing lists in RDF (RDF containers, RDF collections, a design pattern, explicit numbering, numbered properties). The comparison is performed on different database systems, and the results show that performance by and large depends on the representation rather than on the software. Have you compared the performance of CDT to those other approaches on Jena (or AWS or any other system)? Could you extend the comparison in [1] with CDT?

[0] O. Hartig et al. Datatypes for Lists and Maps in RDF Literals. ESWC 2024.

[1] E. Daga, A. Merono-Penuela, and E. Motta. Sequential Linked Data: The State of Affairs. Semantic Web, 12(6):927–958, 2021.
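To make the difference in representation concrete, here is a rough sketch (not taken from either paper) that builds the triples for one of the approaches above, an RDF collection, next to a single composite literal. The helper names, the `ex:` terms, and the blank node labels are made up for illustration, and the list datatype IRI is an assumption based on the spec URL; triples are plain Python tuples rather than objects from any RDF library.

```python
# Illustrative only: triples are plain tuples, no RDF library is used.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed datatype IRI for list literals (derived from the spec URL).
CDT_LIST = "http://w3id.org/awslabs/neptune/SPARQL-CDTs/List"


def as_rdf_collection(subject, predicate, items):
    """Encode items as an rdf:first / rdf:rest chain ending in rdf:nil."""
    if not items:
        return [(subject, predicate, RDF + "nil")]
    nodes = [f"_:b{i}" for i in range(len(items))]  # hypothetical blank nodes
    triples = [(subject, predicate, nodes[0])]
    for i, item in enumerate(items):
        nxt = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", nxt))
    return triples


def as_cdt_literal(subject, predicate, items):
    """Encode items as one literal whose lexical form holds the whole list."""
    lexical = "[" + ", ".join(str(i) for i in items) + "]"
    return [(subject, predicate, f'"{lexical}"^^<{CDT_LIST}>')]


collection = as_rdf_collection("ex:s", "ex:values", [1, 2, 3])
literal = as_cdt_literal("ex:s", "ex:values", [1, 2, 3])
print(len(collection), len(literal))  # prints "7 1"
```

The triple count alone says nothing about query performance; it only shows why the two encodings are not directly comparable at the triple level.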

kasei commented 3 weeks ago

Our proposal doesn't prescribe how the CDT data is stored internally, whereas I think the work by Daga et al. restricts itself to encoding sequence-like data as RDF triples. So the two works operate at different conceptual levels. Internally, a CDT implementation could use structures similar to those evaluated by Daga et al., keep the CDT literals as literals and lazily turn them into structured data as needed, or use more optimized data structures. At this point, our work isn't concerned with how CDTs are implemented (though obviously that choice will affect performance on any given workload).
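As a minimal sketch of the "keep the literal and parse lazily" option, assuming a hypothetical `LazyListLiteral` class and a deliberately simplified lexical form (a bracketed, comma-separated list of integers only, which is far narrower than what the spec allows):

```python
class LazyListLiteral:
    """Hypothetical wrapper: stores the lexical form, parses on first access."""

    def __init__(self, lexical):
        self.lexical = lexical
        self._items = None  # nothing parsed yet

    @property
    def parsed(self):
        """True once the lexical form has been turned into a Python list."""
        return self._items is not None

    def items(self):
        """Parse on first call, then reuse the cached list."""
        if self._items is None:
            body = self.lexical.strip()
            if not (body.startswith("[") and body.endswith("]")):
                raise ValueError(f"not a list literal: {self.lexical!r}")
            inner = body[1:-1].strip()
            # Simplifying assumption: every element is an integer.
            self._items = [int(p) for p in inner.split(",")] if inner else []
        return self._items


lit = LazyListLiteral("[10, 20, 30]")
print(lit.parsed)      # False
print(lit.items()[1])  # 20
print(lit.parsed)      # True
```

A store built this way pays the parsing cost only for literals that a query actually inspects; whether that beats a triple-based or a natively structured encoding would depend on the workload.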