RDFLib / rdflib

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
https://rdflib.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Incorrect JSON-LD serialization of lists starting with int/double zero literal #2009

Open svenschneider opened 2 years ago

svenschneider commented 2 years ago

If a list that starts with a zero literal ("0"^^xsd:integer or "0"^^xsd:double) is serialized to JSON-LD, the result seems to be incorrect. The following example demonstrates this for two lists: [1, 0] and [0, 1].

import rdflib
from rdflib.namespace import RDF

EX = rdflib.Namespace("http://example.org/")

zero = rdflib.Literal(0.0)
one = rdflib.Literal(1.0)

g = rdflib.Graph()

bn1 = rdflib.BNode()
c1 = rdflib.collection.Collection(g, bn1, [one, zero])
g.add((EX.s, EX.p1, bn1))

bn2 = rdflib.BNode()
c2 = rdflib.collection.Collection(g, bn2, [zero, one])
g.add((EX.s, EX.p2, bn2))

print(g.serialize(format="json-ld", indent=4))

Running this example on the master branch actually produces the following output:

[
    {
        "@id": "http://example.org/s",
        "http://example.org/p1": [
            {
                "@list": [
                    {
                        "@value": 1.0
                    },
                    {
                        "@value": 0.0
                    }
                ]
            }
        ],
        "http://example.org/p2": [
            {
                "@id": "_:N0952ac52472941a59065bc0d92837b0e"
            }
        ]
    },
    {
        "@id": "_:N0952ac52472941a59065bc0d92837b0e",
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#first": [
            {
                "@value": 0.0
            }
        ],
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#rest": [
            {
                "@list": [
                    {
                        "@value": 1.0
                    }
                ]
            }
        ]
    }
]

The result I would expect here, however, is:

[
    {
        "@id": "http://example.org/s",
        "http://example.org/p1": [
            {
                "@list": [
                    {
                        "@value": 1.0
                    },
                    {
                        "@value": 0.0
                    }
                ]
            }
        ],
        "http://example.org/p2": [
            {
                "@list": [
                    {
                        "@value": 0.0
                    },
                    {
                        "@value": 1.0
                    }
                ]
            }
        ]
    }
]

The problem seems to be caused by the second condition on this line. Changing

if l_ != RDF.nil and not graph.value(l_, RDF.first):

to

if l_ != RDF.nil and RDF.first not in graph.predicates(l_):

solves this issue for me, but I'm not sure whether it breaks anything else. If desired, I can create a pull request with that change.

Please let me know if I can help with any further information or input.

Thanks and regards, Sven

svenschneider commented 2 years ago

To add to my previous point, there seems to be one more problematic case on this line. As with the previous suggestion, that line should probably be changed from

containers = [LIST, None] if graph.value(o, RDF.first) else [None]

to

containers = [LIST, None] if RDF.first in graph.predicates(o) else [None]

devonsparks commented 9 months ago

I just ran into the same problem in a different context using rdflib==7.0.0, and I get the same results as @svenschneider with the original example. What additional regression testing is recommended before the proposed change could be merged?