RDFLib / rdflib

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
https://rdflib.readthedocs.org
BSD 3-Clause "New" or "Revised" License

nquads and trix should serialize non-context aware stores also. #1892

Open aucampia opened 2 years ago

aucampia commented 2 years ago

While writing some tests I noticed:

================================================================================= FAILURES ==================================================================================
____________________________________________________________ test_serialize_parse[nquads-TRIPLE-BINARY_IO-None] _____________________________________________________________
Traceback (most recent call last):
  File "/home/iwana/sw/d/github.com/iafork/rdflib.cleanish/test/test_serializers/test_serializer.py", line 383, in test_serialize_parse
    serialize_result = graph.serialize(
  File "/home/iwana/sw/d/github.com/iafork/rdflib.cleanish/rdflib/graph.py", line 1127, in serialize
    serializer = plugin.get(format, Serializer)(self)
  File "/home/iwana/sw/d/github.com/iafork/rdflib.cleanish/rdflib/plugins/serializers/nquads.py", line 15, in __init__
    raise Exception(
Exception: NQuads serialization only makes sense for context-aware stores!
========================================================================== short test summary info ==========================================================================
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-BINARY_IO-None] - Exception: NQuads serialization only makes sense for context-aware s...
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
======================================================================= 1 failed, 10 xfailed in 0.13s =======================================================================

I don't see why this should be the case. The nquads serializer should just encode everything in the default context, which will result in output that looks like ntriples.
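
To illustrate the point, here is a minimal sketch of why default-context output looks like ntriples: in the N-Quads grammar the graph label is optional, so a statement without one has the same shape as an N-Triples statement. `format_statement` is a hypothetical helper written for this example, not rdflib's API, and it assumes the terms are already rendered in N-Triples form.

```python
from typing import Optional, Tuple

Triple = Tuple[str, str, str]


def format_statement(triple: Triple, graph_label: Optional[str] = None) -> str:
    """Format one N-Quads line (hypothetical helper, not part of rdflib).

    Terms are assumed to be pre-rendered, e.g. "<urn:example:tarek>".
    """
    terms = list(triple)
    if graph_label is not None:
        # Only named graphs contribute a fourth term.
        terms.append(graph_label)
    return " ".join(terms) + " .\n"


triple = ("<urn:example:tarek>", "<urn:example:likes>", "<urn:example:pizza>")

# Default context: three terms, indistinguishable from an N-Triples line.
default_line = format_statement(triple)

# Named graph: the graph label is appended as a fourth term.
named_line = format_statement(triple, "<urn:example:g1>")
```

Under this reading, a serializer handed a plain graph would only ever take the three-term path.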

aucampia commented 2 years ago
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-None-None] - Exception: NQuads serialization only makes sense for context-aware stores!
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-None-utf-8] - Exception: NQuads serialization only makes sense for context-aware stores!
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-PATH-None] - Exception: NQuads serialization only makes sense for context-aware stores!
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-PATH-utf-8] - Exception: NQuads serialization only makes sense for context-aware stores!
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-STR_PATH-None] - Exception: NQuads serialization only makes sense for context-aware st...
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-STR_PATH-utf-8] - Exception: NQuads serialization only makes sense for context-aware s...
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-PURE_PATH-None] - Exception: NQuads serialization only makes sense for context-aware s...
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-PURE_PATH-utf-8] - Exception: NQuads serialization only makes sense for context-aware ...
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-BINARY_IO-None] - Exception: NQuads serialization only makes sense for context-aware s...
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[nquads-TRIPLE-BINARY_IO-utf-8] - Exception: NQuads serialization only makes sense for context-aware ...
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-None-None] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-None-utf-8] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-PATH-None] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-PATH-utf-8] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-STR_PATH-None] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-STR_PATH-utf-8] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-PURE_PATH-None] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-PURE_PATH-utf-8] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-BINARY_IO-None] - Exception: TriX serialization only makes sense for context-aware stores
FAILED test/test_serializers/test_serializer.py::test_serialize_parse[trix-TRIPLE-BINARY_IO-utf-8] - Exception: TriX serialization only makes sense for context-aware stores
ghost commented 2 years ago

For nquads, it's a consequence of the implementation, which loops over the store's contexts (https://github.com/RDFLib/rdflib/blob/eba13739e1d24b3e52697a1ec9545a361116951f/rdflib/plugins/serializers/nquads.py#L37):

        for context in self.store.contexts():
            for triple in context:
                stream.write(
                    _nq_row(triple, context.identifier).encode(encoding, "replace")
                )
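
One way the loop above could accommodate a store without contexts is sketched below. This is an illustration under stated assumptions, not the rdflib implementation: `ToyStore` is a stand-in class invented for this example, its `context_aware` flag and `contexts()` / `triples()` methods only loosely mirror the real store interface, and terms are plain pre-rendered strings.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


class ToyStore:
    """Stand-in for a store (hypothetical; NOT rdflib's Store API)."""

    def __init__(self, context_aware: bool) -> None:
        self.context_aware = context_aware
        self._quads: List[Tuple[str, str, str, Optional[str]]] = []

    def add(self, triple: Triple, context: Optional[str] = None) -> None:
        if context is not None and not self.context_aware:
            raise ValueError("this toy store does not track contexts")
        self._quads.append((*triple, context))

    def contexts(self) -> List[Optional[str]]:
        # Distinct contexts in insertion order.
        return list(dict.fromkeys(c for (_, _, _, c) in self._quads))

    def triples(self, context: Optional[str] = None) -> List[Triple]:
        return [(s, p, o) for (s, p, o, c) in self._quads if c == context]


def serialize_nquads(store: ToyStore) -> str:
    """Emit N-Quads lines, falling back to three-term lines instead of raising."""
    rows = []
    if store.context_aware:
        for context in store.contexts():
            for s, p, o in store.triples(context):
                label = "" if context is None else f" {context}"
                rows.append(f"{s} {p} {o}{label} .\n")
    else:
        # No contexts to loop over: write each triple without a graph label.
        for s, p, o in store.triples():
            rows.append(f"{s} {p} {o} .\n")
    return "".join(rows)


t = ("<urn:example:tarek>", "<urn:example:likes>", "<urn:example:pizza>")

plain = ToyStore(context_aware=False)
plain.add(t)
plain_output = serialize_nquads(plain)

aware = ToyStore(context_aware=True)
aware.add(t, "<urn:example:g>")
aware_output = serialize_nquads(aware)
```

The design choice here is simply to branch on the flag rather than raise, so the non-context-aware path produces ntriples-shaped output.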

Your original issue reflects an infelicity: ConjunctiveGraph is not quite equivalent to Dataset, in that the ConjunctiveGraph default context is not nameless. As a result, its statements are serialized as quads rather than triples:

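>>> from rdflib import ConjunctiveGraph, URIRef
>>> tarek, michel, likes, pizza = (URIRef(f"urn:example:{name}") for name in ("tarek", "michel", "likes", "pizza"))
>>> cg = ConjunctiveGraph()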
>>> cg.bind("", URIRef("urn:example:"))
>>> cg.add((tarek, likes, pizza))
>>> cg.add((michel, likes, pizza))
>>> print(cg.serialize(format='nquads'))
<urn:example:michel> <urn:example:likes> <urn:example:pizza> _:N31a3b93b41dc46e6af9dbd6f54eb48ed .
<urn:example:tarek> <urn:example:likes> <urn:example:pizza> _:N31a3b93b41dc46e6af9dbd6f54eb48ed .

When the Dataset re-work is merged and ConjunctiveGraph is retired, the situation will be:

>>> from rdflib import Dataset, URIRef
>>> ds = Dataset()
>>> ds.bind("", URIRef("urn:example:"))
>>> ds.add((tarek, likes, pizza))
>>> ds.add((michel, likes, pizza))
>>> print(ds.serialize(format='nquads'))
<urn:example:michel> <urn:example:likes> <urn:example:pizza>  .
<urn:example:tarek> <urn:example:likes> <urn:example:pizza>  .