RDFLib / rdflib-jsonld

JSON-LD parser and serializer plugins for RDFLib

Parsing of @context string fails #100

Open jormalaaksonen opened 3 years ago

jormalaaksonen commented 3 years ago

Hello!

I ran into a problem with code that parses a JSON-LD document whose @context is a plain URI string rather than a dict of terms and URIs. The code is below: the first parse() call, with a dict context, works, whereas the second, with a plain string context, fails. The attached patch avoids the error, but it may be only a workaround rather than a real fix.

Yours, Jorma

#! /usr/bin/env python3

import rdflib
import rdflib_jsonld
import sys
print(sys.version, rdflib.__version__, rdflib_jsonld.__version__)

rdflib.Graph().parse(data='{"@context": {"sch": "http://schema.org/"}}', format='json-ld')
rdflib.Graph().parse(data='{"@context": "http://schema.org/"}', format='json-ld')

Output:

3.8.6 (default, Sep 25 2020, 09:36:53) 
[GCC 10.2.0] 5.0.0 0.6.0-dev
Traceback (most recent call last):
  File "./rdflib_jsonld_context_test.py", line 9, in <module>
    rdflib.Graph().parse(data='{"@context": "http://schema.org/"}', format='json-ld')
  File "./venv/lib/python3.8/site-packages/rdflib/graph.py", line 1078, in parse
    parser.parse(source, self, **args)
  File "./venv/lib/python3.8/site-packages/rdflib_jsonld-0.6.0.dev0-py3.8.egg/rdflib_jsonld/parser.py", line 95, in parse
    to_rdf(data, conj_sink, base, context_data)
  File "./venv/lib/python3.8/site-packages/rdflib_jsonld-0.6.0.dev0-py3.8.egg/rdflib_jsonld/parser.py", line 107, in to_rdf
    return parser.parse(data, context, dataset)
  File "./venv/lib/python3.8/site-packages/rdflib_jsonld-0.6.0.dev0-py3.8.egg/rdflib_jsonld/parser.py", line 125, in parse
    context.load(l_ctx, context.base)
  File "./venv/lib/python3.8/site-packages/rdflib_jsonld-0.6.0.dev0-py3.8.egg/rdflib_jsonld/context.py", line 200, in load
    self._prep_sources(base, source, sources)
  File "./venv/lib/python3.8/site-packages/rdflib_jsonld-0.6.0.dev0-py3.8.egg/rdflib_jsonld/context.py", line 213, in _prep_sources
    source = source_to_json(source_url)
  File "./venv/lib/python3.8/site-packages/rdflib_jsonld-0.6.0.dev0-py3.8.egg/rdflib_jsonld/util.py", line 28, in source_to_json
    return json.load(StringIO(stream.read().decode('utf-8')))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Patch:

*** ./venv/lib/python3.8/site-packages/rdflib_jsonld/parser.py.orig 2020-12-27 09:23:07.000000000 +0200
--- ./venv/lib/python3.8/site-packages/rdflib_jsonld/parser.py  2020-12-27 11:02:27.972823101 +0200
***************
*** 122,127 ****
--- 122,129 ----
          elif isinstance(data, dict):
              l_ctx = data.get(CONTEXT)
              if l_ctx:
+                 if not isinstance(l_ctx, dict):
+                     l_ctx = { '': l_ctx }
                  context.load(l_ctx, context.base)
                  topcontext = True
              resources = data
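
The byte 0x8b at position 1 in the traceback above is consistent with the gzip magic number (0x1f 0x8b). This suggests, although the traceback alone does not prove it, that the server answered with a gzip-compressed body which source_to_json then tried to decode as UTF-8. Below is a minimal sketch of that failure mode and one possible way to handle it. The helpers decode_json_bytes and utf8_error_position are hypothetical names used for illustration only; they are not part of rdflib_jsonld.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_json_bytes(raw: bytes):
    """Parse JSON from raw bytes, un-gzipping first if the magic number matches.

    Hypothetical helper for illustration; not part of rdflib_jsonld.
    """
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


def utf8_error_position(raw: bytes):
    """Return the offset at which UTF-8 decoding fails, or None if it succeeds."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# Stand-in payload; the real body served for the context URL is not shown in the report.
plain = b'{"@vocab": "http://schema.org/"}'
compressed = gzip.compress(plain)

# Decoding gzip bytes directly as UTF-8 fails on the second magic byte (0x8b).
print(utf8_error_position(compressed))
# Checking the magic number first lets both forms parse.
print(decode_json_bytes(compressed) == decode_json_bytes(plain))
```

If this is indeed the cause, the error is triggered by fetching the remote context rather than by the string form of @context as such, which would fit the remark that the patch may be only a workaround.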
jormalaaksonen commented 3 years ago

A slightly more functional patch, which maps the string context to '@vocab' instead of the empty key:

*** ./venv/lib/python3.8/site-packages/rdflib_jsonld/parser.py.orig     2020-12-27 09:23:07.000000000 +0200
--- ./venv/lib/python3.8/site-packages/rdflib_jsonld/parser.py  2020-12-27 11:02:27.972823101 +0200
***************
*** 122,127 ****
--- 122,129 ----
          elif isinstance(data, dict):
              l_ctx = data.get(CONTEXT)
              if l_ctx:
+                 if not isinstance(l_ctx, dict):
+                     l_ctx = { '@vocab': l_ctx }
                  context.load(l_ctx, context.base)
                  topcontext = True
              resources = data
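
For reference, the effect of the added lines can be reproduced in isolation. The sketch below mirrors only the isinstance check from the patch. The function name wrap_string_context and its key parameter are hypothetical and do not exist in rdflib_jsonld.

```python
def wrap_string_context(l_ctx, key="@vocab"):
    """Mimic the patched branch: wrap any non-dict context in a one-entry dict.

    Hypothetical helper for illustration; not part of rdflib_jsonld.
    """
    if not isinstance(l_ctx, dict):
        return {key: l_ctx}
    return l_ctx


# Second patch: the string becomes the default vocabulary.
print(wrap_string_context("http://schema.org/"))
# First patch: the string is stored under the empty key.
print(wrap_string_context("http://schema.org/", key=""))
# A dict context passes through untouched.
print(wrap_string_context({"sch": "http://schema.org/"}))
```

Two caveats apply. In JSON-LD, a string-valued @context is a reference to a remote context document that is meant to be dereferenced, so treating it as '@vocab' avoids the fetch but changes the meaning of the document. In addition, because the check is "not a dict", a list-valued @context would be wrapped as well.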