Closed azaroth42 closed 5 years ago
There are several bugs that I've found:
First, the API call uses logger.exception() incorrectly: the call fails because it does not include an error message to send to the logger. It is used twice, and in one case it sits inside a catch-all exception handler, which is what masks all the other errors. I recommend expanding this call and removing the catch-all except Exception block entirely.
(https://github.com/archesproject/arches/blob/master/arches/app/views/api.py#L303)
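A minimal sketch of the pattern recommended above, not the actual Arches code. logger.exception() takes a required message argument and is meant to be called from inside an except block, where it appends the active traceback. The names parse_payload and handle_request are hypothetical, and ValueError stands in for whatever specific errors the real view should handle.

```python
import io
import logging

logger = logging.getLogger("jsonld_sketch")


def parse_payload(raw):
    """Hypothetical stand-in for the JSON-LD parsing step."""
    if not isinstance(raw, str):
        raise ValueError("payload must be text, got %s" % type(raw).__name__)
    return raw.strip()


def handle_request(raw):
    """Catch only the error we expect, and always pass a message."""
    try:
        return parse_payload(raw)
    except ValueError:
        # logger.exception() needs a message; the traceback is added for us.
        logger.exception("Could not parse JSON-LD payload")
        return None


# Capture log output in memory so the result can be inspected.
log_stream = io.StringIO()
logger.addHandler(logging.StreamHandler(log_stream))
logger.setLevel(logging.DEBUG)
logger.propagate = False

result_ok = handle_request("  {}  ")
result_bad = handle_request(b"bytes")
log_text = log_stream.getvalue()
```

Because only ValueError is caught here, any other error still propagates with its own traceback instead of being hidden by a catch-all handler.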
The next part of the fix is simple and does correct the encoding in the RDFLib Graph object (it involves a few datatype edits).
This is what the serialization of the RDF Graph looks like, once the datatype fixes are in:
<http://localhost:8000/resources/281d85dc-c377-11e9-8d2f-0242ac170004> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.cidoc-crm.org/cidoc-crm/E4_Period> .
<http://localhost:8000/resources/281d85dc-c377-11e9-8d2f-0242ac170004> <http://purl.org/dc/terms/relation> "Rottl\\u00E4nder, R." .
(NB: if sent to stdout by a logger command, the double backslash (\\) will turn into a single one (\), e.g.:)
>>> g.serialize(format='nt')
'<http://example.org/1> <http://www.w3.org/2000/01/rdf-schema#label> "Rottl\\u00E4nger" .\n\n'
>>> print(g.serialize(format='nt'))
<http://example.org/1> <http://www.w3.org/2000/01/rdf-schema#label> "Rottl\u00E4nger" .
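The difference between the two outputs above comes from Python's repr() versus str(), not from the data itself. A small sketch using a plain string in place of the serialized graph (the variable name nt_line is an assumption for illustration):

```python
# One literal backslash followed by "u00E4", the way N-Triples escapes non-ASCII.
nt_line = '"Rottl\\u00E4nger" .'

shown_by_repr = repr(nt_line)   # what the interactive prompt echoes
shown_by_print = str(nt_line)   # what print() or a logger writes

# The string holds a single backslash; repr() doubles it for display only.
backslashes_in_data = nt_line.count("\\")
backslashes_in_repr = shown_by_repr.count("\\")
```

So a doubled backslash that shows up only in a repr() is harmless, while one that survives print() or a logger call is actually present in the data.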
And this is what it looks like after importing this into pyld via the from_rdf command:
logger.debug(js)  <-- the pyld object
[{'http://purl.org/dc/terms/relation': [{'@value': 'Rottl\\u00E4nder, R.'}], '@id': 'http://localhost:8000/resources/281d85dc-c377-11e9-8d2f-0242ac170004', '@type': ['http://www.cidoc-crm.org/cidoc-crm/E4_Period']}]
Note the double encoding when it is logged to the command line. One line in pyld worries me (https://github.com/digitalbazaar/pyld/blob/master/lib/pyld/jsonld.py#L2997 ): its use of str could be problematic in this case.
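One way to check whether a value is double encoded is to look for a literal \uXXXX sequence in it; a correctly decoded value contains the character itself. A minimal sketch using only the standard library. The helper names has_literal_escape and decode_literal_escapes are hypothetical and are not part of pyld or RDFLib:

```python
import re

# Matches a literal backslash, "u", and exactly four hex digits.
ESCAPE_RE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def has_literal_escape(value):
    """True if the text still contains an undecoded \\uXXXX sequence."""
    return ESCAPE_RE.search(value) is not None


def decode_literal_escapes(value):
    """Replace each literal \\uXXXX sequence with the character it names."""
    return ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), value)


double_encoded = "Rottl\\u00E4nder, R."
decoded = decode_literal_escapes(double_encoded)
```

This sketch only handles four-digit \u escapes; it is meant for diagnosing the problem, not as a substitute for fixing the datatypes upstream.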
In PyLD, _is_string is just a wrapper for return isinstance(v, basestring), and basestring is mapped to just str in 3.x, which should catch any real string/unicode values, I think.
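For reference, the check described here behaves as follows on Python 3. This is an illustrative stand-in named is_string, not PyLD's actual source; it only shows that str covers all text, including non-ASCII, while bytes is excluded:

```python
def is_string(v):
    # On Python 3 there is no basestring; str is the only text type.
    return isinstance(v, str)


text_value = "Rottländer, R."
bytes_value = text_value.encode("utf-8")

checks = {
    "text": is_string(text_value),
    "bytes": is_string(bytes_value),
}
```

Under this assumption, a value that reaches the check as UTF-8 encoded bytes would not be treated as a string, which is one place a datatype mismatch could surface.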
Can you put the fixes for the datatypes into a branch for testing?
Thanks Ben!
Should be resolved by #5181 merge
Describe the bug
When either uploading or downloading content that has unicode characters, the JSON-LD code raises an unhandled exception. Can't tell what it is, due to #5116 :(
It is, however, trivial to reproduce:
To Reproduce
Creation sub-event for Author by 'Rottländer, R.'
Tagging as high, as this is a blocker for any real data.