Open steffansluis opened 6 years ago
Good point, language strings have been implemented for quite some time, but we don't have any queries for them yet. Also, KV/memstore has no indexes for lang, while SQL/Mongo can easily be extended to support these queries by enabling an index on this field.
This feature is not implemented mostly because we wanted to postpone this work until reification is done, since lang will look like a custom metadata field on string nodes, and all custom metadata queries will work for language as well. But it makes sense to add @lang in the current implementation, since it's not that hard, and the implementation details will change under the hood of the query without affecting GraphQL.
Also, more details should be discussed before starting this. For example, what is the default behavior for a name @lang("en") query when the "en" language string does not exist? Should it assume that there is no such predicate, or fall back to a string without language tags? What if only a German string is available, but not a generic version?
SPARQL treats language-tagged strings and non-language-tagged strings as different values because they are not the same RDF literals. This makes sense, since the RDF spec also states that the datatype differs for language-tagged strings. As such, I would think that without the @lang directive the name has to be the regular type of string, and with the @lang directive the name would have to be the language-tagged type of string. Therefore, there is no generic version of a language-tagged string, since they are distinct values within their type: no fallback, just a null result if a particular language doesn't have a value defined. How the language is specified is standardized by the RDF spec as well.
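The no-fallback rule proposed above can be sketched in a few lines. This is only an illustration of the proposed semantics, not Cayley code: the LangString class and the select_name function are hypothetical names, and the language comparison is a plain exact match.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class LangString:
    """Hypothetical stand-in for a language-tagged literal such as "Bob"@en."""
    value: str
    lang: str


Literal = Union[str, LangString]


def select_name(values: List[Literal], lang: Optional[str] = None) -> Optional[str]:
    """Pick a value under the no-fallback rule.

    Without `lang`, only plain strings match. With `lang`, only strings
    tagged with exactly that language match. Nothing falls back to the
    other kind of literal, so a miss yields None.
    """
    for v in values:
        if lang is None and isinstance(v, str):
            return v
        if lang is not None and isinstance(v, LangString) and v.lang == lang:
            return v.value
    return None


names = [LangString("Robert", "de"), "Bob"]
print(select_name(names))        # plain string only
print(select_name(names, "de"))  # German-tagged string
print(select_name(names, "en"))  # no English value, so no result
```

Under this rule, a node that only has a German name yields no result both for the untagged query and for an "en" query.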
Hey, I need to get language literals in my Gizmo queries but I can't. I tried to look into the Gizmo code, but I was not able to figure out where the values are formatted and why LangString is not formatted correctly. @dennwc can you help me out?
Found it! My fix is here: https://github.com/cayleygraph/cayley/pull/819
@iddan If the problem is in Gizmo specifically, I propose to add some accessor on JS side instead. Or maybe wrap language string in a small object with those two fields (on JS side, again)?
Am I missing something here?
Currently, as a Gizmo user, I assume values to always be a regular string or an n3 valid value because this is the current behaviour with strings and URIs. I think it would have been better if Gizmo returned a JSON-LD valid string. People have tried to spec RDF-JSON but they decided JSON-LD is a better approach. Anyway, it should be consistent.
@iddan I agree, but for regular strings and IRIs the conversion is trivial - it's just a single string after all. For LangString it's different - we have two fields now.
And, at least from my point of view, "Bob"@en is not exactly easy to use from JS. Right now Cayley converts it to Bob (no quotes, unescaped) when "flattening" results.
But I agree, JSON encoding/decoding of RDF values is not that good or consistent in Cayley. We should use whatever spec is available for it. JSON-LD as you mentioned may be a good target spec.
The JSON-LD value would be:
{ "@language": "en", "@value": "Bob" }
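To compare the three representations mentioned in this thread side by side, here is a minimal sketch. The LangString class and its method names are hypothetical and do not mirror Cayley's quad.LangString. The escaping only covers backslashes and double quotes, so it is not a complete serializer.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LangString:
    """Hypothetical two-field language string: text plus language tag."""
    value: str
    lang: str

    def to_ntriples(self) -> str:
        # Quoted lexical form followed by the language tag.
        # Only backslash and double quote are escaped in this sketch.
        escaped = self.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"@{self.lang}'

    def to_flat(self) -> str:
        # "Flattened" form: the text only, so the language is lost.
        return self.value

    def to_jsonld(self) -> dict:
        # JSON-LD value object: both fields stay individually accessible.
        return {"@language": self.lang, "@value": self.value}


bob = LangString("Bob", "en")
print(bob.to_ntriples())            # "Bob"@en
print(bob.to_flat())                # Bob
print(json.dumps(bob.to_jsonld()))  # {"@language": "en", "@value": "Bob"}
```

Only the object form lets a caller read the language without parsing a string.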
I need a way to get the language for string results, so what do you suggest to do?
I think we can start by adding support for those values to gizmo.toQuadValue. This way it would be possible to create those values without using the lang() JS helper. The second step is to intercept all calls to vm.ToValue() like this one and convert quad.LangString to the JS object you mentioned.
Later we should consider changing our JSON I/O formats to accept/emit those values as well.
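The two steps could round-trip roughly as follows. This is a language-neutral sketch written in Python, not the Gizmo or Go code: to_quad_value and to_js_value are hypothetical names that only loosely correspond to gizmo.toQuadValue and the vm.ToValue() interception, and LangString is a stand-in type.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class LangString:
    """Hypothetical stand-in for a language-tagged value."""
    value: str
    lang: str


def to_quad_value(raw: Union[str, dict]) -> Union[str, LangString]:
    """Input side (step one): accept plain strings and value objects."""
    if isinstance(raw, str):
        return raw
    if isinstance(raw, dict) and "@value" in raw:
        if "@language" in raw:
            return LangString(str(raw["@value"]), str(raw["@language"]))
        return str(raw["@value"])
    raise TypeError(f"unsupported value: {raw!r}")


def to_js_value(v: Union[str, LangString]) -> Union[str, dict]:
    """Output side (step two): plain strings pass through unchanged,
    language strings become a two-field object."""
    if isinstance(v, LangString):
        return {"@language": v.lang, "@value": v.value}
    return v


obj = {"@language": "en", "@value": "Bob"}
# A value object survives the round trip, and so does a plain string.
print(to_js_value(to_quad_value(obj)) == obj)
print(to_js_value(to_quad_value("Bob")) == "Bob")
```

Keeping plain strings untouched on both sides means existing queries that only deal with untagged strings would not see a change in output shape.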
I opened a different issue for changing to JSON-LD for further discussion: https://github.com/cayleygraph/cayley/issues/820
@dennwc as far as I understand we currently load language strings correctly from files and when inserting quads through the UI.
@iddan Correct. The only issue is query-side support. Right now it's possible to query for an exact match, but not with a wildcard in the language, for example.
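The difference between an exact match and a wildcard on the language can be illustrated like this. The sketch assumes literals are compared in their serialized "text"@lang form, which is an assumption for illustration and not a description of how Cayley stores or filters values. The function names and the pattern are hypothetical.

```python
import re
from typing import List, Pattern

literals = ['"Bob"@en', '"Bob"@en-GB', '"Robert"@de', '"Bob"']


def exact(values: List[str], target: str) -> List[str]:
    # Exact match: text and language tag must both be identical.
    return [v for v in values if v == target]


def regex_filter(values: List[str], pattern: Pattern[str]) -> List[str]:
    # Wildcard match: the whole serialized literal must fit the pattern.
    return [v for v in values if pattern.fullmatch(v)]


# "Bob" with any language tag; the untagged "Bob" does not qualify.
ANY_LANG_BOB = re.compile(r'"Bob"@[A-Za-z]+(-[A-Za-z0-9]+)*')

print(exact(literals, '"Bob"@en'))           # ['"Bob"@en']
print(regex_filter(literals, ANY_LANG_BOB))  # ['"Bob"@en', '"Bob"@en-GB']
```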
Got it.
https://github.com/cayleygraph/cayley/pull/834 solved querying for lang strings with regex filter
I noticed that language-tagged strings are already supported internally in Cayley. It would be nice to be able to use this functionality through the HTTP APIs somehow (mostly Gizmo and GraphQL). I'm not entirely sure yet what would be the best way to include this. I would think it would work somewhat similarly to the labeled sub-graphs, so I could imagine a Lang directive for Gizmo and a @lang and @language directive for GraphQL (similar to what I proposed in #614). Maybe it makes more sense, though, to have it work like the @opt directive, where it doesn't propagate implicitly to the nested query. In any case, curious to hear what you think about it!