Open c0c0n3 opened 5 years ago
As of PR #214, we switched to untyped arrays in the API spec for values returned in query responses. While this works, it can be improved: we may want to consider using typed arrays again. The array type in the spec will probably have to be the sum type S
of all types catered for by the NGSI spec, i.e. array[S]
where S = bool | number | ...
.
Stale issue message
Hi @chicco785, I would like to contribute on this issue. Please assign it to me.
hi @Ajaysamdyan, as by @c0c0n3 analysis this is an issue specific to crate db, so basically you need to looking into crate translator: https://github.com/orchestracities/ngsi-timeseries-api/blob/master/src/translators/crate.py
the thing may be a bit tricky to handle without messing up also sql_translator, but not impacting it would be the best option I think.
Hi @c0c0n3 , thanks for sharing the details and guidance. As per my primary investigation on cratedb, i have tested and verify that cratedb support the insertion of array object of different type like: text, Boolean, number etc. I have executed the below mentioned some queries for verification:
CREATE TABLE my_table_arrays ( tags array(text), objects array(object as (age integer, name text))); INSERT INTO my_table_arrays (tags, objects) values (null, [{age = 27, name = 'ABCD'}]);
Here, data is successfully stored into the database. So kindly let me know, is my investigation going in right direction ?
Hi @Ajaysamdyan, that's great, you're definitely heading in the right direction. One thing to consider is that my comments are ancient and probably only applicable to Crate 3 which we don't support anymore. Do new versions of Crate support inserting and then querying an arbitrary JSON array? If so then, we don't need the typed approach anymore. In other words, would this work with a recent version of Crate
CREATE TABLE t (x array)
INSERT INTO t (x) VALUES ( [ 1, "yo!", true, { k = 2, h = 'foo' } ] )
and if does, then how can you query column x
? can you e.g. drill into the JSON object x[3]
?
The reason why I'm mentioning this is that it'd simplify the Python implementation....
Hi @c0c0n3 , Please find the major blue points supported by latest crateDB version;
So, currently inserting and casting of a json object array of different type is not supported by crateDB. Kindly find the reference link for the same. https://crate.io/docs/crate/reference/en/4.6/general/ddl/data-types.html#data-type-object-json https://github.com/crate/crate/issues/11714
I had tested and verified it. kindly suggest me a way to proceed.
Hi @Ajaysamdyan, thanks for sharing your findings. I think I might've found a workaround that could let us store JSON arrays in Crate---tested with version 4.5.1. Here's a sum up.
To recap, here's a pseudo JSON (sum) type definition
JSON := null | bool | string | number | [JSON] | {JSON}
Notice the recursive definition of array []
. In JSON, an array has type array of JSON
so array members are allowed to be any JSON value in the above sum type. For example, all of these are valid JSON arrays:
[1, 2] -- a homogeneous-type array: all numbers
["yo!", true] -- a mixed-type array: strings and booleans
[{"k": 1}, 2] -- a mixed-type array: objects and numbers
-- a more complex mixed-type array: string, array with a string and a bool, number
["wada", ["wada", false], 3]
The problem is that is seems Crate doesn't support mixed-type arrays. So while Crate will swallow the first array without a hiccup, it'll moan about the others---to be honest, I think for good reasons, but my opinion is irrelevant here.
What to do then? We know Crate supports JSON-like objects, so we might be able to encode a JSON array into an object. The mapping could be like this (pseudo-code)
NGSI attribute:
an_array_attr {
type: array
value: [...]
}
Crate column:
an_array_attr OBJECT { value: array }
Like I said earlier, I've given this a try with Crate 4.5.1 and it seems to work for the most part but there are some corner cases we need to look into and some design issues to consider. Here's my test.
CREATE TABLE t (an_array_attr OBJECT(ignored));
INSERT INTO t (an_array_attr) VALUES ( '{"value": [1, 2]}'::object );
INSERT INTO t (an_array_attr) VALUES ( '{"value": ["yo!", true]}'::object );
INSERT INTO t (an_array_attr) VALUES ( '{"value": [{"k": 1}, 2]}'::object );
INSERT INTO t (an_array_attr) VALUES ( '{"value": ["wada", ["wada", false], 3]}'::object );
While all these inserts worked without a glitch, there still are things to consider which might make this solution unfeasible.
ignored
which means we lose indexing and hence it'll be inefficient to query the values inside the array. At the moment QL queries don't drill into arrays or objects so maybe this isn't too much of an issue, but I'm wondering if we could use dynamic
instead of ignore
and what the trade-offs would be.{value: [wada, 3]}
. That is, the second element of the original array ([wada, false]
) seems to have disappeared. Assuming the answers to (2) is yes, then is this a Crate bug?Hi @c0c0n3, Thanks for sharing the test cases, I have also performed this test and observed the same behavior of cratedb.
Please find the below mentioned test case of dynamic insertion:
CREATE TABLE t1 (an_array_attr OBJECT(DYNAMIC)); Pass
Homogeneous Type:
INSERT INTO t1 (an_array_attr) VALUES ( '{"value": [1, 2]}'::object ); Pass
Mixed_Array Type:
INSERT INTO t1 (an_array_attr) VALUES ( '{"value": ["yo!", true]}'::object ); Failed, Error: Validation failed for an_array_attr: Cannot cast {"value"=['yo!', true]} to type object]
INSERT INTO t1 (an_array_attr) VALUES ( '{"value": [{"k": 1}, 2]}'::object ); Failed, Error: Validation failed for an_array_attr: Cannot cast {"value"=[{"k"=1}, 2]} to type object]
INSERT INTO t1 (an_array_attr) VALUES ( '{"value": ["wada", ["wada", false], 3]}'::object ); Failed, Error; Validation failed for an_array_attr: Cannot cast {"value"=['wada', ['wada', false], 3]} to type object]
INSERT INTO t3 (an_array_attr) VALUES ( '{"value": ["yo!", true]}'::object ); Validation failed for an_array_attr: Mixed dataTypes inside a list are not supported. Found text and boolean]
Here, a new table t3 created using same dynamic insertion.
It seems that cratedb
does not support the above mixed array data types.
Please correct my understanding, if i am wrong.
Moreover, I have also asked the cratedb community
regarding the ignored case (comment 2and 3).
Kindly find the attached link for the same.
https://community.crate.io/t/is-cratedb-support-mixed-array-type-insertion-and-querying-of-the-column/809
Hi @Ajaysamdyan, that's brilliant, thanks for this and thanks for asking the Crate community about mixed-type support. Let's wait to see what the answer is and then we'll take it from there.
Hi @c0c0n3 , As per the discussion with crate community
today, it has been found that it is a cratedb
Admin UI
bug. And thay are working on to fix this issue.
Chat link ;
https://community.crate.io/t/is-cratedb-support-mixed-array-type-insertion-and-querying-of-the-column/809/
Hi @Ajaysamdyan
it is a cratedb Admin UI bug
Ha! I actually tested the queries through the admin UI too and just assumed the data wasn't there. Glad it's just a display issue and the data doesn't actually get zapped :-)
Unless @chicco785 has anything on the contrary, we could go ahead with the implementation I suppose?
Hi there,
the CrateDB 4.6.4 hotfix release will include Admin UI 1.19.2, which includes the fix https://github.com/crate/crate-admin/pull/745 by @proddata - thanks a stack!
The Docker image is already available through crate/crate:4.6.4
.
With kind regards, Andreas.
@c0c0n3 Is this still under active development? According to Crate-Docs: https://cratedb.com/docs/crate/reference/en/4.8/general/ddl/data-types.html#type-object it should easily work to use Arrays of Objects. Is this already supported in QL and only the docs are outdated?
@tstorek nope, it hasn't been implemented yet...
With #142 we introduced better support for saving structured values as objects in Crate. We should probably extend that line of work to include support for arrays of structured values, which should be saved as arrays of objects in Crate.
But as we are at it, why not consider typed arrays? Currently we support conversion from the following NGSI attribute types:
Boolean
,Number
,Integer
,Text
,DateTime
,ISO8601
,geo:json
,geo:point
,StructuredValue
. But when it comes to arrays, we convert array elements to strings so we only support storing arrays of strings in Crate. It'd be nice if we could support arrays of all the above types---i.e.array[bool]
,array[float]
,array[object]
, etc.See also: #36, #24.