terminusdb / terminusdb

TerminusDB is a distributed database with a collaboration model
https://terminusdb.com
Apache License 2.0

Random subdocuments generate an incorrect key #734

Open matko opened 3 years ago

matko commented 3 years ago

While working on the GitHub star example project, we created a schema in which a Repository contains a set of Stars, which are subdocuments. After inserting some data, we got this back (trimmed to a single Star subdocument, with the user anonymized):

    {
      "@id":"terminusdb:///data/Star/f608b01a007687c969da76a2801e500221e45694d634218f5d60e9975b008c53",
      "@type":"terminusdb:///schema#Star",
      "terminusdb:///schema#star_user":"terminusdb:///data/User/xxxxx",
      "terminusdb:///schema#starred_at":"2019-07-25T15:55:25Z"
    },

I would have expected the id to be nested under the id of the owning Repository rather than starting directly with Star/, that is: Repository/terminusdb_terminusdb/Star/f608b01a007687c969da76a2801e500221e45694d634218f5d60e9975b008c53
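As a minimal sketch of that expectation, the snippet below composes a subdocument id from its owner's id and checks whether a given id is nested that way. The function names are hypothetical and the path layout (owner id, then type, then random part) is only what this comment proposes, not necessarily what the server implements.

```python
def expected_subdocument_id(parent_id: str, child_type: str, random_part: str) -> str:
    """Compose a subdocument id nested under the owning document's id.

    Assumed layout (from the expectation in this issue):
    <parent id>/<child type>/<random part>
    """
    return f"{parent_id}/{child_type}/{random_part}"


def is_nested_under(parent_id: str, child_id: str) -> bool:
    """Return True if child_id has parent_id as a path prefix."""
    return child_id.startswith(parent_id + "/")


parent = "Repository/terminusdb_terminusdb"
print(expected_subdocument_id(parent, "Star", "f608b01a"))
# An id of the shape reported in this issue is not nested under its owner:
print(is_nested_under(parent, "Star/f608b01a"))
```

The random part is shortened here for readability; any opaque string works the same way.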

matko commented 3 years ago

Incidentally, it's unclear to me whether the random id was generated by the Python client or by the server code. Either way, the server code should have validated that the id is correct. (Cheuk says it's the server generating the id.)

Cheukting commented 3 years ago

That is generated by the backend. However, will there be a case that a subdocument is inserted into two different objects?

matko commented 3 years ago

> That is generated by the backend. However, will there be a case that a subdocument is inserted into two different objects?

No, that would be an error and should be caught by the schema checker. The point of a subdocument is that it is contained by exactly one document, which is its owner.
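To illustrate the ownership rule, here is a rough sketch of the kind of check meant here, written against plain JSON documents rather than the real schema checker. It assumes that every nested object carrying an "@id" is a subdocument and that its owner is the top-level document it appears in; `find_shared_subdocuments` is a hypothetical helper, not part of TerminusDB.

```python
from collections import defaultdict


def find_shared_subdocuments(documents):
    """Map each subdocument id to its owners, keeping only ids with more than one owner."""
    owners = defaultdict(set)

    def visit(owner_id, value):
        if isinstance(value, dict):
            sub_id = value.get("@id")
            if sub_id is not None:
                owners[sub_id].add(owner_id)
            for nested in value.values():
                visit(owner_id, nested)
        elif isinstance(value, list):
            for item in value:
                visit(owner_id, item)

    for doc in documents:
        for key, value in doc.items():
            # Skip keywords such as "@id" and "@type"; only properties hold subdocuments.
            if not key.startswith("@"):
                visit(doc["@id"], value)

    return {sub_id: ids for sub_id, ids in owners.items() if len(ids) > 1}


docs = [
    {"@id": "Parent/a", "@type": "Parent", "children": [{"@id": "Child/1", "@type": "Child"}]},
    {"@id": "Parent/b", "@type": "Parent", "children": [{"@id": "Child/1", "@type": "Child"}]},
]
print(find_shared_subdocuments(docs))
```

An empty result means every subdocument has a single owner. With ids nested under the owner's id, two owners could not end up sharing the same subdocument id in the first place.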

Cheukting commented 3 years ago

What if I move a subdocument from one object to another? Should that be a "copy and delete" operation, so that the subdocument gets a different id?

spl commented 3 years ago

To reproduce the issue:

    #!/bin/bash
    set -ex

    HOST='http://admin:root@localhost:6363'

    # Create database
    xh DELETE "$HOST/api/db/admin/tdb" label=l comment=c
    xh "$HOST/api/db/admin/tdb" label=l comment=c

    # Create schema
    xh "$HOST/api/document/admin/tdb" graph_type==schema author==a message==m <<EOF
    {"@id":"Child","@key":{"@type":"Random"},"@type":"Class","@subdocument":[]}
    {"@id":"Parent","@type":"Class","children":{"@type":"Set","@class":"Child"}}
    EOF

    # Insert instance
    xh "$HOST/api/document/admin/tdb" author==a message==m <<EOF
    {"@type":"Parent","children":[{"@type":"Child"}]}
    EOF

    # Get instance
    xh "$HOST/api/document/admin/tdb"

Output from the last command:

    {
        "@id": "Parent/3fae67a24fa7f2f9192ff3e1d476645412f7615e676d82804f1a12b0de86213f",
        "@type": "Parent",
        "children": [
            {
                "@id": "Child/5ce9250060da7c1409a5f21973579b4f08d70cbc547f91c5f891489135901fe0",
                "@type": "Child"
            }
        ]
    }

spl commented 3 years ago

We could use https://github.com/terminusdb/terminusdb/issues/747 for migrating existing documents.