Mayil-AI-Sandbox / kuzudb_jan15

MIT License

Removing pid of R-R and R-L relationship tables in RDFGraphs as a visible property to users (hashtag2781) #29

Open vikramsubramanian opened 8 months ago

vikramsubramanian commented 8 months ago

The pid property (of type INTERNAL_ID) on the R-R and R-L relationship tables is a system-level optimization. When users try to query or modify it, they either hit confusing binder errors or get confusing output. For example:

MATCH (a)-[p:UniKG_rt]->(o) 
WHERE a.iri = " 
RETURN p.pid;

Output:
--------------------------
| p.pid                  |
--------------------------
| 18446744073709551615:1 |
--------------------------
| 18446744073709551615:9 |
--------------------------

Above, 18446744073709551615 looks like a NULL placeholder value. Or:

MATCH (a)-[p:UniKG_rt]->(o) 
WHERE a.iri = " AND p.iri = "  
SET p.pid=7 RETURN a.iri, p.iri, o.iri;
Error: Binder exception: Expression 7 has data type INT64 but expected INTERNAL_ID. Implicit cast is not supported.

We should explicitly say that pids cannot be queried or modified.
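To make the "NULL placeholder" observation concrete: 18446744073709551615 is exactly 2^64 - 1, the largest unsigned 64-bit integer, which is a common choice for an "invalid" sentinel. The sketch below only checks the printed strings from the first example. The helper names are hypothetical and not part of Kùzu, and it assumes nothing about which of the two colon-separated fields is the table ID and which is the offset.

```python
# Hypothetical helpers for illustration only; none of this is Kùzu API.

# Largest value representable in an unsigned 64-bit integer.
UINT64_MAX = 2**64 - 1


def parse_internal_id(text):
    """Split a printed internal ID such as '18446744073709551615:1'
    into its two integer fields, in printed order."""
    first, second = text.split(":")
    return int(first), int(second)


def has_sentinel_field(text):
    """Return True if either field equals UINT64_MAX."""
    return UINT64_MAX in parse_internal_id(text)


# The two values shown in the p.pid output above.
for printed in ("18446744073709551615:1", "18446744073709551615:9"):
    first, second = parse_internal_id(printed)
    print(printed, "first field is UINT64_MAX:", first == UINT64_MAX)
```

This matches the reading that the first field is a placeholder rather than a real identifier, though the issue itself does not confirm how Kùzu defines it.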

Instead, we let users access an iri property on these relationships. In reality, these relationship tables do not have an iri property, but we deliberately do not raise a binding error because we want users to be able to query and obtain string IRIs. So we can do this:

MATCH (a)-[p:UniKG_rt]->(o) 
RETURN p.iri

Output:
---------------------------------------------------
| p.iri                                           |
---------------------------------------------------
|                    |
---------------------------------------------------
|                     |
---------------------------------------------------
...

Or:

kuzu> MATCH (a)-[p:UniKG_lt]->(o)  
WHERE p.iri = " 
RETURN a.iri, p.iri, o.val;
------------------------------------------------------------------
| a.iri                      | p.iri                     | o.val |
------------------------------------------------------------------
|  |  | 30    |
------------------------------------------------------------------

But when a user looks at these tables, they see this:

kuzu> call table_info('UniKG_lt') return *;
------------------------------------
| property id | name | type        |
------------------------------------
| 1           | pid  | INTERNAL_ID |
------------------------------------

This behavior is confusing. I think we should just hide pid completely from the user. That is, the following should be our behavior:

mayil-ai[bot] commented 8 months ago

Summary: The pid property in the R-R and R-L relationship tables in RDFGraphs should be hidden from users and not be queryable or modifiable.
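As a rough illustration of what "hidden and not queryable" could mean, here is a minimal Python sketch. It is not Kùzu's implementation: the classes, the internal flag, the virtual-property tuple, and the error wording are all assumptions made for this example. The idea is that a table listing skips internal properties, binding pid fails with a clear message, and the virtual iri property still binds.

```python
from dataclasses import dataclass


class BinderError(Exception):
    """Hypothetical stand-in for a binder exception."""


@dataclass(frozen=True)
class Property:
    name: str
    type: str
    internal: bool = False  # assumed flag: True means hidden from users


@dataclass
class RelTable:
    name: str
    properties: list
    # Assumed: names users may reference even though no column is stored.
    virtual: tuple = ("iri",)

    def table_info(self):
        """List only user-visible properties as (name, type) pairs."""
        return [(p.name, p.type) for p in self.properties if not p.internal]

    def bind_property(self, prop_name):
        """Resolve a property reference, rejecting internal properties."""
        if prop_name in self.virtual:
            return prop_name
        for p in self.properties:
            if p.name == prop_name and not p.internal:
                return prop_name
        # Error wording is made up for this sketch.
        raise BinderError(f"Cannot find property {prop_name} for {self.name}.")


lt = RelTable("UniKG_lt", [Property("pid", "INTERNAL_ID", internal=True)])
print(lt.table_info())          # pid is not listed
print(lt.bind_property("iri"))  # virtual property still binds
try:
    lt.bind_property("pid")
except BinderError as err:
    print("Binder exception:", err)
```

With this shape, both RETURN p.pid and SET p.pid=7 would fail at the same place with the same message, instead of one returning placeholder-looking values and the other raising a type-cast error.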

Possible Solution

Based on the provided information and code snippets, the issue seems to be related to the handling of pid and iri properties in the context of the Kùzu graph database. The error message indicates that there is a type mismatch where an INT64 type is being used where an INTERNAL_ID type is expected, and implicit casting is not supported.

To resolve the issue:

By making these changes, you should be able to resolve the type mismatch error and ensure that the pid property is handled correctly as an internal identifier, while the iri property is exposed to users as a virtual, user-friendly identifier.

Code snippets to check