Mayil-AI-Sandbox / kuzudb_jan15


Better Python error handling message when (header=true) is not passed during `COPY FROM` (hashtag2737) #44

Open vikramsubramanian opened 7 months ago

vikramsubramanian commented 7 months ago

Consider a case where we're trying to insert LivesIn relations into an edge table: (:Person)-[:LivesIn]->(:City)

The data in lives_in.csv looks like this:

person_id,city_id
1,2
2,1

The following command is correct and loads the data, treating the first row as a header.

COPY LivesIn FROM 'lives_in.csv' (header=true);

However, if the user forgets to specify the (header=true) clause, the command fails as expected, but the error message is not helpful:

Traceback (most recent call last):
  File "/code/kuzu-debug/load_data.py", line 59, in <module>
    conn.execute(load_lives_in_edges)
  File "/code/kuzu-debug/.venv/lib/python3.11/site-packages/kuzu/connection.py", line 80, in execute
    _query_result = self._connection.execute(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Conversion exception: Cast failed. person_id is not in INT64 range.

The above error message makes it seem like the data is incorrect, when the actual problem is the way we wrote the edge insertion query.

A better error message would be:

RuntimeError: Conversion exception: Cast failed. Expected INT64, got "person_id"

Phrasing it this way makes clear what the parser expected and what it found, which makes debugging much easier.
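To make the proposed wording concrete, here is a minimal Python sketch of a cast helper that produces it. This is purely illustrative and is not how Kuzu implements the cast internally; `cast_to_int64` and the two range constants are hypothetical names. The point is only to separate "this is not a number at all" from "this number is out of range".

```python
# Hypothetical illustration of the proposed message format; not Kuzu's code.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def cast_to_int64(raw: str) -> int:
    """Parse a CSV field as INT64, reporting what was expected and what was found."""
    try:
        value = int(raw.strip())
    except ValueError:
        # The field is not numeric at all (for example, a header cell).
        raise RuntimeError(
            f'Conversion exception: Cast failed. Expected INT64, got "{raw}"'
        ) from None
    if not INT64_MIN <= value <= INT64_MAX:
        # The field is numeric but genuinely out of range.
        raise RuntimeError(
            f"Conversion exception: Cast failed. {raw} is not in INT64 range."
        )
    return value
```

With this split, a header cell such as person_id gets the "Expected INT64, got ..." message, while the "not in INT64 range" wording is kept for values that really are too large or too small.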


mayil-ai[bot] commented 7 months ago

Summary: Improving error message for missing (header=true) clause in COPY FROM command in Python.

Possible Solution

Based on the provided information, the issue occurs because the COPY command is executed without the header=true option, so the first row of the CSV file is interpreted as data instead of as headers, which causes the cast failure.
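Until the message is improved, a caller could guard against this on the client side. The sketch below is one possible workaround under a strong assumption: every column in the file is expected to hold integers, as in lives_in.csv. The helper names `first_row_looks_like_header` and `build_copy_statement` are hypothetical and not part of the Kuzu API; the resulting string would be passed to conn.execute as in the traceback above.

```python
import csv


def _is_int(text: str) -> bool:
    """Return True if the field parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def first_row_looks_like_header(csv_path: str) -> bool:
    """Heuristic: the first row has no integer fields, the second row has only integers."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        first = next(reader, None)
        second = next(reader, None)
    if first is None or second is None:
        return False  # Not enough rows to compare.
    return all(_is_int(v) for v in second) and not any(_is_int(v) for v in first)


def build_copy_statement(table: str, csv_path: str) -> str:
    """Build a COPY statement, adding (header=true) when a header row is detected."""
    options = " (header=true)" if first_row_looks_like_header(csv_path) else ""
    return f"COPY {table} FROM '{csv_path}'{options};"
```

For the file shown earlier, this would yield the same statement as the working command, with (header=true) appended. The heuristic breaks down for tables with string columns, so it is a convenience check rather than a substitute for a clearer error from the database.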
