Mayil-AI-Sandbox / kuzudb_jan15

MIT License

Enable `.nt` file storage (hashtag2814) #16

Open vikramsubramanian opened 4 months ago

vikramsubramanian commented 4 months ago

N-Triples files are valid Turtle files. N-Triples is the simplest of the RDF formats: each triple is written on one line, with no prefix or base directives (so every IRI is written out in full) and no grouping of triples by subject as in Turtle. Currently, if I try to load an N-Triples file, I get an error:

kuzu> copy wklx from "/home/lc/Developer/semih/rdf-data/wiki-data/latest-lexemes-nt.nt";
Error: Copy exception: Unsupported file type .nt

However, if I rename the file to latest-lexemes-nt.ttl, the loading works. I also tested that the Turtle and N-Triples versions of the latest Wikidata lexemes dataset give similar counts (not exact, but very close).
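Until `.nt` is accepted directly, the rename workaround can be scripted. The following is a minimal sketch, not part of Kùzu: the helper name `link_as_turtle` is hypothetical, and it assumes the loader decides on the file name alone and follows symbolic links, which this issue does not confirm. A symlink is used here only to avoid duplicating a large dump; a plain rename or copy is the behaviour actually reported to work.

```python
import os
from pathlib import Path


def link_as_turtle(nt_path: str) -> Path:
    """Create a .ttl symlink next to an .nt file and return the link's path.

    Hypothetical helper: it only changes the name the loader sees,
    the file contents are untouched.
    """
    src = Path(nt_path).resolve()
    if src.suffix.lower() != ".nt":
        raise ValueError(f"expected an .nt file, got {src.name}")
    link = src.with_suffix(".ttl")
    # Reuse an existing link or file instead of failing on a second run.
    if not (link.is_symlink() or link.exists()):
        os.symlink(src, link)
    return link
```

The returned path can then be passed to the `copy` command in place of the original `.nt` path.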

mayil-ai[bot] commented 4 months ago

Summary: Unable to copy .nt file for storage, but works when renamed to .ttl.

Possible Solution

Based on the provided information, the Kùzu copy command rejects files with the .nt (N-Triples) extension as an unsupported file type, even though the same file loads once it is renamed to .ttl (Turtle). This suggests that the Turtle reader can already handle N-Triples content and that only the file extension check stands in the way.

To resolve the issue:

Note: The provided code snippets do not contain the exact location where the copy command is implemented or where the file extension validation occurs. You will need to search the codebase for the relevant sections to apply the above solution.
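As an illustration of the kind of change involved, here is a minimal sketch of an extension check that treats `.nt` as an alias for Turtle. It is written in Python for brevity and is not Kùzu's code: `FileType`, `EXTENSION_TO_FILE_TYPE`, and `file_type_from_path` are hypothetical names, and the set of other extensions is assumed for the example.

```python
from enum import Enum
from pathlib import Path


class FileType(Enum):
    """Hypothetical set of file types a copy command might accept."""
    CSV = "csv"
    PARQUET = "parquet"
    TURTLE = "turtle"


# Hypothetical extension table. The .nt entry is the proposed addition:
# N-Triples is a subset of Turtle, so it can be routed to the same reader.
EXTENSION_TO_FILE_TYPE = {
    ".csv": FileType.CSV,
    ".parquet": FileType.PARQUET,
    ".ttl": FileType.TURTLE,
    ".nt": FileType.TURTLE,
}


def file_type_from_path(path: str) -> FileType:
    """Map a file path to a FileType based on its extension alone."""
    ext = Path(path).suffix.lower()
    try:
        return EXTENSION_TO_FILE_TYPE[ext]
    except KeyError:
        raise ValueError(f"Unsupported file type {ext}") from None
```

With the `.nt` entry in place, `file_type_from_path("latest-lexemes-nt.nt")` resolves to `FileType.TURTLE` instead of raising. Mapping the extension to the existing Turtle type, rather than adding a separate N-Triples reader, matches the observation in the issue that the renamed file already loads.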

Code snippets to check