ubc-systopia / Indaleko

Indaleko Project
GNU Affero General Public License v3.0

Windows Ingester should write edge file properly #11

Closed fsgeek closed 7 months ago

fsgeek commented 7 months ago

The existing ingester for Windows local file systems writes both the vertices and the edges to the same file. This needs to be fixed so that the vertices go to the Objects collection and the edges go to the Relationships collection.
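A minimal sketch of the intended split, not the actual ingester code. It assumes the records are dictionaries and that an edge is any record carrying both `_from` and `_to` (the attributes ArangoDB uses for edge documents). The function name `split_records` and the file paths are hypothetical.

```python
import json


def split_records(records, objects_path, relationships_path):
    """Write vertices and edges to two separate JSON Lines files.

    Assumption: a record with both '_from' and '_to' is an edge;
    everything else is treated as a vertex.
    Returns (vertex_count, edge_count).
    """
    n_vertices = 0
    n_edges = 0
    with open(objects_path, "w", encoding="utf-8") as vertex_file, \
            open(relationships_path, "w", encoding="utf-8") as edge_file:
        for record in records:
            line = json.dumps(record) + "\n"
            if "_from" in record and "_to" in record:
                edge_file.write(line)
                n_edges += 1
            else:
                vertex_file.write(line)
                n_vertices += 1
    return n_vertices, n_edges
```

Each output file can then be imported into its own collection, so the importer never has to tell the two record types apart.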

The resolution here is to split the output into separate files, each labeled with its target collection, and then make sure everything works properly. While doing this I found that UUIDs contain hyphens, which could interfere with parsing. I switched them to the raw hex format instead, which also makes them smaller.
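For illustration, Python's standard `uuid` module exposes the hyphen-free form directly through the `hex` attribute. The helper name `to_raw_hex` is hypothetical; this is a sketch of the conversion, not the code from the commit.

```python
import uuid


def to_raw_hex(value):
    """Return the 32-character hex form of a UUID, without hyphens.

    Accepts a uuid.UUID or a string in either hyphenated or raw hex form.
    """
    return uuid.UUID(str(value)).hex


new_id = uuid.uuid4()
# The canonical string form is 36 characters; the raw hex form is 32.
print(len(str(new_id)), len(to_raw_hex(new_id)))
```

The raw form round-trips: `uuid.UUID(to_raw_hex(new_id)) == new_id`, so no information is lost by dropping the hyphens.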

There is an issue with arangoimport choking on the name of the edge collection file: it reports "file not found", but the exact same name works with other programs.

fsgeek commented 7 months ago

See commit 7a6c349da7b11da135c5abff9e9b06b47af25e1e.

This splits the output into two files and labels them for their respective collections. The "file not found" issue is still there, but I looked through the arangoimport source (https://github.com/arangodb/arangodb.git). The problem seems to be in a library function (::StatResultTypeFile) that just can't open the file. I found that it worked if I made the name a bit shorter, so I suspect a fixed-size buffer somewhere is causing the problem.

In the meantime, just rename the file or link to it under a shorter name and it works fine.
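The workaround can be scripted. The sketch below creates a hard link with a shorter name next to the original file and falls back to a copy if linking is not possible. The helper name `make_short_alias` and the alias name are hypothetical, and the name length at which arangoimport starts failing is not known from this thread.

```python
import os
import shutil


def make_short_alias(src_path, alias_name):
    """Expose src_path under a shorter name in the same directory.

    Tries a hard link first; falls back to copying the file if the
    file system or platform refuses the link.
    """
    directory = os.path.dirname(os.path.abspath(src_path))
    alias_path = os.path.join(directory, alias_name)
    if os.path.exists(alias_path):
        os.remove(alias_path)  # replace a stale alias from an earlier run
    try:
        os.link(src_path, alias_path)
    except OSError:
        shutil.copy2(src_path, alias_path)
    return alias_path
```

The returned path can then be handed to arangoimport in place of the original file name.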