nickna / Neighborly

An open-source vector database
MIT License

Migrate DiskBackedList to custom 64-bit IList / IEnumerable interface #33

Open nickna opened 4 weeks ago

nickna commented 4 weeks ago

Neighborly can store at most 2,147,483,647 Vector records because its index is a System.Int32. To move to a System.Int64 (with a max value of 9,223,372,036,854,775,807), we'll need to create our own IList64 and IEnumerable64 interfaces.

This will break compatibility with things like LINQ and the use of a traditional enumerator (e.g. foreach). We'll need to understand the tradeoffs before implementing.

hangy commented 3 weeks ago

I think an IList64 could still implement IEnumerable<T>, so that foreach and most LINQ methods will work fine. IEnumerable<T> doesn't dictate the size of the sequence being enumerated. Devs "just" have to be careful not to call ToList or similar methods that try to allocate everything at once.
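
A minimal sketch of that idea, for illustration only. The names IList64<T> and SegmentedList64<T> are hypothetical, and the toy implementation keeps its segments in memory rather than on disk, so it is not how DiskBackedList works. It only shows that a 64-bit Count and indexer can coexist with a plain IEnumerable<T>:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical interface: 64-bit count and indexer, ordinary enumeration.
public interface IList64<T> : IEnumerable<T>
{
    long Count { get; }
    T this[long index] { get; set; }
    void Add(T item);
}

// Toy in-memory implementation that stores items in fixed-size segments.
public sealed class SegmentedList64<T> : IList64<T>
{
    private readonly int _segmentSize;
    private readonly List<T[]> _segments = new List<T[]>();
    private long _count;

    public SegmentedList64(int segmentSize = 1024)
    {
        if (segmentSize <= 0) throw new ArgumentOutOfRangeException(nameof(segmentSize));
        _segmentSize = segmentSize;
    }

    public long Count => _count;

    public T this[long index]
    {
        get
        {
            CheckIndex(index);
            return _segments[(int)(index / _segmentSize)][(int)(index % _segmentSize)];
        }
        set
        {
            CheckIndex(index);
            _segments[(int)(index / _segmentSize)][(int)(index % _segmentSize)] = value;
        }
    }

    public void Add(T item)
    {
        int offset = (int)(_count % _segmentSize);
        // Allocate a new segment whenever the previous one is full (or none exists yet).
        if (offset == 0) _segments.Add(new T[_segmentSize]);
        _segments[_segments.Count - 1][offset] = item;
        _count++;
    }

    // Streams items one at a time; nothing here depends on an Int32 count.
    public IEnumerator<T> GetEnumerator()
    {
        for (long i = 0; i < _count; i++)
            yield return this[i];
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    private void CheckIndex(long index)
    {
        if (index < 0 || index >= _count) throw new ArgumentOutOfRangeException(nameof(index));
    }
}
```

With a type like this, foreach and streaming LINQ operators such as Where, Select and Take work unchanged. On a list with more than Int32.MaxValue items, LongCount() is the safe choice, because Count() throws an OverflowException once the total exceeds Int32.MaxValue.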

OT: It might be interesting to implement IAsyncEnumerable<T>, so that an async version of ReadFromDisk could be awaited?
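
A rough sketch of what an awaitable read could look like. Everything here is an assumption made for illustration: the VectorFileReader and ReadVectorsAsync names, and the file format (a flat run of fixed-length records of 32-bit floats in native byte order), are hypothetical and are not Neighborly's actual ReadFromDisk or on-disk layout.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class VectorFileReader
{
    // Hypothetical format: consecutive records of `dimensions` 32-bit floats,
    // stored in the machine's native byte order.
    public static async IAsyncEnumerable<float[]> ReadVectorsAsync(
        Stream stream,
        int dimensions,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        if (dimensions <= 0) throw new ArgumentOutOfRangeException(nameof(dimensions));

        var buffer = new byte[dimensions * sizeof(float)];
        while (true)
        {
            // ReadAsync may return fewer bytes than requested, so loop until
            // one whole record is buffered or the stream ends.
            int filled = 0;
            while (filled < buffer.Length)
            {
                int read = await stream
                    .ReadAsync(buffer, filled, buffer.Length - filled, cancellationToken)
                    .ConfigureAwait(false);
                if (read == 0) break; // end of stream
                filled += read;
            }

            if (filled == 0) yield break; // clean end of data
            if (filled < buffer.Length)
                throw new EndOfStreamException("Truncated vector record.");

            // Copy the raw bytes into a fresh float[] so each yielded vector is independent.
            var vector = new float[dimensions];
            Buffer.BlockCopy(buffer, 0, vector, 0, buffer.Length);
            yield return vector;
        }
    }
}
```

A caller would consume it with `await foreach (float[] v in VectorFileReader.ReadVectorsAsync(stream, dimensions))`, which needs C# 8 or later. Only one record is held in the read buffer at a time, so the total number of records is not limited by an Int32 count.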

edit: LargeList uses segments internally, and BigList just uses BigArray as the backing storage.

nickna commented 3 weeks ago

You bring up a good point about IAsyncEnumerable.

While I was focused on expanding the theoretical number of records, improving async support is crucial.
Prioritizing async improvements before expanding capacity makes sense for better overall system performance.