nickna / Neighborly

An open-source vector database
MIT License

Improve Disk-Backed List Performance by Minimizing I/O Operations and Utilizing Memory-Mapped Files #13

Closed nickna closed 3 months ago

nickna commented 4 months ago

Description: The current implementation of the disk-backed list in Neighborly may be experiencing performance bottlenecks due to frequent and inefficient I/O operations. This task aims to optimize the performance by minimizing these operations and utilizing memory-mapped files.

Objective:

Proposed Changes:

  1. Analyze Current I/O Operations:
  2. Implement Memory-Mapped Files:

Refactor the current disk-backed list to use memory-mapped files for data storage and access. Ensure that the new implementation can handle large datasets efficiently without exceeding memory limits.
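To make the memory-mapped approach concrete, here is a minimal, language-neutral sketch in Python. It is not Neighborly's implementation: the class name `MmapVectorStore`, the fixed-width float32 record layout, and the pre-sized file are all assumptions made for illustration. In the project's own .NET code the analogous building block would presumably be `MemoryMappedFile` from `System.IO.MemoryMappedFiles`. The idea is that the file is mapped once and each record is addressed by offset, so a read or write becomes a memory access rather than a separate seek and read call, while the operating system pages data in and out on demand.

```python
import mmap
import os
import struct
import tempfile


class MmapVectorStore:
    """Hypothetical fixed-width store: record i lives at byte offset i * record_size."""

    def __init__(self, path, dim, capacity):
        self.capacity = capacity
        self._fmt = f"<{dim}f"  # little-endian float32 values (assumed layout)
        self.record_size = struct.calcsize(self._fmt)
        size = self.record_size * capacity
        # Reuse an existing file if present, otherwise create one.
        mode = "r+b" if os.path.exists(path) else "w+b"
        self._file = open(path, mode)
        # Pre-size the file so the whole region can be mapped once.
        self._file.truncate(size)
        self._map = mmap.mmap(self._file.fileno(), size)

    def _offset(self, index):
        if not 0 <= index < self.capacity:
            raise IndexError(index)
        return index * self.record_size

    def write(self, index, vector):
        # Writes go straight into the mapped region; no explicit seek/write calls.
        struct.pack_into(self._fmt, self._map, self._offset(index), *vector)

    def read(self, index):
        return struct.unpack_from(self._fmt, self._map, self._offset(index))

    def close(self):
        self._map.flush()  # push dirty pages to disk before unmapping
        self._map.close()
        self._file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store = MmapVectorStore(os.path.join(d, "vectors.bin"), dim=3, capacity=4)
        store.write(2, [1.0, 2.0, 3.0])
        print(store.read(2))
        store.close()
```

Because only the pages that are actually touched become resident, the mapped file can be larger than the memory the process would want to allocate up front. A real implementation would still need to handle variable-length records, growth beyond the initial capacity, and concurrent access, none of which this sketch attempts.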

Optimize Data Access Patterns:

Develop and run benchmarks to compare the performance of the current implementation with the optimized version. Ensure thorough testing to verify data integrity and consistency with the new implementation.

Resources: Optimizing I/O Performance

Impact: This enhancement is expected to:

How to Contribute:

  1. Fork the repository and create a new branch for your changes.
  2. Implement the proposed changes in the new branch.
  3. Ensure all new and existing tests pass.
  4. Submit a pull request with a detailed description of the changes and performance improvements observed.

If you have any questions or need further clarification, feel free to ask.
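To back up the "performance improvements observed" in a pull request, a small comparison harness along these lines may help. It is a sketch in Python under stated assumptions, not part of Neighborly: the record layout (8 float32 values), the function names, and the random-read workload are all hypothetical. It times per-record seek-and-read access against memory-mapped access over the same file and checks that both return identical data, which covers the data-integrity requirement at the same time.

```python
import mmap
import os
import random
import struct
import tempfile
import time

RECORD = struct.Struct("<8f")  # hypothetical layout: 8 float32 values per record


def build_file(path, count):
    """Write `count` records; record i holds the value float(i) eight times."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(*[float(i)] * 8))


def read_with_seek(path, indices):
    """Baseline: one seek plus one read call per record, with no user-space buffering."""
    out = []
    with open(path, "rb", buffering=0) as f:
        for i in indices:
            f.seek(i * RECORD.size)
            out.append(RECORD.unpack(f.read(RECORD.size)))
    return out


def read_with_mmap(path, indices):
    """Candidate: map the file once, then decode records directly from the mapping."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return [RECORD.unpack_from(m, i * RECORD.size) for i in indices]


def compare(count=10_000, reads=50_000, seed=0):
    """Time both strategies on the same random indices and check the results match."""
    rng = random.Random(seed)
    indices = [rng.randrange(count) for _ in range(reads)]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "bench.bin")
        build_file(path, count)
        t0 = time.perf_counter()
        a = read_with_seek(path, indices)
        t1 = time.perf_counter()
        b = read_with_mmap(path, indices)
        t2 = time.perf_counter()
    return {"seek_s": t1 - t0, "mmap_s": t2 - t1, "identical": a == b}


if __name__ == "__main__":
    print(compare())
```

Timings from a harness like this depend heavily on file size, the operating system's page cache, and the storage device, so results are most useful when reported together with the dataset size and hardware, and when each strategy is run several times.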

References:

Relevant code files: VectorList.cs

Memory-mapped file documentation: Microsoft Docs