dhruv-anand-aintech closed 6 months ago
I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.
The following PRs were mentioned in the issue:
# Pull Request #77
## Title: lancedb support
## Files changed:
README.md
src/vdf_io/export_vdf/lancedb_export.py
src/vdf_io/import_vdf/lancedb_import.py
src/vdf_io/notebooks/lance-qs.ipynb
Be sure to follow the PRs as a reference when making code changes. If the user instructs you to follow the referenced PR, limit the scope of your changes to the referenced PR.
requirements.txt
✓ https://github.com/AI-Northstar-Tech/vector-io/commit/27aed811cc83300be65b567a92323f86bdd37523 Edit
Modify requirements.txt with contents: Add the following line to include the turbopuffer[fast] package: turbopuffer[fast]
src/vdf_io/names.py
✓ https://github.com/AI-Northstar-Tech/vector-io/commit/27aed811cc83300be65b567a92323f86bdd37523 Edit
Modify src/vdf_io/names.py with contents: In the DBNames class, add a new constant for the Turbopuffer database name: TURBOPUFFER = "turbopuffer"
src/vdf_io/util.py
✓ https://github.com/AI-Northstar-Tech/vector-io/commit/27aed811cc83300be65b567a92323f86bdd37523 Edit
Modify src/vdf_io/util.py with contents: In the db_metric_to_standard_metric dictionary, add a new entry for Turbopuffer's distance metrics mapping. Use the Distance enum from qdrant_client.http.models for the standard metric values:
DBNames.TURBOPUFFER: {
    "cosine_distance": Distance.COSINE,
    "euclidean_distance": Distance.EUCLID,
    "dot_product": Distance.DOT,
}
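As a rough illustration of how such a mapping might be consumed, here is a minimal, self-contained sketch. The `Distance` enum below is a local stand-in for the one in `qdrant_client.http.models`, so the example runs without that dependency. The `standardize_metric` signature is an assumption and may differ from the helper in `vdf_io.util`. The metric names are copied from the plan above and have not been checked against Turbopuffer's documentation.

```python
from enum import Enum


class Distance(str, Enum):
    """Local stand-in for the Distance enum referenced in the plan."""
    COSINE = "Cosine"
    EUCLID = "Euclid"
    DOT = "Dot"


# Hypothetical excerpt of the mapping, keyed by database name.
db_metric_to_standard_metric = {
    "turbopuffer": {
        "cosine_distance": Distance.COSINE,
        "euclidean_distance": Distance.EUCLID,
        "dot_product": Distance.DOT,
    },
}


def standardize_metric(metric: str, db: str) -> Distance:
    """Translate a database-specific metric name into the shared enum."""
    try:
        return db_metric_to_standard_metric[db][metric]
    except KeyError as exc:
        # Fail loudly rather than silently exporting with a wrong metric.
        raise ValueError(f"Unsupported metric {metric!r} for {db!r}") from exc
```

Raising on an unknown metric, rather than falling back to a default, keeps a mismatched distance function from slipping into an exported dataset unnoticed.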
src/vdf_io/util.py
✓ https://github.com/AI-Northstar-Tech/vector-io/commit/27aed811cc83300be65b567a92323f86bdd37523 Edit
Modify src/vdf_io/util.py with contents: Implement a new function to create Python classes for the index being exported from Turbopuffer. It should take the index name and schema as input and generate a Python class with the appropriate fields and data types.
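The plan does not say what the schema looks like or how the class should be built. One possible approach, sketched here using only the standard library, is `dataclasses.make_dataclass`. The function name `make_index_class`, the `TYPE_MAP` table, and the `{field name: type name}` schema shape are all hypothetical and are not taken from the repository or from the Turbopuffer SDK.

```python
from dataclasses import field, fields, make_dataclass
from typing import List, Optional

# Hypothetical mapping from schema type names to Python types.
TYPE_MAP = {
    "string": str,
    "uint": int,
    "[]string": List[str],
    "vector": List[float],
}


def make_index_class(index_name: str, schema: dict):
    """Build a dataclass whose fields mirror the given schema."""
    # Turn a name such as "my-index" into a class name such as "MyIndex".
    parts = index_name.replace("-", "_").split("_")
    class_name = "".join(part.capitalize() for part in parts if part)
    specs = []
    for field_name, type_name in schema.items():
        py_type = TYPE_MAP[type_name]
        # Hyphens are not valid in Python identifiers, so replace them.
        safe_name = field_name.replace("-", "_")
        # Every field is optional because attribute values may be missing.
        specs.append((safe_name, Optional[py_type], field(default=None)))
    return make_dataclass(class_name, specs)
```

For example, `make_index_class("my-index", {"my-string": "string", "my-uint": "uint"})` yields a class named `MyIndex` with fields `my_string` and `my_uint`. The sketch does not handle names that start with a digit or collide with Python keywords.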
src/vdf_io/turbopuffer.py
✓ https://github.com/AI-Northstar-Tech/vector-io/commit/27aed811cc83300be65b567a92323f86bdd37523 Edit
Create src/vdf_io/turbopuffer.py with contents: Create a new module for Turbopuffer specific functionality.
Import the necessary modules:
import turbopuffer as tpuf
from vdf_io.names import DBNames
from vdf_io.util import standardize_metric, clean_documents
Implement the make_parser function to add Turbopuffer specific command line options for export and import.
Implement the export_vdb function:
Implement the import_vdb function:
In the export_vdb and import_vdb functions, use the input() function to interactively prompt the user for Turbopuffer specific options that were not provided via command line arguments.
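The combination of command-line options and interactive fallback described above could look roughly like the following sketch. The flag names (`--namespace`, `--api_key`), the `fill_missing` helper, and the `make_parser` signature are assumptions made for illustration. The actual module may structure this differently.

```python
import argparse


def make_parser(subparsers):
    """Register a hypothetical 'turbopuffer' sub-command with optional flags."""
    parser = subparsers.add_parser("turbopuffer", help="Turbopuffer options")
    parser.add_argument("--namespace", default=None, help="Namespace to read or write")
    parser.add_argument("--api_key", default=None, help="Turbopuffer API key")
    return parser


def fill_missing(args: dict, prompts: dict, ask=input) -> dict:
    """Prompt only for options that were not supplied on the command line."""
    for key, message in prompts.items():
        if args.get(key) is None:
            # `ask` defaults to input(), but can be swapped out in tests.
            args[key] = ask(message)
    return args
```

A caller would parse the arguments first, convert them with `vars(...)`, and then pass the resulting dict to `fill_missing` with one prompt message per option. Injecting `ask` instead of calling `input()` directly keeps the prompting logic testable without a terminal.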
I have finished reviewing the code for completeness. I did not find errors for sweep/add_support_for_turbopuffer.
This is an automated message generated by Sweep AI.
Documentation for the Turbopuffer SDK: https://turbopuffer.com/docs/
Add turbopuffer[fast] to requirements.txt.
Upsert code:
import turbopuffer as tpuf

ns = tpuf.Namespace('namespace-name')
# If an error occurs, this call raises a tpuf.APIError if a retry was not successful.
ns.upsert(
    ids=[1, 2, 3, 4],
    vectors=[[0.1, 0.1], [0.2, 0.2], [0.3, 0.3], [0.4, 0.4]],
    attributes={
        'my-string': ['one', None, 'three', 'four'],
        'my-uint': [12, None, 84, 39],
        'my-string-array': [['a', 'b'], ['b', 'd'], [], ['c']],
    },
    distance_metric='cosine_distance',
)
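The upsert call above takes column-oriented arguments: one list of ids, one list of vectors, and one list per attribute, with `None` marking a missing value. An importer that reads row-oriented records would need to pivot them first. The helper below is a minimal sketch of that pivot. The name `rows_to_columns` and the input record shape are hypothetical.

```python
def rows_to_columns(rows):
    """Pivot row-oriented records into ids, vectors, and per-attribute lists."""
    # Collect attribute names in first-seen order so the output is stable.
    attr_names = []
    for row in rows:
        for name in row.get("attributes", {}):
            if name not in attr_names:
                attr_names.append(name)
    return {
        "ids": [row["id"] for row in rows],
        "vectors": [row["vector"] for row in rows],
        "attributes": {
            # Rows that lack an attribute contribute None at their position.
            name: [row.get("attributes", {}).get(name) for row in rows]
            for name in attr_names
        },
    }
```

The returned keys match the keyword names used in the upsert snippet, so the result could plausibly be unpacked into that call alongside `distance_metric`. That usage is an assumption and is not verified against the SDK.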
import turbopuffer as tpuf
ns = tpuf.Namespace('namespace-name')
# Cursor paging is handled automatically by the Python client
# If an error occurs, this call raises a tpuf.APIError if a retry was not successful.
for row in ns.vectors():
    print(row)
VectorRow(id=1, vector=[0.1, 0.1], attributes={'key1': 'one', 'key2': 'a'})
VectorRow(id=2, vector=[0.2, 0.2], attributes={'key1': 'two', 'key2': 'b'})
VectorRow(id=3, vector=[0.3, 0.3], attributes={'key1': 'three', 'key2': 'c'})
VectorRow(id=4, vector=[0.4, 0.4], attributes={'key1': 'four', 'key2': 'd'})
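An exporter would likely write these rows out in batches rather than print them. The sketch below shows one way to group an iterator of rows into fixed-size batches of flat records. `Row` is a local stand-in with the same three fields as the `VectorRow` output above, and `batched_records` is a hypothetical helper. Neither comes from the Turbopuffer SDK or the vector-io repository.

```python
from itertools import islice
from typing import NamedTuple


class Row(NamedTuple):
    """Stand-in with the same three fields as the VectorRow output."""
    id: int
    vector: list
    attributes: dict


def batched_records(rows, batch_size=2):
    """Yield lists of flat dicts, at most batch_size rows per list."""
    iterator = iter(rows)
    while True:
        # islice pulls the next batch without materializing the whole stream.
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield [
            {"id": row.id, "vector": row.vector, **(row.attributes or {})}
            for row in chunk
        ]
```

Because it only consumes an iterator, the same function would work over a paged stream of rows. An attribute literally named `id` or `vector` would overwrite the corresponding column in this sketch, so a real exporter would need to guard against that.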