Open franciscojavierarceo opened 3 months ago
@asg017 happy to take this on if you're good with it
It now should work in all Python versions, if you update your KNN SQL queries to look like this:
select rowid, distance
from vec_items
where embedding match ?
and k = 10
Instead of:
select rowid, distance
from vec_items
where embedding match ?
limit 10
The limit 10 syntax only works in SQLite versions 3.41+, which older Python versions typically don't have. But the k = 10 syntax should work on all versions of SQLite.
I'm a bit hesitant to add a CI rule to test across multiple Python versions, since that can slow down the CI quite a lot. But if the k = 10 syntax doesn't work for you, happy to dig into it further!
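To see which form applies on a given interpreter, a minimal sketch along these lines may help. It assumes the 3.41 cutoff quoted above, and knn_sql is a hypothetical helper for illustration, not part of sqlite-vec:

```python
import sqlite3

# Assumption: 3.41.0 is the cutoff, taken from the "3.41+" figure quoted above.
LIMIT_SYNTAX_MIN_VERSION = (3, 41, 0)


def knn_sql(k, sqlite_version=sqlite3.sqlite_version_info):
    """Build a KNN query for vec_items, picking the constraint by SQLite version.

    knn_sql is a hypothetical helper, not part of sqlite-vec.
    """
    k = int(k)  # k is interpolated into the SQL text, so force it to an integer
    base = "select rowid, distance from vec_items where embedding match ?"
    if tuple(sqlite_version) >= LIMIT_SYNTAX_MIN_VERSION:
        return f"{base} limit {k}"
    return f"{base} and k = {k}"


# The SQLite library linked into this Python, which is what decides the syntax.
print("linked SQLite:", sqlite3.sqlite_version)
print(knn_sql(10))
# Simulate an older SQLite to see the fallback form.
print(knn_sql(10, sqlite_version=(3, 37, 2)))
```

In practice, always using the k = 10 form is simpler, since it is the one described as working on every SQLite version.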
Yeah I swapped that syntax but still encountered issues.
Here's the PR I have https://github.com/feast-dev/feast/pull/4333
I tried some changes and now it fails on Python 3.10 on macOS instead of 3.11, as well as on Ubuntu.
It feels a bit like whack-a-mole, which is why I thought adding the CI would help me. I can just make a fork, I suppose, and run the CI in mine.
The error that's being raised is
E sqlite3.OperationalError: no such module: vec0
So I tried the latest version (v0.1.0) and created a simple example with the workflow below, and there are some interesting issues. It looks like these are now build-time errors rather than the code failures I reported in the first run.
For what it's worth, here's the current list of OSs where things are failing (mostly installing)
I briefly looked at your test.yaml workflow and noticed you're building Python a bit differently, so I'll try to see if it has something to do with using actions/setup-python.
name: unit-tests
on:
  pull_request:
  push:
    branches:
      - main
jobs:
  unit-test-python:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        python-version: [ "3.9", "3.10", "3.11", "3.12"]
        os: [ ubuntu-latest, macos-13, macos-latest ]
        exclude:
          - os: macos-13
            python-version: "3.9"
    env:
      OS: ${{ matrix.os }}
      PYTHON: ${{ matrix.python-version }}
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        id: setup-python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          architecture: x64
      - name: Install uv
        run: |
          curl -LsSf https://astral.sh/uv/install.sh | sh
      - name: Get uv cache dir
        id: uv-cache
        run: |
          echo "::set-output name=dir::$(uv cache dir)"
      - name: Install dependencies
        run: pip install sqlite_vec==v0.1.0
      - name: run script
        run: python sqlite_vec_demo.py
And the sqlite_vec_demo.py file is just:
import sqlite3
import sqlite_vec
from typing import List
import struct


def serialize_f32(vector: List[float]) -> bytes:
    """serializes a list of floats into a compact "raw bytes" format"""
    return struct.pack("%sf" % len(vector), *vector)


def main() -> None:
    db = sqlite3.connect(":memory:")
    db.enable_load_extension(True)
    sqlite_vec.load(db)
    db.enable_load_extension(False)
    sqlite_version, vec_version = db.execute(
        "select sqlite_version(), vec_version()"
    ).fetchone()
    print(f"sqlite_version={sqlite_version}, vec_version={vec_version}")
    items = [
        (1, [0.1, 0.1, 0.1, 0.1]),
        (2, [0.2, 0.2, 0.2, 0.2]),
        (3, [0.3, 0.3, 0.3, 0.3]),
        (4, [0.4, 0.4, 0.4, 0.4]),
        (5, [0.5, 0.5, 0.5, 0.5]),
    ]
    query = [0.3, 0.3, 0.3, 0.3]
    db.execute("CREATE VIRTUAL TABLE vec_items USING vec0(embedding float[4])")
    with db:
        for item in items:
            db.execute(
                "INSERT INTO vec_items(rowid, embedding) VALUES (?, ?)",
                [item[0], serialize_f32(item[1])],
            )
    rows = db.execute(
        """
        SELECT
            rowid,
            distance
        FROM vec_items
        WHERE embedding MATCH ?
        and k = 3
        """,
        [serialize_f32(query)],
    ).fetchall()
    print(rows)


if __name__ == "__main__":
    main()
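For anyone unsure what serialize_f32 produces, here is a small standalone sketch of the byte layout. It needs only the standard library. deserialize_f32 is a hypothetical inverse added for inspection and is not something the demo script requires:

```python
import struct
from typing import List


def serialize_f32(vector: List[float]) -> bytes:
    """Same packing as in the demo: one native float32 per element."""
    return struct.pack("%sf" % len(vector), *vector)


def deserialize_f32(blob: bytes) -> List[float]:
    """Hypothetical inverse of serialize_f32, for inspection only."""
    count = len(blob) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack("%sf" % count, blob))


blob = serialize_f32([0.3, 0.3, 0.3, 0.3])
print(len(blob))  # 4 floats at 4 bytes each
print(deserialize_f32(blob))  # values come back as float32 approximations
```

The round-tripped values differ slightly from 0.3 because they are stored at float32 precision.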
Looking at these logs: https://github.com/franciscojavierarceo/Python/actions/runs/10241600790/job/28330119760
Nearly all of the failures have to do with installing Python on GitHub Actions runners, and not with sqlite-vec.
But these fail with AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension'
https://github.com/franciscojavierarceo/Python/actions/runs/10241600790/job/28330119760
For that: this is a macOS thing, where recent macOS versions block loading SQLite extensions on default Python builds. You'll need to use Homebrew to install a new Python version that bundles its own SQLite build that allows extension loading (or some other Python installer; actions/setup-python won't do this for you).
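A quick way to tell whether a given interpreter is affected, before sqlite-vec is involved at all, is to check for the method named in the AttributeError. This is a minimal sketch, and supports_extension_loading is a hypothetical helper name:

```python
import sqlite3
import sys


def supports_extension_loading() -> bool:
    """Return True if this Python's sqlite3 module exposes enable_load_extension.

    supports_extension_loading is a hypothetical helper for illustration.
    """
    return hasattr(sqlite3.Connection, "enable_load_extension")


python_version = sys.version.split()[0]
if supports_extension_loading():
    print(f"Python {python_version}: extension loading is available")
else:
    print(
        f"Python {python_version}: sqlite3 was built without extension loading; "
        "db.enable_load_extension(True) would raise AttributeError"
    )
print("linked SQLite:", sqlite3.sqlite_version)
```

Running this as the first CI step would separate "this Python build cannot load extensions" from genuine sqlite-vec failures.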
Yeah, that's what I was suggesting, too. Thanks for digging in as well.
I'll raise an issue with SQLite and tag it here. I've renamed this issue so it's more explicit in case someone else tries to do something similar.
I launched support for SQLite Vec in a recent version of Feast but, due to some CI issues, only released it to a subset of Python versions.
I was considering contributing to this project to add a CI to verify the Python package behavior. It would also help the Feast support.
The solution would add a GitHub workflow with something like this: