Open asg017 opened 1 year ago
cc @trodrigu
For more context: The exqlite lib similar to the python lib can load extensions like
alias Exqlite.Basic
{:ok, conn} = Basic.open("db.sqlite3")
:ok = Basic.enable_load_extension(conn)
# load the regexp extension - https://github.com/nalgeon/sqlean/blob/main/docs/re.md
Basic.load_extension(conn, ExSqlean.path_for("re"))
To make this work with ecto we have to do something like
config :sqlite_init, SqliteInit.Repo,
database: Path.expand("../sqlite_init_dev.db", Path.dirname(__ENV__.file)),
pool_size: 5,
after_connect: fn _conn ->
[db_conn] = Process.get(:"$callers")
db_connection_state = :sys.get_state(db_conn)
conn = db_connection_state.mod_state.state
IO.inspect(conn, label: "conn")
:ok = Exqlite.Basic.enable_load_extension(conn)
Exqlite.Basic.load_extension(conn, ExLitedb.path_for("re"))
end,
stacktrace: true,
show_sensitive_data_on_connection_error: true
@asg017 Are we feeling that loading the extension into ecto (the defacto Elixir orm) is outside the scope of what the lib should do?
priv
is great for storing the compiled files. Exsqlean takes this approach.
The thing we can update from that library though is changing the "GET" request it does for the compiled files and bundle them in the package.
Thanks for the notes @trodrigu ! Some thoughts:
:code.priv
approach in Exsqlean
looks perfectExsqlean
bundles all platform extensions under priv/
, darwin-amd64
darwin-arm64
linux-amd64
windows-amd64
and windows-win32
. I don't think this will be ideal for sqlite-vss: Each compiled extension is like ~4MB, so I'd like to have separate Elixir packages for each platform, if possible.sqlite-vss
pypi package that download different platform-specifc wheels when someone pip installs, and I'm wondering if Elixir has a similar pattern. Ruby has something similar, the sqlite3
gem has different "version" for different platforms, like sqlite3-1.6.2-x86_64-linux.gem
, sqlite3-1.6.2-x86_64-darwin.gem
, etc.As far as next steps: If you put up a PR that just gets the skeleton of what the mix.exs
and the library Elixir code looks like, I can pick that up and figure out all the priv/
bundling. It'll be done in Github Actions (so nothing like the Makefile in ex_sqlean), so you can just assume that the platform-specific loadable extensions will be bundled in on publish.
For practicality, I may copy the Elixir code and test it with sqlite-hello
at first: Each release of sqlite-vss
takes like 20 minutes to build and release, but sqlite-hello
is much more slimed down and takes ~15 seconds. And every typo/fix needs to be it's own separate release, so it takes a few tries to get right.
Let me know what you think!
I love this plan! ❤️ I'm posting a couple of places to see if this is possible.
Hey @asg017 !
I wanted to send you a reference to the Elixir Slack link where I’m asking about packages and architectures. Feel free to join at elixir-lang.slack.com. The gist of it is that the recommended pattern is to go with downloadable binaries and use certfile validation and checksum validation. Elixir even has a special library called Rustler Precompiled which makes this process simpler for Rust crates. An example of the checksum file (which is present in the package) is at https://preview.hex.pm/preview/explorer/0.5.7/show/checksum-Elixir.Explorer.PolarsBackend.Native.exs. The bundled package route would require users to add some code to the deps()
function in the mix.exs
file which could be a little unwieldy. I wanted to get your opinion on this as we may be all good to go with downloadable binaries.
Thanks for the notes @trodrigu! The download at install-time + cerfile verification approach sounds good to me.
So I imagine we'll have some checksum.exs
file that looks like this:
%{
"sqlite-vss-elixir-macos-x86_64.tar.gz" => "sha256:d36805150e...2c3d25c7e",
"sqlite-vss-elixir-macos-aarch.tar.gz" => "sha256:50383f244c...50383f244c",
"sqlite-vss-elixir-linux-x86_64.tar.gz" => "sha256:3e61118a...8443b34d"
}
Some questions:
"GET"
request to download these packages happen? Is it when a user adds sqlite_vss
to their deps()
and runs mix hex.install
, or at runtime when sqlite_vss
is ran for the first time?If you're able to get a skeleton elixir package PR up, don't worry about the checksums.exs
file, I think we can generate that at release-time in a github actions runner. But I'm not sure what the mix.exs
or the lib/sqlite_vss.ex
files should look like
When exactly does the "GET" request to download these packages happen? Is it when a user adds sqlite_vss to their deps() and runs mix hex.install, or at runtime when sqlite_vss is ran for the first time?
The former where a user will run a mix sqlite_vss.install
command.
Likewise, does the checksum check also happen at install-time?
Exactly! The approach would emulate the Rustler precompiled project.
For the actual GET request, does that require any additional dependencies, or is there one built into elixir we can use? And is certfile a builtin package?
The approach will use :httpc
which is built in to Elixir. For the cacertfile we can use the :castore
library which makes keeping your castore up to date easier.
For reference https://github.com/asg017/sqlite-vss/pull/33
For more context: The exqlite lib similar to the python lib can load extensions like
alias Exqlite.Basic {:ok, conn} = Basic.open("db.sqlite3") :ok = Basic.enable_load_extension(conn) # load the regexp extension - https://github.com/nalgeon/sqlean/blob/main/docs/re.md Basic.load_extension(conn, ExSqlean.path_for("re"))
To make this work with ecto we have to do something like
config :sqlite_init, SqliteInit.Repo, database: Path.expand("../sqlite_init_dev.db", Path.dirname(__ENV__.file)), pool_size: 5, after_connect: fn _conn -> [db_conn] = Process.get(:"$callers") db_connection_state = :sys.get_state(db_conn) conn = db_connection_state.mod_state.state IO.inspect(conn, label: "conn") :ok = Exqlite.Basic.enable_load_extension(conn) Exqlite.Basic.load_extension(conn, ExLitedb.path_for("re")) end, stacktrace: true, show_sensitive_data_on_connection_error: true
@asg017 Are we feeling that loading the extension into ecto (the defacto Elixir orm) is outside the scope of what the lib should do?
Just wanted to point this PR out ... I actually committed this back to exqlite specifically to enable usage of this lib in a more convenient way. The after_connect stuff is not needed, just a load_extensions
key at the repo or global config. If the package directly exposes a path to the embedded libs, this becomes as simple as adding the hex package, and adding the load_extensions block with the helper as the path. See this comment for an example. I was originally thinking about shipping a hex package for exactly the same purpose.
Haven't been through the full comment thread here yet, apologies if I missed something.
@bkono Thanks for pointing this out! load_extensions
is the way to go and I'll be refactoring my production app to use that API. @asg017 I wasn't sure if we wanted to keep this as bare bones as possible and not include a reference to Ecto in the README?
@trodrigu we can include an Ecto reference/code snippet in the README!
kk @trodrigu #33 is merged and sqlite_vss
is now published, can you let me know if it works as expected?
A few more lingering questions I have:
def install()
from lib/sqlite_vss.ex
to mix/tasks/sqlite_vss.install.ex
? I'd ideally like to keep sqlite_vss.ex
as small as possible and only contain def loadable_path_vector0()
and def loadable_path_vss0()
, but I'm not sure if the rest has to be in sqlite_vss.ex
sqlite-vss-checksum.exs
checksum checks at install-time?Does it make sense to move def install() from lib/sqlite_vss.ex to mix/tasks/sqlite_vss.install.ex? I'd ideally like to keep sqlite_vss.ex as small as possible and only contain def loadable_path_vector0() and def loadable_path_vss0(), but I'm not sure if the rest has to be in sqlite_vss.ex
Yes, this would be a worthy refactor. It makes sense to move the install related functions into the mix task file.
Do you think it's worth doing the sqlite-vss-checksum.exs checksum checks at install-time?
After reading through elixir_make I think it should be easy to do this actually. Another worthy refactor.
The published sqlite_vss package gets installed successfully in my production app on fly.io. Great work!
I'd like to bundle
sqlite-vss
as an Elixir package and distribute it on hex. Ideally Elixir developers should be able to run:Existing Language Binding notes
The Python/Node.js bindings (and the future Ruby binding I'm working on) all do similar things:
.dylib
/.so
/.dll
) inside the language package (the.wheel
file for python, the bundled.tar.gz
package for Node.js, and the.gem
file for Ruby.gem
files withgem bundle --platform x86_64-darwin
etcloadable_path
function that returns the path to the compiled extension. Also maybe aload()
function that wraps a sqlite3 db connection that just callsdb.load_extension(loadable_path())
like the python apiOpen questions
sqlite-vss
package that has different "wheels" for all our platforms, or do we need to publish separatesqlite-vss-macos-x86_64
/sqlite-vss-linux-x86_64
packages for every platform?def project
? This seems to work for me:Specifically the
files: ["priv/*.dylib"]
part. How can we make it work for multiple platforms?