substrait-io / duckdb-substrait-extension

MIT License

Nightly Distribution #108

Closed (pdet closed this 2 months ago)

pdet commented 2 months ago

This allows the substrait extension to be built and distributed on every push to main.

It is currently tagged with DuckDB v1.1; eventually we should also allow it for DuckDB main.

Using DuckDB v1.1, you should be able to get the substrait main branch extension with:

FORCE INSTALL substrait FROM core_nightly

pdet commented 2 months ago

cc @epsilonprime, this should allow testing the main branch of substrait with DuckDB v1.1.

We still need some tweaks to also release for the dev version of DuckDB (which will matter for v1.2 and other major versions), but this should already make testing easier for v1.1 and v1.1.1.

EpsilonPrime commented 2 months ago

Thanks, I'll try it out!

EpsilonPrime commented 2 months ago

So to use with Python I should do something like the following?

        self._connection = duckdb.connect(
            config={
                "max_memory": "100GB",
                "allow_unsigned_extensions": "true",
                "autoinstall_extension_repository": "http://nightly-extensions.duckdb.org",
                "custom_extension_repository": "http://nightly-extensions.duckdb.org",
                "temp_directory": str(Path(".").resolve()),
            }
        )
        self._connection.install_extension("substrait")
        self._connection.load_extension("substrait")

pdet commented 2 months ago

@carlopi do you know how to set the extensions repo through the connection config?

I confess I tested it with:

import duckdb
con = duckdb.connect()
con.execute("FORCE INSTALL substrait FROM core_nightly")
con.execute("LOAD substrait")
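
A slightly more defensive variant of the same two statements, as a minimal sketch. The helper name `install_and_load` and the identifier check are illustrative assumptions, not part of DuckDB or of this extension; the helper only relies on the connection exposing `execute`, as the object returned by `duckdb.connect()` does.

```python
import re

# Hypothetical helper (not part of DuckDB): runs the same two statements
# shown above on any connection object that exposes execute().
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def install_and_load(con, extension="substrait", repository="core_nightly"):
    # Both names are interpolated into SQL text, so accept plain identifiers only.
    for name in (extension, repository):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    statements = [
        f"FORCE INSTALL {extension} FROM {repository}",
        f"LOAD {extension}",
    ]
    for statement in statements:
        con.execute(statement)
    # Return the statements that were run, which is handy for logging.
    return statements
```

With DuckDB v1.1 this would presumably be called as `install_and_load(duckdb.connect())`; the install step needs network access to reach the nightly repository.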

carlopi commented 2 months ago

@Tishj improved the install_extension function in this PR: https://github.com/duckdb/duckdb/pull/13876, but that's not part of 1.1.0 since he can't time travel (yet).

At the moment the only supported way is via SQL, as suggested by @pdet:

con.execute("FORCE INSTALL substrait FROM core_nightly")

Note that extension INSTALLs (and LOADs?) are not really transaction aware, so it should be equivalent (I think) whether performed at the connection level or on the duckdb main object.

The semantics are roughly (but please @pdet do correct / edit what's wrong):

EpsilonPrime commented 2 months ago

Thanks! I'll use that method for now.

Tishj commented 2 months ago

After the mentioned PR lands, this should be possible through

con.install_extension('substrait', repository='core_nightly', force_install=True)

Which is entirely equivalent to:

con.execute("FORCE INSTALL substrait FROM core_nightly")
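
For code that has to run both before and after that PR is available, one possible approach is to try the keyword form first and fall back to SQL. This is only a sketch: `install_nightly` is a hypothetical helper, and it assumes that a build without the new keywords rejects the call with a `TypeError`, which has not been verified against every DuckDB release.

```python
def install_nightly(con, extension="substrait", repository="core_nightly"):
    """Hypothetical helper: prefer the keyword form, fall back to SQL."""
    try:
        # Keyword form quoted above; only accepted once the PR has landed.
        con.install_extension(extension, repository=repository, force_install=True)
        return "python-api"
    except TypeError:
        # Assumption: older builds reject the unknown keyword with TypeError.
        con.execute(f"FORCE INSTALL {extension} FROM {repository}")
        return "sql"
```

The return value only reports which path was taken; either way the extension still has to be loaded afterwards, for example with `con.execute("LOAD substrait")`.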