Open semihsalihoglu-uw opened 6 months ago
I looked into DuckDB a bit. They install extensions to ~/.duckdb/extensions/{DuckDB Version}/{System Architecture}/{Extension Name}.duckdb_extension.
When the user tries to load an extension, the released DuckDB binary looks for it based on {DuckDB Version} and {System Architecture}. If a matching binary is found, it can be loaded. When users bump their DuckDB version, they always have to reinstall all their extensions.
I think we do need to keep track of the version of locally installed extensions. Instead of installing to ~/.kuzu/extension/, we should install to ~/.kuzu/extension/{Extension Version}/{System Architecture}.
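To make the proposed layout concrete, here is a minimal sketch of how the versioned path could be assembled. The helper names and the architecture string used in the usage note are hypothetical, and the lib<name>.kuzu_extension naming is taken from the libhttpfs.kuzu_extension example mentioned in this thread; this is not Kuzu's actual implementation.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: builds the versioned install directory proposed above,
// <home>/.kuzu/extension/<extension version>/<system architecture>.
fs::path extensionInstallDir(const fs::path& home, const std::string& extensionVersion,
    const std::string& arch) {
    // Both components must be non-empty, otherwise two levels of the layout collapse.
    assert(!extensionVersion.empty() && !arch.empty());
    return home / ".kuzu" / "extension" / extensionVersion / arch;
}

// Hypothetical helper: full path of one extension binary inside that directory.
// The lib<name>.kuzu_extension naming mirrors libhttpfs.kuzu_extension.
fs::path extensionBinaryPath(const fs::path& home, const std::string& extensionVersion,
    const std::string& arch, const std::string& name) {
    return extensionInstallDir(home, extensionVersion, arch) /
           ("lib" + name + ".kuzu_extension");
}
```

For example, extensionBinaryPath("/home/u", "0.2.6", "linux_amd64", "httpfs") yields /home/u/.kuzu/extension/0.2.6/linux_amd64/libhttpfs.kuzu_extension, where "linux_amd64" is only a placeholder for whatever architecture string ends up being used.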
But I am not sure whether we should make the extension version the same as the Kuzu version or keep the current approach of a separate version string. The advantage of the current approach is that for minor releases, users may not need to reinstall their extensions.
For extensions released for internal testing, let's make a rule of having a dev-x prefix. I'll periodically purge them (other than the latest one) from the repo.
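The purge rule could be sketched roughly as follows. The exact tag format is not spelled out in this thread, so the code assumes tags of the form "dev-<integer>" (for example "dev-12") and treats the highest integer as the latest build; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Assumption: internal builds are tagged "dev-<non-negative integer>", e.g. "dev-12".
// Returns the integer part, or std::nullopt for anything else (such as "0.2.6").
std::optional<unsigned long> parseDevNumber(const std::string& version) {
    const std::string prefix = "dev-";
    if (version.size() <= prefix.size() ||
        version.compare(0, prefix.size(), prefix) != 0) {
        return std::nullopt;
    }
    unsigned long number = 0;
    for (std::size_t i = prefix.size(); i < version.size(); ++i) {
        if (version[i] < '0' || version[i] > '9') {
            return std::nullopt;
        }
        number = number * 10 + static_cast<unsigned long>(version[i] - '0');
    }
    return number;
}

// Returns the dev versions that can be purged: every dev build except the latest.
std::vector<std::string> devVersionsToPurge(const std::vector<std::string>& versions) {
    // First pass: find the highest dev number present, if any.
    std::optional<unsigned long> latest;
    for (const auto& v : versions) {
        const auto n = parseDevNumber(v);
        if (n && (!latest || *n > *latest)) {
            latest = n;
        }
    }
    // Second pass: collect every dev build that is not the latest one.
    std::vector<std::string> purge;
    for (const auto& v : versions) {
        const auto n = parseDevNumber(v);
        if (!n) {
            continue; // release versions are never purged
        }
        assert(latest.has_value()); // set in the first pass whenever a dev tag exists
        if (*n != *latest) {
            purge.push_back(v);
        }
    }
    return purge;
}
```

Comparing the numeric part rather than the raw string keeps "dev-10" newer than "dev-2".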
Added extension version to local path in #3354
I checked DuckDB; they don't have a repo to host extensions for dev builds.
Mechanism to actually validate that the right extension version is installed: More importantly, we currently do not have any mechanism to check whether an extension version is valid for the running Kuzu version. The -DKUZU_EXTENSION_VERSION flag seems to be used only when installing an extension, not when checking whether an extension already exists. I say this because it looks like there is a single directory ${xyz}/extension under which we store all the extension binaries, so the directory into which we store extensions does not have a version number in its name. Further, the extension binaries do not have version numbers in their names (e.g., regardless of the extension version number, all httpfs extension binaries have the name libhttpfs.kuzu_extension). So we'll get unclear error messages like symbol not found when blindly dynamically linking against these. Chang tested this and can expand on this.
I think this would be pretty hard. I don't think there is a mechanism that allows us to check the library version before actually loading the lib. @mewim Do you have any ideas on how to check the version of the lib?
After changing the path to ~/.kuzu/extension/{Extension Version}/{System Architecture}, Kuzu should only load extensions that it can load, and will throw an extension-not-found error if it cannot find the corresponding version. This behavior is the same as DuckDB's.
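A rough sketch of that lookup, assuming the versioned directory layout described in this comment: only the directory matching the running extension version and architecture is consulted, and a missing binary produces an explicit error instead of a failed dynamic link. The function name, error text, and the lib<name>.kuzu_extension naming are illustrative assumptions.

```cpp
#include <cassert>
#include <filesystem>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical lookup: returns the binary to load for `name`, considering only
// the directory that matches the running extension version and architecture.
// Throws if no binary was installed for exactly that combination.
fs::path resolveExtension(const fs::path& extensionRoot,
    const std::string& extensionVersion, const std::string& arch,
    const std::string& name) {
    assert(!name.empty());
    const fs::path candidate =
        extensionRoot / extensionVersion / arch / ("lib" + name + ".kuzu_extension");
    if (!fs::is_regular_file(candidate)) {
        // Binaries installed for other versions live in sibling directories and
        // are never considered, so a stale binary cannot be linked by accident.
        throw std::runtime_error("Extension not found: " + name + " (expected at " +
                                 candidate.string() + ")");
    }
    return candidate;
}
```

With this shape, a user who bumps their Kuzu version sees an error naming the expected path, rather than an unclear symbol not found message from the dynamic linker.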
We need to think through how we will maintain extension versions and require users to reinstall them as they bump up or down their Kuzu versions. Specifically we need to make the following decisions:
Global or extension-specific versions: We currently set add_definitions(-DKUZU_EXTENSION_VERSION="0.2.6"), so our binary knows about a single "global extension version". An alternative could be to have "extension-specific versions", that is, different extension versions for different extensions. For example, httpfs could require v0.1.0 while postgres scanner could require v0.2.6. Both options have pros and cons. A single global extension version means that as users change their Kuzu version, they would be forced to reinstall every extension, which may be OK, even if the previous extension binary they had is the same binary. It also means that we rebuild and re-release every extension binary in each of our non-minor releases. Extension-specific versions would require users to reinstall an extension only if it is strictly necessary, that is, when the previous extension version will not work with the new Kuzu version.
Mechanism to actually validate that the right extension version is installed: More importantly, we currently do not have any mechanism to check whether an extension version is valid for the running Kuzu version. The -DKUZU_EXTENSION_VERSION flag seems to be used only when installing an extension, not when checking whether an extension already exists. I say this because it looks like there is a single directory ${xyz}/extension under which we store all the extension binaries, so the directory into which we store extensions does not have a version number in its name. Further, the extension binaries do not have version numbers in their names (e.g., regardless of the extension version number, all httpfs extension binaries have the name libhttpfs.kuzu_extension). So we'll get unclear error messages like symbol not found when blindly dynamically linking against these. Chang tested this and can expand on this.
We need to make decisions on these points. We should do our research, look into what other systems are doing, and try to make a more informed decision.
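The two versioning options can be contrasted with a small sketch. Everything here is illustrative: the function names are hypothetical, the per-extension table simply reuses the httpfs v0.1.0 and postgres scanner v0.2.6 numbers from the example above, the "postgres_scanner" key is a made-up identifier, and the fallback #define only stands in for the KUZU_EXTENSION_VERSION compile definition.

```cpp
#include <cassert>
#include <map>
#include <string>

// In a real build this comes from -DKUZU_EXTENSION_VERSION; the fallback value
// exists only so that the sketch compiles on its own.
#ifndef KUZU_EXTENSION_VERSION
#define KUZU_EXTENSION_VERSION "0.2.6"
#endif

// Option 1: a single global extension version shared by every extension.
std::string requiredVersionGlobal(const std::string& /*extensionName*/) {
    return KUZU_EXTENSION_VERSION;
}

// Option 2: each extension pins the version it requires (illustrative table).
std::string requiredVersionPerExtension(const std::string& extensionName) {
    static const std::map<std::string, std::string> required = {
        {"httpfs", "0.1.0"},
        {"postgres_scanner", "0.2.6"},
    };
    const auto it = required.find(extensionName);
    // Unknown extensions fall back to the global version in this sketch.
    return it != required.end() ? it->second : std::string(KUZU_EXTENSION_VERSION);
}

// A reinstall is needed only when the installed version differs from the
// version the running binary requires.
bool needsReinstall(const std::string& installed, const std::string& required) {
    assert(!required.empty());
    return installed != required;
}
```

Under these assumptions, a user with httpfs 0.1.0 installed would have to reinstall it under the global scheme, since needsReinstall("0.1.0", requiredVersionGlobal("httpfs")) is true, but not under the per-extension scheme. The cost of the second option is that someone has to maintain the table and bump an entry whenever an extension stops being compatible.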