coleifer / pysqlite3

SQLite3 DB-API 2.0 driver from Python 3, packaged separately, with improvements
zlib License

Provide source and pre-built binaries in single PyPI entry #71

Status: Closed (scottcode closed this 7 months ago)

scottcode commented 7 months ago

Thanks for sharing this useful project to the open-source community!

Currently, this project populates the Python Package Index (PyPI) with two separate "packages": pysqlite3 which only contains the source distribution, and pysqlite3-binary which only contains pre-built binaries and not the source distribution.

A typical pattern for many other packages is to provide both source and binary distributions in the same PyPI entry, because then pip natively tries to use an appropriate binary if available but falls back to building from the source distribution if not. It would be helpful if this project followed that approach, especially since not all operating systems and CPU architectures are covered. As it stands, my project wants to use pysqlite3, but development and deployments happen on different kinds of machines and OSs, so we must either depend only on the source package or add install-time logic that decides when to specify pysqlite3 and when to specify pysqlite3-binary.

coleifer commented 7 months ago

The reason there are two packages is right there in the readme: https://github.com/coleifer/pysqlite3?tab=readme-ov-file#using-the-binary-package

The pysqlite3 package builds a package that can either link against your system libsqlite3 or you can do an amalgamation/statically-linked build.

The pysqlite3-binary package always builds a self-contained / statically-linked extension module.
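To see which SQLite library a given build is actually linked against, one option is to ask the library itself. The following is a minimal sketch, not something taken from the readme: `load_sqlite` is a hypothetical helper name, and it assumes that both PyPI entries install the same importable module name, `pysqlite3`. If that module is not installed, it falls back to the stdlib `sqlite3`.

```python
import sqlite3 as stdlib_sqlite3


def load_sqlite():
    """Return pysqlite3 if it is importable, otherwise the stdlib sqlite3.

    Hypothetical helper for illustration only.
    """
    try:
        # Assumption: pysqlite3 and pysqlite3-binary both install this module name.
        import pysqlite3 as sqlite
    except ImportError:
        sqlite = stdlib_sqlite3
    return sqlite


sqlite = load_sqlite()
conn = sqlite.connect(":memory:")
# Ask the linked SQLite library for its own version string.
(runtime_version,) = conn.execute("SELECT sqlite_version()").fetchone()
conn.close()

print(sqlite.__name__, runtime_version)
```

Running this once with the system-linked build and once with the statically-linked build should make any difference in the bundled SQLite version visible.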

scottcode commented 7 months ago

@coleifer, Thanks for getting back on this. The readme is still a little unclear when it comes to the behavior of pip installing. While it is clearly stated that it's possible to pip install pysqlite3-binary and you would get a statically linked version, the readme does not explicitly describe a pip install pysqlite3 option. The non-binary options instead describe cloning the repo and running python setup.py build variants.

For anyone who wants to ensure that what they are getting is a custom binary they are configuring and building themselves, I would argue they would go the route of cloning and doing those local build steps instead of running pip install pysqlite3. Moreover, the statically-linked build option can't be selected directly as part of pip install pysqlite3 anyway.

People who are using a pip install ... approach probably want something that just works, and they don't (yet) care what version of sqlite is linked. (If and once they do care, then they'd clone and build.) If you add the pre-built binaries to the main pysqlite3 package, then the package would be more readily usable by more non-power users under more circumstances. Besides, pip has --no-binary and --only-binary arguments that would allow even power users to achieve the same control as in the current split-package setup (see the pip docs for --no-binary). I think a combined package would be more accessible to more users.
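As a rough illustration of the control pip already offers, the sketch below builds (but does not run) the pip command line for three cases: let pip decide, force a source build with `--no-binary`, or require a wheel with `--only-binary`. The helper name `pip_install_args` and the `mode` values are hypothetical, and the example assumes a combined package that ships both an sdist and wheels, which is not how pysqlite3 is published today.

```python
import sys


def pip_install_args(package: str, mode: str = "auto") -> list[str]:
    """Build a pip command line; hypothetical helper for illustration.

    mode="auto":   pip prefers a compatible wheel, else builds from the sdist.
    mode="source": force a build from source via --no-binary <package>.
    mode="binary": refuse to build; require a wheel via --only-binary <package>.
    """
    args = [sys.executable, "-m", "pip", "install"]
    if mode == "source":
        args += ["--no-binary", package]
    elif mode == "binary":
        args += ["--only-binary", package]
    elif mode != "auto":
        raise ValueError(f"unknown mode: {mode!r}")
    args.append(package)
    return args


# Print the arguments after the interpreter path for each mode.
for mode in ("auto", "source", "binary"):
    print(" ".join(pip_install_args("pysqlite3", mode)[1:]))
```

The list can be handed to `subprocess.run` if the install should actually be executed; it is only printed here so the sketch has no side effects.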

coleifer commented 7 months ago

The purpose of this project is to provide advanced users with a bit of additional control for how they use Sqlite from Python. I don't know why someone would want to blithely "pip install pysqlite3" when the stdlib sqlite3 exists, unless they were already knowledgeable about the functionality of the stdlib implementation, or they already understood that they wanted to do a special build of some kind.

scottcode commented 7 months ago

In that case, sounds like most (advanced) users of pysqlite3 wouldn't be using the pip install pysqlite3-binary either. They'd always be building from source, and most likely not via pip at all. We might be outliers, so I get if this thread doesn't rise to the level of motivating any changes. I'll describe how we came to your package in case it helps.

The reason my project started considering pysqlite3 was this troubleshooting tip from Chroma DB. It billed pysqlite3-binary as a way of ensuring you're using a more current version of sqlite in case of compatibility errors. That solved the errors we were facing, but only when running on Linux and x86_64. For our Windows and Apple Silicon environments it didn't work since the pysqlite3-binary doesn't cover them. The only option for consistency would be to switch to pysqlite3, but then it would require the local binary build even when running on OSs and architectures that have a prebuilt binary in the other package listing. It's possible to manage as long as all environments have a compatible compiler installed and configured, though it does slow down python environment builds.
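One way to manage the split today is to choose the PyPI name per platform. The sketch below is an assumption-laden illustration: it takes the wheel coverage described in this comment (Linux on x86_64 only) at face value, which may not match what is currently published, and the function name `pysqlite3_requirement` is hypothetical. The `REQUIREMENT_LINES` strings express the same rule as PEP 508 environment markers, which pip evaluates at install time in a requirements file.

```python
import platform


def pysqlite3_requirement(system: str, machine: str) -> str:
    """Return the PyPI name to install for an OS / CPU pair.

    Assumption from this thread: pre-built wheels exist only for
    Linux on x86_64; everything else must build from source.
    """
    if system == "Linux" and machine == "x86_64":
        return "pysqlite3-binary"
    return "pysqlite3"


# The same rule written as requirement lines with environment markers.
REQUIREMENT_LINES = [
    'pysqlite3-binary; sys_platform == "linux" and platform_machine == "x86_64"',
    'pysqlite3; sys_platform != "linux" or platform_machine != "x86_64"',
]

# Show the choice for the machine running this script.
print(pysqlite3_requirement(platform.system(), platform.machine()))
```

Either form still requires a working compiler on the platforms that fall through to the source package, so it addresses the naming problem rather than the build-time cost.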

Our situation may deviate from the core audience you envisioned, @coleifer. Just wanted to call out the extra cross-platform challenge created by having separate PyPI packages in situations like ours. My recommendation would be to add the binaries to the main pysqlite3 listing, but obviously you and other maintainers are the ones with skin in the game. We haven't gone too far in entrenching pysqlite3 as a dependency, so my project can either remove it as a dependency or find some other way to make it manageable across platforms.