msgpack / msgpack-python

MessagePack serializer implementation for Python msgpack.org[Python]
https://msgpack.org/

Distribute a pure-Python build #576

Closed jvolkman closed 6 months ago

jvolkman commented 9 months ago

I've seen a number of issues (usually involving newer versions of Python or different architectures) where the solution is to set MSGPACK_PUREPYTHON=1 to build a pure-Python wheel. I'm currently working on a use case where I can only support pure-Python wheels, and msgpack is the only package in my transitive dependency set that doesn't provide one.

Given that a pure-Python version can be built, it would be nice if it were also distributed via PyPI. pip should prefer a platform-specific native wheel if a compatible one is found, and fall back to the pure-Python wheel only if not.
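To illustrate the fallback behaviour described above, here is a minimal sketch of tag-based wheel selection. It is a simplified model, not pip's actual code: the helper names `parse_wheel_tags` and `pick_wheel` are hypothetical, the `supported` list stands in for the interpreter's prioritised tag list, and the wheel filenames (including the `py3-none-any` one) are illustrative.

```python
def parse_wheel_tags(filename):
    """Expand a wheel filename into its set of (python, abi, platform) tags."""
    stem = filename.removesuffix(".whl")
    # name-version[-build]-python-abi-platform: the last three fields are tags.
    py, abi, plat = stem.split("-")[-3:]
    # Each field may be a dot-separated "compressed" set of tags.
    return {
        (p, a, pl)
        for p in py.split(".")
        for a in abi.split(".")
        for pl in plat.split(".")
    }


def pick_wheel(filenames, supported):
    """Return the wheel matching the earliest entry in `supported`, or None.

    `supported` is assumed to be ordered from most to least preferred, with
    the generic ("py3", "none", "any") tag near the end.
    """
    rank = {tag: i for i, tag in enumerate(supported)}
    best, best_rank = None, None
    for name in filenames:
        ranks = [rank[t] for t in parse_wheel_tags(name) if t in rank]
        if ranks and (best_rank is None or min(ranks) < best_rank):
            best, best_rank = name, min(ranks)
    return best


if __name__ == "__main__":
    wheels = [
        "msgpack-1.0.7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
        "msgpack-1.0.7-py3-none-any.whl",  # hypothetical pure-Python wheel
    ]
    # A platform with a matching native wheel picks the native build.
    print(pick_wheel(wheels, [("cp311", "cp311", "manylinux_2_17_x86_64"),
                              ("py3", "none", "any")]))
    # A platform without one falls back to the pure-Python wheel.
    print(pick_wheel(wheels, [("cp313", "cp313", "linux_riscv64"),
                              ("py3", "none", "any")]))
```

Because the generic tag sits at the end of the priority list, publishing a pure-Python wheel would only change the outcome on platforms where no native wheel matches.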

methane commented 9 months ago

This problem is not specific to msgpack. All libraries with a speedup module have the same issue. Do you know of any other major library that does this successfully?

jvolkman commented 9 months ago

Hi @methane,

I agree that the issue of extension module incompatibility isn't specific to msgpack. All packages that ship native modules deal with this. But what is relatively special to msgpack is that it can optionally build a fallback, pure-python wheel.

I've found a handful of other popular libraries with a similar situation.

  1. simplejson - perhaps most relevant to this project - ships a fallback wheel which you can see here. Its README says "It is pure Python code with no dependencies, but includes an optional C extension for a serious speed boost". You can see here in its github workflow where it builds the pure-python wheel outside of cibuildwheel.
  2. SQLAlchemy - a very popular Python ORM (pypi)
  3. Older versions of pydantic before 2.x (pypi)

There are others. If you're curious, you can query this dataset with something like:

select distinct (project_name, project_version)
from pypi p1 join pypi p2 using (project_name, project_version)
where position(p1.project_release, 'py3-none-any') != 0
and splitByChar('-', p2.project_release)[4] = 'cp311'
order by project_name, project_version
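
For readers who would rather see the logic of that query outside SQL, here is a rough Python equivalent. It assumes each row is a (project_name, project_version, project_release) tuple in which project_release is the wheel filename. The function name `releases_with_fallback` is hypothetical, and the 1-based array indexing is an assumption based on the query's apparent use of ClickHouse functions.

```python
from collections import defaultdict


def releases_with_fallback(rows, abi="cp311"):
    """Return sorted (project, version) pairs that ship both a py3-none-any
    wheel and a wheel whose fourth dash-separated field equals `abi`."""
    # flags[key] = [has_pure_python_wheel, has_native_wheel]
    flags = defaultdict(lambda: [False, False])
    for project, version, filename in rows:
        key = (project, version)
        if "py3-none-any" in filename:
            flags[key][0] = True
        parts = filename.split("-")
        # Assuming 1-based arrays in the query, [4] there is parts[3] here.
        if len(parts) > 3 and parts[3] == abi:
            flags[key][1] = True
    return sorted(k for k, (pure, native) in flags.items() if pure and native)
```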

methane commented 6 months ago

I tried it, but I concluded that this is a bad idea for msgpack. Performance is a must-have for some msgpack users, and opting in with MSGPACK_PUREPYTHON is easier than avoiding a pure-Python wheel.
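
As a rough illustration of the opt-in, the sketch below reports which implementation an installed msgpack is actually using. It assumes the msgpack 1.x package layout, where the C extension lives in `msgpack._cmsgpack` and the pure-Python code in `msgpack.fallback`, and that the package also consults MSGPACK_PUREPYTHON at import time. The helper name `implementation_of` is hypothetical.

```python
import os


def implementation_of(obj):
    """Classify an object by the module that defines it.

    The module names are an assumption about msgpack 1.x's layout.
    """
    module = getattr(obj, "__module__", "") or ""
    if module == "msgpack.fallback":
        return "pure-python"
    if module == "msgpack._cmsgpack":
        return "c-extension"
    return "unknown"


if __name__ == "__main__":
    # The variable must be set before msgpack is first imported.
    os.environ["MSGPACK_PUREPYTHON"] = "1"
    try:
        import msgpack
    except ImportError:
        print("msgpack is not installed")
    else:
        print(implementation_of(msgpack.Packer))
```

A check like this makes it easy to confirm that a performance-sensitive deployment has not silently ended up on the slower implementation.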