python-trio / trio-websocket

WebSocket client and server implementation for Python Trio
MIT License
70 stars 26 forks source link

install fails under PyPy #119

Closed belm0 closed 5 years ago

belm0 commented 5 years ago

~I believe the wsaccel dependency pulls in autobahn, which in turn fails under PyPy due to https://github.com/crossbario/autobahn-python/issues/968.~

The wsaccel dependency fails under PyPy:

$ pip_pypy3 install trio_websocket
Collecting trio_websocket
  Downloading https://files.pythonhosted.org/packages/1d/5b/40c701f1ddadf1b8e055b2378fe992acd43de1c3e72a07807f327fe5f390/trio-websocket-0.7.0.tar.gz
Requirement already satisfied: async_generator<2,>=1.10 in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from trio_websocket) (1.10)
Collecting ipaddress<2,>=1.0.22 (from trio_websocket)
  Using cached https://files.pythonhosted.org/packages/fc/d0/7fc3a811e011d4b388be48a0e381db8d990042df54aa4ef4599a31d39853/ipaddress-1.0.22-py2.py3-none-any.whl
Requirement already satisfied: trio>=0.11 in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from trio_websocket) (0.11.0)
Collecting wsaccel<0.7,>=0.6.2 (from trio_websocket)
  Using cached https://files.pythonhosted.org/packages/52/46/9ef0744b434ac723ae3aeddcb92ab29af14f8912f53f47590b5a0db0b9d6/wsaccel-0.6.2.tar.gz
Collecting wsproto<0.15,>=0.14 (from trio_websocket)
  Downloading https://files.pythonhosted.org/packages/6a/e2/d69646c06cc970393e23e7506cb7ec3cec98e1f54ca5c5fac6efc15f67de/wsproto-0.14.1-py2.py3-none-any.whl
Collecting yarl<2,>=1.2.6 (from trio_websocket)
  Using cached https://files.pythonhosted.org/packages/fb/84/6d82f6be218c50b547aa29d0315e430cf8a23c52064c92d0a8377d7b7357/yarl-1.3.0.tar.gz
Requirement already satisfied: attrs>=18.2.0 in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from trio>=0.11->trio_websocket) (19.1.0)
Requirement already satisfied: sortedcontainers in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from trio>=0.11->trio_websocket) (2.1.0)
Requirement already satisfied: idna in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from trio>=0.11->trio_websocket) (2.8)
Requirement already satisfied: outcome in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from trio>=0.11->trio_websocket) (1.0.0)
Requirement already satisfied: sniffio in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from trio>=0.11->trio_websocket) (1.1.0)
Requirement already satisfied: contextvars>=2.1; python_version < "3.7" in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from trio>=0.11->trio_websocket) (2.4)
Collecting h11>=0.8.1 (from wsproto<0.15,>=0.14->trio_websocket)
  Downloading https://files.pythonhosted.org/packages/5a/fd/3dad730b0f95e78aeeb742f96fa7bbecbdd56a58e405d3da440d5bfb90c6/h11-0.9.0-py2.py3-none-any.whl (53kB)
    100% |████████████████████████████████| 61kB 1.4MB/s
Collecting multidict>=4.0 (from yarl<2,>=1.2.6->trio_websocket)
  Downloading https://files.pythonhosted.org/packages/7f/8f/b3c8c5b062309e854ce5b726fc101195fbaa881d306ffa5c2ba19efa3af2/multidict-4.5.2.tar.gz (105kB)
    100% |████████████████████████████████| 112kB 4.1MB/s
Requirement already satisfied: immutables>=0.9 in /usr/local/Cellar/pypy3/7.0.0/libexec/site-packages (from contextvars>=2.1; python_version < "3.7"->trio>=0.11->trio_websocket) (0.9)
Installing collected packages: ipaddress, wsaccel, h11, wsproto, multidict, yarl, trio-websocket
  Running setup.py install for wsaccel ... error
    Complete output from command /usr/local/Cellar/pypy3/7.0.0/bin/pypy3 -u -c "import setuptools, tokenize;__file__='/private/var/folders/ww/51xbs7nx6t16x7mh225q9x1h0000gn/T/pip-install-dc6tdoqk/wsaccel/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/var/folders/ww/51xbs7nx6t16x7mh225q9x1h0000gn/T/pip-record-8umwani7/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build/lib.macosx-10.13-x86_64-3.6
    creating build/lib.macosx-10.13-x86_64-3.6/wsaccel
    copying wsaccel/__init__.py -> build/lib.macosx-10.13-x86_64-3.6/wsaccel
    running build_ext
    building 'wsaccel.utf8validator' extension
    creating build/temp.macosx-10.13-x86_64-3.6
    creating build/temp.macosx-10.13-x86_64-3.6/wsaccel
    gcc -pthread -arch x86_64 -DNDEBUG -O2 -fPIC -I/usr/local/Cellar/pypy3/7.0.0/libexec/include -c wsaccel/utf8validator.c -o build/temp.macosx-10.13-x86_64-3.6/wsaccel/utf8validator.o
    wsaccel/utf8validator.c:235:11: warning: 'PyMethod_New' macro redefined [-Wmacro-redefined]
      #define PyMethod_New(func, self, klass) ((self) ? PyMethod_New(func, self) : PyInstanceMethod_New(func))
              ^
    /usr/local/Cellar/pypy3/7.0.0/libexec/include/pypy_decl.h:487:9: note: previous definition is here
    #define PyMethod_New PyPyMethod_New
            ^
    wsaccel/utf8validator.c:983:49: warning: passing 'void **' to parameter of type 'const void **' discards qualifiers in nested pointer types [-Wincompatible-pointer-types-discards-qualifiers]
      __pyx_t_1 = PyObject_AsReadBuffer(__pyx_v_ba, ((void **)(&__pyx_v_buf)), (&__pyx_v_buf_len)); if (unlikely(__pyx_t_1 == -1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 104; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
                                                    ^~~~~~~~~~~~~~~~~~~~~~~~~
    /usr/local/Cellar/pypy3/7.0.0/libexec/include/object.h:333:64: note: passing argument to parameter here
    PyAPI_FUNC(int) PyObject_AsReadBuffer(PyObject *, const void **, Py_ssize_t *);
                                                                   ^
    2 warnings generated.
    gcc -pthread -shared -undefined dynamic_lookup build/temp.macosx-10.13-x86_64-3.6/wsaccel/utf8validator.o -o build/lib.macosx-10.13-x86_64-3.6/wsaccel/utf8validator.pypy3-70-darwin.so
    clang: warning: argument unused during compilation: '-pthread' [-Wunused-command-line-argument]
    building 'wsaccel.xormask' extension
    gcc -pthread -arch x86_64 -DNDEBUG -O2 -fPIC -I/usr/local/Cellar/pypy3/7.0.0/libexec/include -c wsaccel/xormask.c -o build/temp.macosx-10.13-x86_64-3.6/wsaccel/xormask.o
    wsaccel/xormask.c:235:11: warning: 'PyMethod_New' macro redefined [-Wmacro-redefined]
      #define PyMethod_New(func, self, klass) ((self) ? PyMethod_New(func, self) : PyInstanceMethod_New(func))
              ^
    /usr/local/Cellar/pypy3/7.0.0/libexec/include/pypy_decl.h:487:9: note: previous definition is here
    #define PyMethod_New PyPyMethod_New
            ^
    wsaccel/xormask.c:2165:36: error: no member named 'curexc_traceback' in 'struct _ts'
            PyObject* tmp_tb = tstate->curexc_traceback;
                               ~~~~~~  ^
    wsaccel/xormask.c:2168:21: error: no member named 'curexc_traceback' in 'struct _ts'
                tstate->curexc_traceback = tb;
                ~~~~~~  ^
    1 warning and 2 errors generated.
    error: command 'gcc' failed with exit status 1

    ----------------------------------------
Command "/usr/local/Cellar/pypy3/7.0.0/bin/pypy3 -u -c "import setuptools, tokenize;__file__='/private/var/folders/ww/51xbs7nx6t16x7mh225q9x1h0000gn/T/pip-install-dc6tdoqk/wsaccel/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/var/folders/ww/51xbs7nx6t16x7mh225q9x1h0000gn/T/pip-record-8umwani7/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/ww/51xbs7nx6t16x7mh225q9x1h0000gn/T/pip-install-dc6tdoqk/wsaccel/
njsmith commented 5 years ago

Judging from the error message, I don't think this has anything to do with autobahn... that issue you linked to just notes that wsaccel is broken on pypy, as part of the justification for autobahn switching away from using wsaccel.

It looks like there are two problems:

In the short term, I think the solution is just...... don't use wsaccel? wsproto works fine without wsaccel; it just falls back on a pure-python implementation of XOR-masking. Which on PyPy might work fine anyway, so this might also be the long-term solution :-). In fact the autobahn source code claims that their pure-Python XOR masking code is faster on PyPy than wsaccel's "optimized" code. Their code is significantly fancier than wsproto's fallback code, so I don't know how fast that goes on PyPy.

Recommendation: tweak trio-websocket's install_requires so that it only installs wsaccel when running on CPython. Something like: "wsaccel >=0.6.2, <0.7; implementation_name == 'cpython'"

belm0 commented 5 years ago

FWIW, last time I checked my app with and without wsaccel I didn't discover a difference: https://github.com/HyperionGray/trio-websocket/issues/32#issuecomment-423180088

I can't believe that an apparently little-used, unmaintained package is still relevant to Python ws performance.

njsmith commented 5 years ago

I guess you're only dealing with small messages? The websocket protocol requires XOR-masking on every client→server packet, and if you're doing it in Python then this means looping over bytes, which has absurdly high overhead on CPython. On my laptop with 10 KB messages, wsproto's Python fallback burns ~1 ms of CPU time per message just for masking, but with wsaccel this drops to 17 µs. That's a ~60x speedup. Or put another way, without wsaccel, a fast modern CPU can't hope to push more than ~10 MB/s of websocket traffic, and that's if it's doing nothing else at all.

If we don't want a little-used, unmaintained package to be relevant to Python ws performance, then we need to like, maintain it or make a replacement or something :-). It won't magically become irrelevant on its own.

njsmith commented 5 years ago

Did a bit of followup reading... I guess part of why wsaccel is somewhat unmaintained is that it's kinda slow and projects like aiohttp and tornado have switched to their own private speedup modules:

image

That graph's from this blog post, which shows how to at least get the orange and green lines using pure Python. So wsproto could get surprisingly close to wsaccel speed using pure Python; or if it used a better-tuned native extension then it could go roughly one order of magnitude faster than that.

Native extensions are a major hassle to maintain (you have to make wheels etc.), so I'm not sure if it's worth bothering. If we do then tornado's version seems simple and clear:

https://github.com/tornadoweb/tornado/blob/master/tornado/speedups.c

If I were doing it for real I'd probably use cffi to wrap that core logic. And maybe follow chromium in adding a version that follows the same pattern but uses int __attribute__((vector_size (16))), for a modest SIMD speedup. (chromium code, docs for __attribute__((vector_size)))

belm0 commented 5 years ago

Thank you-- I'll have to double check wsaccel on my app. I'll also file the suggestion to wsproto.

belm0 commented 5 years ago

Native extensions are a major hassle to maintain (you have to make wheels etc.), so I'm not sure if it's worth bothering.

I'd consider having an optional numba implementation for the CPython case.

njsmith commented 5 years ago

Numba pulls in numpy and llvm. Altogether it's about 40 MB download, 164 MB on disk. This seems excessive for a single 10 line C function.

Actually numpy on its own can probably handle the masking pretty well, but I don't think people would want to pull in numpy just for this.

belm0 commented 5 years ago

Altogether it's about 40 MB download, 164 MB on disk

Thank you, hadn't considered that. Where I've used numba before the project already had the dependencies.