django / asgi_ipc

IPC-based ASGI channel layer
BSD 3-Clause "New" or "Revised" License

Unable to use layer inside docker container. #26

Closed: proofit404 closed this issue 7 years ago

proofit404 commented 7 years ago

This is a really weird situation.

I hit this while trying to fix the asgi_rabbitmq build. This is what I get when I try to use the IPC layer inside a Docker container.

$ docker-compose run --rm py36dj111 /code/.tox3.6.1/py36-django111/bin/python -i
Starting asgirabbitmq_rabbitmq_1 ... done
Python 3.6.1 (default, Mar 23 2017, 02:34:11)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from asgi_ipc import IPCChannelLayer
>>> layer = IPCChannelLayer()
>>> layer.send('foo', {'baz': 'bar'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/code/.tox3.6.1/py36-django111/lib/python3.6/site-packages/asgi_ipc/core.py", line 61, in send
    channel_size = self.message_store.length(channel)
  File "/code/.tox3.6.1/py36-django111/lib/python3.6/site-packages/asgi_ipc/store.py", line 134, in length
    with self.mutate_value() as value:
  File "/usr/local/lib/python3.6/contextlib.py", line 82, in __enter__
    return next(self.gen)
  File "/code/.tox3.6.1/py36-django111/lib/python3.6/site-packages/asgi_ipc/store.py", line 50, in mutate_value
    value = pickle.load(self.mmap)
_pickle.UnpicklingError: invalid load key, '\x00'.
>>> layer = IPCChannelLayer()
>>> layer.flush()
>>> layer.send('foo', {'baz': 'bar'})

As you can see, send succeeds only after flush has been called once. I can't reproduce this behavior outside of a Docker container; it works fine on my machine. But Travis uses Docker for its builds, so exactly the same error happens in CI.

https://travis-ci.org/proofit404/asgi_rabbitmq/jobs/236879296#L4355-L4356

This is curious, because the asgi_ipc and asgi_redis builds passed without any trouble.

Any suggestions for further investigation?

andrewgodwin commented 7 years ago

I had another report of it working "very slowly" inside Docker this week as well, to the point where I suspect it was not actually working - it's possible something about the IPC communication does not agree with Docker? Based on your error, it looks like the shared memory is not working correctly, but I don't really know how to proceed.

proofit404 commented 7 years ago

I can confirm that this error happens only on Python 3.

The Python 2 build doesn't have this problem. The Python 3 build does.

andrewgodwin commented 7 years ago

Hm. Maybe try changing the pickle format and see if that affects it?

proofit404 commented 7 years ago

But in the traceback above, load is called before dump. So if I understand correctly, pickle tries to read from uninitialized memory and fails. We could set the errors mode to something different, or try to flush the layer memory on the first UnpicklingError.

andrewgodwin commented 7 years ago

Ah, it tries to unpickle it and then fails out at EOFError if it's empty - it seems in this case the memory is zeroed but has a length, so it probably just needs to check if it's empty as well.
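The difference between the two cases described here can be reproduced without asgi_ipc or Docker. The following is a minimal sketch for Python 3 using only the standard library; `try_load` is a hypothetical helper written for this illustration, not part of asgi_ipc. It feeds pickle an empty buffer and then a zero-filled buffer with a non-zero length, and records which exception each one raises.

```python
import io
import pickle


def try_load(raw):
    """Return the name of the exception pickle.load raises on raw, or None.

    Hypothetical helper for illustration only.
    """
    try:
        pickle.load(io.BytesIO(raw))
    except Exception as exc:
        return type(exc).__name__
    return None


# A zero-length buffer: pickle runs out of input immediately.
empty_result = try_load(b"")

# A zeroed buffer that has a length: pickle reads the byte 0x00,
# which is not a valid opcode.
zeroed_result = try_load(b"\x00" * 16)

print(empty_result)
print(zeroed_result)
```

A handler that only catches EOFError covers the first case but not the second, which matches the traceback at the top of this issue.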

proofit404 commented 7 years ago

Sorry, I can't find a way to check whether the memory map or shared memory is empty.

andrewgodwin commented 7 years ago

Well, all pickles written with protocol 2 or later start with a 0x80 opcode, and 0x00 is not a valid pickle opcode at all, so checking if the first byte of the mmap is 0x00 should be enough.
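A rough sketch of that first-byte check, assuming an anonymous memory map stands in for the layer's shared memory. The names `load_value` and `store_value` are hypothetical and this is not the code that ended up in asgi_ipc; it only illustrates the idea of treating a zero first byte as "nothing stored yet".

```python
import mmap
import pickle


def load_value(mm, default):
    """Unpickle the value stored at the start of mm.

    Returns default when the region is empty or still zero-filled,
    i.e. when nothing has been written to it yet.
    """
    mm.seek(0)
    first = mm.read(1)
    if first in (b"", b"\x00"):
        return default
    mm.seek(0)
    return pickle.load(mm)


def store_value(mm, value):
    """Pickle value to the start of mm (protocol 2 output starts with 0x80)."""
    mm.seek(0)
    pickle.dump(value, mm, protocol=2)


# An anonymous mapping is zero-filled, like the memory seen in this issue.
mm = mmap.mmap(-1, 4096)
fresh = load_value(mm, default={})
store_value(mm, {"foo": [{"baz": "bar"}]})
reloaded = load_value(mm, default={})
mm.close()
```

With this check, the first read from untouched memory returns the default instead of raising, so no explicit flush is needed before the first send.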

proofit404 commented 7 years ago

If the build succeeds with this fix, I'll open a PR.

proofit404 commented 7 years ago

Yep, looks like the problem is solved.

andrewgodwin commented 7 years ago

Great! Closing this then.