golemfactory / golem-messages

shared module for formatting and parsing messages (Golem & Concent)
GNU General Public License v3.0
2 stars 8 forks source link

InvalidSignature sometimes raised when decoding a message with a FrozenDict on Python 3.5.2 #70

Closed Jakub89 closed 6 years ago

Jakub89 commented 6 years ago

There is a problem with using datastructures.FrozenDict with our FileTransferToken. It produce un-deterministic errors golem_messages.exceptions.InvalidSignature. One idea is that dict changes the order of those three values. Also that bug appear when tested on Python 3.5.2, but disappear with new 3.6. Script we use for testing:

import datetime

from golem_messages.shortcuts       import dump, load
from golem_messages.message         import FileTransferToken, TokenFilesData

CONCENT_PUBLIC_KEY  = b'\xf3\x97\x19\xcdX\xda\x86tiP\x1c&\xd39M\x9e\xa4\xddb\x89\xb5,]O\xd5cR\x84\xb85\xed\xc9\xa17e,\xb2s\xeb\n1\xcaN.l\xba\xc3\xb7\xc2\xba\xff\xabN\xde\xb3\x0b\xa6l\xbf6o\x81\xe0;'
CONCENT_PRIVATE_KEY = b'l\xcdh\x19\xeb$>\xbcG\xa1\xc7v\xe8\xd7o\x0c\xbf\x0e\x0fM\x89lw\x1e\xd7K\xd6Hnv$\xa2'

message_timestamp = 1546171200
token                                 = FileTransferToken(timestamp=message_timestamp)
token.token_expiration_deadline       = message_timestamp + 3600
token.storage_cluster_address         = 'http://storage.concent.golem.network/'
token.authorized_client_public_key    = b'\xf3\x97\x19\xcdX\xda\x86tiP\x1c&\xd39M\x9e\xa4\xddb\x89\xb5,]O\xd5cR\x84\xb85\xed\xc9\xa17e,\xb2s\xeb\n1\xcaN.l\xba\xc3\xb7\xc2\xba\xff\xabN\xde\xb3\x0b\xa6l\xbf6o\x81\xe0;'

token.files             = TokenFilesData()
token.files['path']     = 'blender/benchmark/test_task/scene-Helicopter-27-cycles.blend'
token.files['checksum'] = '098f6bcd4621d373cade4e832627b4f6'
token.files['size']     = 1024
token.operation         = 'download'

serialize_token = dump(token, CONCENT_PRIVATE_KEY, CONCENT_PUBLIC_KEY)

loaded_golem_message = load(serialize_token, CONCENT_PRIVATE_KEY, CONCENT_PUBLIC_KEY, check_time = False)

golem.messages version:

-e git+git@github.com:Code-Poets/golem-messages.git@434ea988aeb83ed0976c55685245e213f2f65c1a#egg=Golem_Messages
cameel commented 6 years ago

@jiivan We've seen this happen in our tests but only sometimes. We have distilled the issue into the script posted by @Jakub89 above. If you run it a few times, you should get InvalidSignature, though not every time.

We've seen it on @Jakub89's system (Ubuntu) with Python 3.5.2 and in a Debian Stretch container running Python 3.5 on the cluster. It does not seem to happen on systems running Python 3.6.

@rwrzesien Points out that there were some changes in hashlib in Python 3.6. Specifically related to SHA3 implementation. See hashlib > Hash Algorithms in Python 3.6 docs. But I don't think Golem is using SHA3 so this does not seem to be the issue.

In any case, the immediate source of the failure is that the public key extracted from the message and the signature inside ecdsa_verify() differs from the original key.

Gr33ny commented 6 years ago

@jiivan I prepared a Dockerfile with configuration to test it.

Dockerfile:

FROM debian:stretch

RUN apt-get --assume-yes update
RUN apt-get --assume-yes install --no-install-recommends python3-pip \
                                                         python3-virtualenv \
                                                         virtualenv gcc \
                                                         libssl1.0.2 \
                                                         libssl1.0-dev \
                                                         python3-dev \
                                                         git

COPY test.py                                   /tmp/
COPY install-golem-messages.sh           /tmp/
RUN  /tmp/install-golem-messages.sh

CMD ["/srv/http/virtualenv/bin/python", "/tmp/test.py"]

install-golem-messages.sh:

#!/bin/bash -e

virtualenv_dir=/srv/http/virtualenv
python_version=3.5

# Create virtualenv and Django app's dependencies
virtualenv --python python3 "$virtualenv_dir"
source "$virtualenv_dir/bin/activate"
pip install --upgrade pip
pip install -e git://github.com/Code-Poets/golem-messages.git@434ea988aeb83ed0976c55685245e213f2f65c1a#egg=Golem_Messages

# pyelliptic is unmaintained and does not work with OpenSSL 1.1.
# We have the 1.0 version of libssl installed but we must patch pyelliptic to use it.
sed                                                                                          \
    "s%ctypes.util.find_library('crypto')%'/usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.2'%"   \
    --in-place $virtualenv_dir/lib/python$python_version/site-packages/pyelliptic/openssl.py

Take the @Jakub89 script test, name it "test.py" and save in the same directory where you locate the Dockerfile and install-golem-messages.sh.

Use below command to build and run it.

docker build -t <NAME> <PATH>
docker run -it <NAME>

For example:

docker build -t test .
docker run -it test

where PATH is the directory with Dockerfile, test - "test.py" and script - "install-golem-messages.sh".

This error is not regular. Sometimes you must lauch the command:

docker run -it <NAME>

several times to get an issue:

Traceback (most recent call last): File "/tmp/test.py", line 24, in loaded_golem_message = load(serialize_token, CONCENT_PRIVATE_KEY, CONCENT_PUBLIC_KEY, check_time = False) File "/srv/http/virtualenv/src/golem-messages/golem_messages/shortcuts.py", line 18, in load cryptography.ecdsa_verify(pubkey, msg.sig, msg.get_short_hash()) File "/srv/http/virtualenv/src/golem-messages/golem_messages/cryptography.py", line 96, in ecdsa_verify raise exceptions.InvalidSignature() golem_messages.exceptions.InvalidSignature

cameel commented 6 years ago

@jiivan We managed to reproduce it in a Docker container posted by @Gr33ny above. Unfortunately only on this branch based on v1.2.1: https://github.com/Code-Poets/golem-messages/tree/token-test

We're still having this problem on v1.4.1, though in a somewhat inconsistent manner and we did not manage to create a container that would reproduce it reliably (we're still trying). This often results in problems with decoding golem messages in versions that we deploy to the cluster and the client gets HTTP 400 even though its message is valid.

The branch where it fails had an old version of FileTransferToken that was later amended. We suspect that the problem might be related to serializing FrozenDict.

Could you run the container and check why it's failing on this older version?

jiivan commented 6 years ago

When comparing 434ea988aeb83ed0976c55685245e213f2f65c1a with master I can see that it's before #67 This could fail because timestamps were floats and were divided and multiplied.

If it still happens on 1.4.1 I'll investigate on Monday.

jiivan commented 6 years ago

I confirm this issue on python3.5. It can be spotted easily as signature is different on every invocation. In python3.6 signature is stable.

mfranciszkiewicz commented 6 years ago

A new feature introduced in Python 3.6 is that when you iterate over a dict, item order is always the same. So the issue lies in messages containing dicts, which are not sorted prior to serialization.

Previously, a binary _payload variable was used for signature verification, which bypassed that problem.

jiivan commented 6 years ago

Probably related to this change: bpo-27350: dict implementation is changed like PyPy. It is more compact and preserves insertion order. (Concept developed by Raymond Hettinger and patch by Inada Naoki.)

(thanks @mfranciszkiewicz for pointing me to dict changes in 3.6)

Fix on the way.