Thriftpy / thriftpy2

Pure python approach of Apache Thrift.
MIT License
572 stars 91 forks source link

Added Support For Apache JSON and Binary Data #139

Closed JonnoFTW closed 3 years ago

JonnoFTW commented 4 years ago

13 Added support for apache json, so that apache thrift server/clients can talk to thriftpy2 server/clients.

136 Added support for binary data fields, I'm not sure what ramifications this would have on python2.

codecov[bot] commented 4 years ago

Codecov Report

Merging #139 into master will increase coverage by 0.91%. The diff coverage is 88.49%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #139      +/-   ##
==========================================
+ Coverage   83.14%   84.05%   +0.91%     
==========================================
  Files          43       44       +1     
  Lines        3921     4133     +212     
==========================================
+ Hits         3260     3474     +214     
+ Misses        661      659       -2     
Impacted Files Coverage Δ
thriftpy2/contrib/aio/rpc.py 84.44% <ø> (ø)
thriftpy2/contrib/aio/protocol/binary.py 55.73% <21.42%> (-2.07%) :arrow_down:
thriftpy2/contrib/aio/protocol/compact.py 60.17% <50.00%> (-0.82%) :arrow_down:
thriftpy2/protocol/binary.py 89.25% <76.47%> (-1.06%) :arrow_down:
thriftpy2/protocol/compact.py 84.39% <89.47%> (+2.13%) :arrow_up:
thriftpy2/protocol/apache_json.py 97.51% <97.51%> (ø)
thriftpy2/protocol/__init__.py 87.50% <100.00%> (+0.83%) :arrow_up:
thriftpy2/protocol/json.py 96.63% <100.00%> (+13.88%) :arrow_up:
thriftpy2/thrift.py 88.88% <100.00%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 17a66b5...89c4cb1. Read the comment docs.

ethe commented 4 years ago

Some CI has failed, could you please fix it?

JonnoFTW commented 4 years ago

@ethe don't merge yet, I have some problems in my own codebase where TApacheJSONProtocol does not work with THttpClient

JonnoFTW commented 4 years ago

@ethe feel free to check it out now

ethe commented 4 years ago

Good job and thank you, if we can confirm that it does not change features which already exist, I think we can merge it right now.

JonnoFTW commented 4 years ago

@ethe I updated asyncio protocols to support binary fields and so far all the tests work. Might be a need to do some interoperability tests with other implementations.

aisk commented 4 years ago

Hi @JonnoFTW I have a question, is the changes for #136 will make a incompatible change for current users for both python2 & 3? If so, I think we should and a flag to enable this feature, for we don't breaking current users' codes in minor changes.

aisk commented 4 years ago

Hi @JonnoFTW I have a question, is the changes for #136 will make a incompatible change for current users for both python2 & 3? If so, I think we should and a flag to enable this feature, for we don't breaking current users' codes in minor changes.

Hi @JonnoFTW what about this question?

JonnoFTW commented 3 years ago

@aisk tests pass for python2 and python3 so I don't think it breaks and existing code since the old tests still pass with the assumption that they have coverage for STRING and BINARY thrift types.

aisk commented 3 years ago

@aisk tests pass for python2 and python3 so I don't think it breaks and existing code since the old tests still pass with the assumption that they have coverage for STRING and BINARY thrift types.

Really sorry for the late of reply. I tried to understand the changes for #136 and took a dig and found there may have some breaking changes.

For example, before this PR, if we have a thrift file that returns a binary field, and it's content is valid utf-8 contents like "hello", the user will got an unicode in Python2 (str in Python3), but after this PR, he will got str in Python (bytes in Python3). If the content only contains ASCII, maybe we can use str / unicode mixed up, but for other cases this will brings trouble. If this breaking change is true, I think we should add a global switch to enable this behavior.

I think maybe our test case does not covered this case, if you have any doubt about this, maybe I can add some tests for this.

But for now, I think it's cool if you can split this PR into two, the changes for #13 looks good to me.

JonnoFTW commented 3 years ago

I've created a small test repo to check that my changes don't break backwards compatibility with Apache Thrift's python implementation (and hopefully other versions too).

https://github.com/JonnoFTW/thrift_interop_test/blob/master/tests/test_interop.py

aisk commented 3 years ago

Hi @JonnoFTW I approved this PR, thanks!

JonnoFTW commented 3 years ago

@ethe are you happy to merge now? All tests pass along with those in my interoperability repo.

ethe commented 3 years ago

Merged, thank you for your patience and contribution.

dmulter commented 3 years ago

I've been using Thriftpy2 with a private Thrift server for a long time now, and this latest addition has broken things for me. Sorry I can't share the Thrift files, but here's the error when making an RPC call:

.virtualenv/lib/python3.8/site-packages/thriftpy2/thrift.py:219: in _req
    return self._recv(_api)
.virtualenv/lib/python3.8/site-packages/thriftpy2/thrift.py:231: in _recv
    fname, mtype, rseqid = self._iprot.read_message_begin()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   cybin.ProtocolError: No protocol version header

thriftpy2/protocol/cybin/cybin.pyx:454: ProtocolError
JonnoFTW commented 3 years ago

@dmulter are you able to provide a test case to recreate this? Also did you install from pypi or from git?

dmulter commented 3 years ago

Normally I would provide a test case, but I'm not sure I can in this case. I'm not responsible for the server, it's complex, and under NDA. It uses TCompactProtocol and TMemoryBuffer transport. I will see what I can do. It just started failing when I did the update.

I install your package from PyPI. I do a pretty standard make_client and the first Thrift function I call fails with this error. I've seen this error is typical when the transport is wrong somehow. I'm sure you hear this all the time, but nothing else has changed and I've produced numerous production releases of my client for many years.

JonnoFTW commented 3 years ago

Can you try installing from git?

dmulter commented 3 years ago

Fails the same way.

JonnoFTW commented 3 years ago

@dmulter can you please give a minimum working example or just tell me exactly what combination of client/server protocols and transports you are using on each end. I'll add it to the tests and see if I can recreate the problem.

ethe commented 3 years ago

@dmulter could you please provide a sample IDL file? we need to reproduce it.

JonnoFTW commented 3 years ago

@dmulter I'm unable to replicate the problem you are having, here's my test case (it still passes if I use TCompactProtocol):

Spec:

struct BinarySpec {
    1: map<string, binary> str2bin,
    2: map<string, string> str2str,
}

service BinaryService {
    BinarySpec run(1: BinarySpec b_spec);
}
import time
from thriftpy2 import load
from thriftpy2.transport.memory import TCyMemoryBuffer
from thriftpy2.protocol import cybin
from thriftpy2.thrift import TType
from thriftpy2.rpc import make_client, make_server
from multiprocessing import Process

def test_tcompact_memory_buffer():
    spec = load("binary_test.thrift")

    factory = cybin.TCyBinaryProtocolFactory()
    obj = spec.BinarySpec(
        str2bin={"key": b"Binary Value"},
        str2str={"key": "String Value"}
    )

    class Handler:
        def run(self, b_spec):
            print("\nServer received:")
            print(b_spec)
            return b_spec

    def run_server():
        server = make_server(
            service=spec.BinaryService,
            handler=Handler(),
            proto_factory=factory,
        )
        server.serve()

    proc = Process(target=run_server)
    try:
        proc.start()
        time.sleep(0.5)
        client = make_client(spec.BinaryService, proto_factory=factory)
        res = client.run(obj)
        assert res
        print("Server sent:")
        print(res)
    finally:
        proc.kill()
    time.sleep(0.2)

Gives the output:

Server received:
BinarySpec(str2bin={'key': b'Binary Value'}, str2str={'key': 'String Value'})
Server sent:
BinarySpec(str2bin={'key': b'Binary Value'}, str2str={'key': 'String Value'})

If you could modify this test so that it fails in the same way your code does, that would be very useful in fixing the issue.

dmulter commented 3 years ago

Some additional status:

xxx.thrift:

enum ApiVersion
{
    kApiVersion1 = 1;
    kApiVersion2;
    kApiVersion3;
}
const ApiVersion kServerApiVersion_Current = ApiVersion.kApiVersion2;

service PrivateService
{
    void openSession
        (
        1: string sessionId,
        2: ApiVersion clientVersion
        )
}

test_xxx.py:

import time
import sys
from thriftpy2 import load
from thriftpy2.rpc import make_client, make_server
from multiprocessing import Process

class Handler:
    def openSession(self, sessionId, clientVersion):
        print(f"\nServer received: sessionId={sessionId} clientVersion={clientVersion}")
        sys.stdout.flush()

def run_server():
    print("Starting server...")
    server = make_server(
        service=spec.PrivateService,
        handler=Handler(),
    )
    server.serve()

spec = load("xxx.thrift")

def test_private_service():
    proc = Process(target=run_server)
    try:
        proc.start()
        time.sleep(0.5)

        print("Calling openSession...")
        client = make_client(service=spec.PrivateService)
        session_id = "xxx"
        client.openSession(session_id, spec.kServerApiVersion_Current)
        assert False, "tests actually worked" # so we can see print output
    finally:
        proc.kill()
    time.sleep(0.2)
dmulter commented 3 years ago

Installed Cython and built from source. Currently experimenting with what's returned from the server. Not sure if it helps, but the size value it expects for the protocol version header has a value of 3c68746d. Trying to get the full server response...

dmulter commented 3 years ago

Found the problem and it was not this PR, but a different change on 0.4.13. The protocol error was because HTML for an error page was being returned. The breaking change was due to an inserted path separator after port as I'm passing a parsed URL as follows:

    from urllib.parse import urlparse

    url = urlparse(full_url)
    client = make_client(
        service=spec.TestService,
        scheme=url.scheme,
        host=url.netloc,
        path=url.path,
        port="443"
    )

And the change was https://github.com/Thriftpy/thriftpy2/pull/148/commits/faf10551188fc23e3293c526e1142abbdf2a9d14

JonnoFTW commented 3 years ago

@ethe considering that the @dmulter 's issue didn't come from this pull request, can we merge it back in? The slash problem can be fixed separately.

ethe commented 3 years ago

@JonnoFTW yeah your merge does not break compact protocol, however, there is another problem, see #156