gtfierro / xboswave

eXtensible Building Operating System built on WAVE and WAVEMQ
BSD 3-Clause "New" or "Revised" License
10 stars 6 forks

Fetching data using pyxbos over GRPC fails #13

Closed anandkp92 closed 5 years ago

anandkp92 commented 5 years ago

I've created server (mortar.ent, serverproof.pem) and client (client.ent, clientproof.pem) entities and proofs to query data using pyxbos, following the instructions mentioned here.

$ wv verify serverproof.pem 
Referenced attestations:
 [00] Hash: GyAiKGVKtH2nVeYTXxUauroo7vnsavFvtvWZQNDFwWdjFg==
Paths:
 [00] 00
Subject: GyBJlCoeeB_ooFTV-IOf1rBbAgWLqakTcoUpxormr1yfvg==
SubjectLoc: default
Expires: 2019-06-08 12:00:35 -0700 PDT
Policy: RTree
 Namespace: <namespace-hash>
 Indirections: 0
 Statements:
 [00] Permission set: GyC5wUUGKON6uC4gxuH6TpzU9vvuKHGeJa1jUr4G-j_NbA==
      Permissions: serve_grpc
      URI: mortar/Mortar/*
$ wv verify clientproof.pem 
Referenced attestations:
 [00] Hash: GyAkPFCNR0CXM1RH5UucgNeZrVmpcaiY6D7ZSk9biNv0xQ==
Paths:
 [00] 00
Subject: GyDdeCnf8wmp8WeuoVtK_xqXCVhjHmUg3y8IS02mYPbtHw==
SubjectLoc: default
Expires: 2019-06-08 12:04:50 -0700 PDT
Policy: RTree
 Namespace: <namespace-hash>
 Indirections: 0
 Statements:
 [00] Permission set: GyC5wUUGKON6uC4gxuH6TpzU9vvuKHGeJa1jUr4G-j_NbA==
      Permissions: call_grpc
      URI: mortar/Mortar/*

I've changed the FetchRequest in the [example](https://github.com/gtfierro/xboswave/blob/master/python/examples/mortartest.py) file to fetch a uuid stored in my influx database.

However, when I run python mortartest.py, I get the following errors:

~/xboswave/python/examples$ python mortartest.py 
Listening on ('localhost', 5005)
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/solarplus/xboswave-1/python/pyxbos/pyxbos/mortard.py", line 104, in get_client_connection
    self.setup_connection()
  File "/home/solarplus/xboswave-1/python/pyxbos/pyxbos/mortard.py", line 72, in setup_connection
    hs = self.upstream_connection.handshakeClientCert()
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/tlslite/tlsconnection.py", line 308, in handshakeClientCert
    for result in handshaker:
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/tlslite/tlsconnection.py", line 324, in _handshakeClientAsync
    for result in self._handshakeWrapperAsync(handshaker, checker):
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/tlslite/tlsconnection.py", line 1777, in _handshakeWrapperAsync
    for result in handshaker:
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/tlslite/tlsconnection.py", line 418, in _handshakeClientAsyncHelper
    for result in self._clientGetServerHello(settings, clientHello):
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/tlslite/tlsconnection.py", line 559, in _clientGetServerHello
    HandshakeType.server_hello):
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/tlslite/tlsrecordlayer.py", line 722, in _getMsg
    raise TLSRemoteAlert(alert)
tlslite.errors.TLSRemoteAlert: unexpected_message

Traceback (most recent call last):
  File "mortartest.py", line 53, in <module>
    res = client.fetch(req)
  File "/home/solarplus/xboswave-1/python/pyxbos/pyxbos/mortard.py", line 225, in fetch
    for a in self._client.Fetch(request):
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/grpc/_channel.py", line 363, in __next__
    return self._next()
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/grpc/_channel.py", line 357, in _next
    raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "Connection reset by peer"
    debug_error_string = "{"created":"@1557531414.765540810","description":"Error received from peer ipv4:127.0.0.1:5005","file":"src/core/lib/surface/call.cc","file_line":1041,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
gtfierro commented 5 years ago

Can you attach the code that produces the error?

Make sure you set up the client the same way we do here

anandkp92 commented 5 years ago

This is the code that I'm running:

from pyxbos.mortard import MortarClient
import pymortar
import time

client = MortarClient({
    'namespace': "<namespace-hash>",
    'wave': 'localhost:777',
    'entity': 'client.ent',
    'prooffile': 'clientproof.pem',
    'grpcservice': 'mortar/Mortar/*',
    'address': 'localhost:4587',
})

req = pymortar.FetchRequest(
    sites=["blr"],
    dataFrames=[
        pymortar.DataFrame(
            name="meter_data",
            aggregation=pymortar.MEAN,
            window="5m",
            uuids=["<uuid-from-our-db>"],
        )
    ],
    time=pymortar.TimeParams(
        start="2019-01-01T00:00:00Z",
        end="2019-06-01T00:00:00Z",
    )
)
s = time.time()
res = client.fetch(req)
e = time.time()
print("took {0}".format(e-s))
print(res)

I think mortard.MortarClient does set up the WAVEGRPCClient as you described.

gtfierro commented 5 years ago

I can't reproduce the error using this code. Are you using Python 2 or Python 3?

anandkp92 commented 5 years ago

Python 3.5; I created a conda environment to run this. Could it be some sort of firewall issue?

gtfierro commented 5 years ago

This is all over localhost, so firewalls shouldn't affect it. Can you check that you have tlslite-ng installed?

pip install tlslite-ng

If you have a packaged tlslite installed you should remove it.
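Since both packages install under the same `tlslite` import name, the import alone won't tell you which one you have. A quick sketch for checking the installed distribution names via setuptools' `pkg_resources` (assuming setuptools is available; the function name here is illustrative):

```python
# List installed distributions whose name starts with "tlslite", to
# distinguish the legacy "tlslite" package from "tlslite-ng" -- both
# provide the same `tlslite` import module.
import pkg_resources


def installed_dists(prefix="tlslite"):
    """Return sorted names of installed distributions matching `prefix`."""
    return sorted(
        d.project_name
        for d in pkg_resources.working_set
        if d.project_name.lower().startswith(prefix.lower())
    )


print(installed_dists())  # the desired state is only 'tlslite-ng'
```

If the list contains plain `tlslite`, uninstall it before (re)installing `tlslite-ng`.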

gtfierro commented 5 years ago

I added the correct tlslite-ng package as a dependency in pyxbos 0.2.2

anandkp92 commented 5 years ago

Yes, I'd installed the wrong tlslite instead of tlslite-ng! Also upgraded to pyxbos 0.2.2.

That worked!

But now there is a thread that doesn't exit:

$ python mortartest.py 
Listening on ('localhost', 5005)
new client call
Done with call
                           ccd06506-a4df-59f2-85eb-5e5e121f7c89
2019-05-09 04:25:00+00:00                             55.342500
2019-05-09 04:30:00+00:00                             55.277857
2019-05-09 04:35:00+00:00                             55.172667
2019-05-09 04:40:00+00:00                             54.112000
2019-05-09 04:45:00+00:00                             53.435714
took 2.4387152194976807
<pymortar.result.Result: views:n/a dataframes:1 timeseries:1 vals:5>

^CException ignored in: <module 'threading' from '/home/solarplus/anaconda3/envs/python35/lib/python3.5/threading.py'>
Traceback (most recent call last):
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/threading.py", line 1292, in _shutdown
    t.join()
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/threading.py", line 1054, in join
    self._wait_for_tstate_lock()
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/threading.py", line 1070, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt

And on the xbosmortard side (seems like there are no errors):

...
INFO[2019-05-13T13:33:34-07:00] Done loading Brick. Took 969.083231ms        
INFO[2019-05-13T13:33:34-07:00] Connected to InfluxDB!                       
INFO[2019-05-13T13:33:34-07:00] <|influx ts stage|>                          
INFO[2019-05-13T13:33:34-07:00] <| brick stage |>                            
INFO[2019-05-13T13:33:34-07:00] <| api frontend wave auth stage |>           
INFO[2019-05-13T13:33:34-07:00] get output                                   
2019/05/13 13:33:36 [INFO] Server verifying server handshake <namespace-hash>
ERRO[2019-05-13T13:33:37-07:00] Got Error in ret <nil>                       
INFO[2019-05-13T13:33:37-07:00] Fetch took 86.291774ms   

My suspicion is that we close the gRPC connection but this loop keeps running.

What do you think?

gtfierro commented 5 years ago

I'll take a look at that later today. In the prototype client I uploaded, something was terminating the GRPC connection each time, so I attempted to re-establish it. What probably needs to happen is to move the setup_connection call out of the infinite loop so that we only establish that once.
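The restructure described above can be sketched as follows. This is a hypothetical illustration, not the actual `mortard.py` code; the class and method names (`ProxySketch`, `setup_connection`, `serve`) are made up for the example:

```python
# Sketch: establish the upstream connection once, before the serving
# loop, instead of re-handshaking on every iteration.
class ProxySketch:
    def __init__(self):
        self.connected = False
        self.handshakes = 0

    def setup_connection(self):
        # Stands in for the expensive TLS/WAVE handshake.
        self.handshakes += 1
        self.connected = True

    def serve(self, requests):
        self.setup_connection()  # moved OUT of the loop: one handshake total
        for req in requests:
            yield self.handle(req)

    def handle(self, req):
        assert self.connected
        return req.upper()


proxy = ProxySketch()
results = list(proxy.serve(["a", "b", "c"]))
print(proxy.handshakes)  # 1, regardless of how many requests were served
```

With `setup_connection` inside the loop, `handshakes` would equal the number of requests, which matches the "re-establish each time" behavior described above.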

anandkp92 commented 5 years ago

OK. And I stand corrected - the loop seems to be terminating, but yes, maybe the connection still remains. Thanks!

gtfierro commented 5 years ago

I forgot to daemonize the threads in the client. Can you upgrade to pyxbos 0.2.3 and try again? Unfortunately, connection re-use is still not working so the latency is super high (for now)
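For context on why the missing daemonization caused the hang: at interpreter shutdown, Python joins all non-daemon threads, so a background thread stuck in an infinite loop blocks exit until Ctrl-C (the `threading._shutdown` traceback in the earlier output). A minimal illustration, unrelated to the pyxbos code itself:

```python
# A non-daemon thread in an infinite loop keeps the process alive at
# exit; marking it daemon=True lets the interpreter shut down cleanly.
import threading
import time


def background():
    while True:
        time.sleep(0.1)


t = threading.Thread(target=background, daemon=True)
t.start()
print(t.daemon)  # True: the main thread can now exit without joining t
```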

anandkp92 commented 5 years ago

That didn't work.

$ python mortartest.py 
Listening on ('localhost', 5005)
Done with call
Traceback (most recent call last):
  File "mortartest.py", line 53, in <module>
    res = client.fetch(req)
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/pyxbos/mortard.py", line 222, in fetch
    for a in self._client.Fetch(request):
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/grpc/_channel.py", line 363, in __next__
    return self._next()
  File "/home/solarplus/anaconda3/envs/python35/lib/python3.5/site-packages/grpc/_channel.py", line 346, in _next
    raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "Socket closed"
    debug_error_string = "{"created":"@1557782540.272304202","description":"Error received from peer ipv4:127.0.0.1:5005","file":"src/core/lib/surface/call.cc","file_line":1041,"grpc_message":"Socket closed","grpc_status":14}"
>

Now there is an error on the grpc server as well:

$ ./xbosmortard 
...
2019/05/13 14:22:18 [INFO] Server verifying server handshake <namespace-hash>
ERRO[2019-05-13T14:22:20-07:00] timing out on waiting for result in fetch
INFO[2019-05-13T14:22:20-07:00] Fetch took 1.087840719s

Could this be due to a small timeout for fetch? This is the line in your code that prints that error.

gtfierro commented 5 years ago

If you follow the definition for the timeout, you'll find that it's around 60 minutes, so it's unlikely to be that :). The culprit was that I didn't finish fixing the influxdb stage implementation when I changed how mortar handles client request contexts. If you update xboswave, the updated mortar stage should be in there now. Let me know if you still see the issue after it updates to mortar v1.1.0-alpha4

anandkp92 commented 5 years ago

Perfect! That worked! Thanks a lot.

You're right about the latency, but this is great to move ahead.