aristanetworks / cloudvision-python

Python resources and libraries for integrating with Arista's CloudVision platform
Apache License 2.0

TLSV1_ALERT_NO_APPLICATION_PROTOCOL with DeviceStreamRequest #19

Closed jdrew82 closed 1 year ago

jdrew82 commented 1 year ago

Attempting to get the device inventory from the CVP labs (labs.arista.com) using the DeviceServiceStub throws the following error:

E0725 20:11:34.935859344 15741 ssl_transport_security.cc:1420] Handshake failed with fatal error SSL_ERROR_SSL: error:10000460:SSL routines:OPENSSL_internal:TLSV1_ALERT_NO_APPLICATION_PROTOCOL.

Here is the code used to pull the information:

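from arista.inventory.v1 import models, services
import grpc

# cvp_url and conn_creds are defined elsewhere (not shown in this snippet)
comm_channel = grpc.secure_channel(cvp_url, conn_creds)
device_stub = services.DeviceServiceStub(comm_channel)
req = services.DeviceStreamRequest()
# GetAll returns a response stream; the error is raised when it is iterated
responses = device_stub.GetAll(req)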

I'm able to confirm that authentication with username/password works and I receive a token but attempting to pull data immediately throws the error. Here is the relevant portion of the traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/grpc/_channel.py", line 475, in __next__
    return self._next()
  File "/usr/local/lib/python3.10/site-packages/grpc/_channel.py", line 881, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:34.30.52.181:443: Ssl handshake failed: SSL_ERROR_SSL: error:10000460:SSL routines:OPENSSL_internal:TLSV1_ALERT_NO_APPLICATION_PROTOCOL"
    debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:34.30.52.181:443: Ssl handshake failed: SSL_ERROR_SSL: error:10000460:SSL routines:OPENSSL_internal:TLSV1_ALERT_NO_APPLICATION_PROTOCOL {created_time:"2023-07-25T17:59:59.52624705+00:00", grpc_status:14}"
>

Any help is greatly appreciated.

cianmcgrath commented 1 year ago

Hi Justin, thanks for opening the issue. At first glance, it looks like the connection is failing to be established. I've seen similar SSL handshake / "failed to connect to all addresses" errors when connecting to on-prem installations without using that installation's self-signed cert (usually by omitting the cert file args seen in the examples here: https://github.com/aristanetworks/cloudvision-python/blob/trunk/examples/resources/inventory/get_versions.py#L107), though I don't remember seeing TLSV1_ALERT_NO_APPLICATION_PROTOCOL specifically before. I'm unfamiliar with the CVP labs (labs.arista.com), so I can't say whether they require such certs to be passed when creating the client. I'm investigating this further and am looking internally for people more familiar with setting up connections to the labs.
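For readers who want to test the self-signed cert theory, here is a minimal sketch using only the Python standard library. It downloads whatever certificate the server presents, without verifying it. The helper names `split_target` and `fetch_server_cert_pem` are hypothetical, and treating the target as a plain `host:port` string (no IPv6 literals) is an assumption. The returned bytes could then be supplied as `root_certificates` to `grpc.ssl_channel_credentials`, which is one way to trust a self-signed endpoint; whether the labs domain needs this is not established in this thread.

```python
import ssl
from typing import Tuple


def split_target(target: str, default_port: int = 443) -> Tuple[str, int]:
    """Split a 'host:port' style target into (host, port).

    Hypothetical helper; assumes no IPv6 literals in the target.
    """
    host, sep, port = target.rpartition(":")
    if not sep:
        # No port given, fall back to the default.
        return target, default_port
    return host, int(port)


def fetch_server_cert_pem(target: str) -> bytes:
    """Download the certificate the server presents, without verifying it."""
    host, port = split_target(target)
    # get_server_certificate returns the certificate as a PEM-encoded string.
    pem = ssl.get_server_certificate((host, port))
    return pem.encode("ascii")


# Example (requires network access):
# pem_bytes = fetch_server_cert_pem("labs.arista.com:443")
# These bytes could be passed as root_certificates when building gRPC
# channel credentials; only do this for endpoints you already trust.
```

Trusting a downloaded certificate blindly defeats the purpose of verification, so this is best treated as a diagnostic step rather than a production setup.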

jdrew82 commented 1 year ago

From my research, this appears to be caused by something in the communication after the handshake takes place. I found that the error means the application protocol the client offers to the server doesn't match what the server will accept. That is consistent with what I'm seeing: the initial connection to the API to authenticate and get a token works without issue, and the error only occurs when we attempt to pull the device inventory. As for the endpoint's certificate, we specifically download it and trust it if we tell our App not to verify the cert, as seen here.

Regardless, we appreciate you assisting us in resolving this.

jdrew82 commented 1 year ago

I've looked into this a bit more and I believe the "application" mentioned in the error relates to the ALPN extension to TLS. I ran a Wireshark capture while attempting the communication to CVP. It shows a connection being established using TLSv1.3 with an ALPN definition using HTTP. Then, at the point where the error occurs, I see a new three-way handshake: my client sends a Client Hello using TLSv1.2 with an ALPN definition specifying grpc-exp, and I immediately get the alert back about no application protocol.
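The ALPN behaviour described above can also be checked without Wireshark. The following is a rough sketch using the standard `ssl` module: it offers a list of ALPN protocols and reports which one the server selects. gRPC runs over HTTP/2, so `h2` is the protocol of interest. The function names `make_alpn_context` and `probe_alpn` are hypothetical, and certificate verification is disabled in the probe on the assumption that this is a diagnostic against a lab endpoint only.

```python
import socket
import ssl
from typing import Optional, Sequence


def make_alpn_context(protocols: Sequence[str], verify: bool = True) -> ssl.SSLContext:
    """Build a client TLS context that offers the given ALPN protocols."""
    ctx = ssl.create_default_context()
    if not verify:
        # Diagnostics only: check_hostname must be disabled before
        # verify_mode can be set to CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    ctx.set_alpn_protocols(list(protocols))
    return ctx


def probe_alpn(host: str, port: int = 443,
               protocols: Sequence[str] = ("h2",),
               timeout: float = 5.0) -> Optional[str]:
    """Return the ALPN protocol the server selected, or None if it chose none.

    A server that rejects every offered protocol may instead abort the
    handshake with an alert, which surfaces here as ssl.SSLError.
    """
    ctx = make_alpn_context(protocols, verify=False)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


# Example (requires network access):
# print(probe_alpn("labs.arista.com", protocols=["h2"]))
# print(probe_alpn("labs.arista.com", protocols=["http/1.1"]))
```

If the second probe succeeds while the first raises `ssl.SSLError`, that would suggest the endpoint (or something in front of it) accepts HTTP/1.1 but not HTTP/2, which would fit a REST login working while the gRPC call fails.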

I've included some screenshots of what I'm seeing:

Screenshot 2023-07-26 at 1 54 46 PM

Here's the Client Hello:

Screenshot 2023-07-26 at 1 56 16 PM

Here's the error response:

Screenshot 2023-07-26 at 1 56 50 PM

I'm honestly not sure why it appears to create a whole new connection instead of using what's already in place.

cianmcgrath commented 1 year ago

The fact that it's not reusing what's there is quite confusing alright, and it's the first I've seen of the like. I also haven't yet been able to reproduce the issue myself, so I'm escalating the matter internally to try to identify what the issue could be. Very much appreciate you sharing your findings so far 🙏 It's quite helpful.

cianmcgrath commented 1 year ago

Apologies for the delay in responding here. Neither I nor any of the rAPI team members who've assisted here have been able to reproduce this behaviour, though we do not have access to the labs domain ourselves to try there. After some time, though, I was able to get in contact with the CVP Labs domain maintenance team this week. They've informed me that they saw a similar issue in another labs domain they run (which they recently fixed), and they believe the two problems are likely the same. They are currently attempting to apply the same fix to the CVP Labs domain. I'll update here with any further developments as they arrive.

jdrew82 commented 1 year ago

@cianmcgrath Thanks for the update!

jdrew82 commented 1 year ago

I've confirmed that I'm still getting the issue. I've also noticed that the labs SSL cert appears to have been expired since April 2022?
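A quick way to double-check an expiry observation like this is sketched below, using only the standard library. `is_expired` compares a certificate `notAfter` timestamp (in the format returned by `SSLSocket.getpeercert()`) against a reference time, and `verification_error` attempts a fully verified handshake and returns the verification message if it fails. Both function names are hypothetical, and the timestamp in the usage example is illustrative, not taken from the labs certificate.

```python
import socket
import ssl
import time
from typing import Optional


def is_expired(not_after: str, now: Optional[float] = None) -> bool:
    """Return True if a certificate 'notAfter' timestamp lies in the past.

    not_after uses the getpeercert() format, e.g. 'Apr 15 00:00:00 2022 GMT'.
    """
    expiry = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return expiry < current


def verification_error(host: str, port: int = 443,
                       timeout: float = 5.0) -> Optional[str]:
    """Try a fully verified TLS handshake.

    Returns None if the certificate verifies, otherwise the verification
    message reported by the TLS library.
    """
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host):
                return None
    except ssl.SSLCertVerificationError as exc:
        return exc.verify_message


# Example (requires network access):
# print(verification_error("labs.arista.com"))
```

Running `verification_error` against both the main site and a lab module hostname would show whether the two present different certificates.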

cianmcgrath commented 1 year ago

Apologies for the delay here, I was on leave. I've not yet received an update on whether the fix was successful, but I've followed up on that. As for the SSL cert issue, they have informed me that the main site SSL cert is valid until later this year. Might it be that you're referring to the lab module cert? (They've said that since lab modules are short-lived, they don't have certs for most of the modules.)

cianmcgrath commented 1 year ago

I've finally received confirmation from the CVP Labs domain maintenance team that the issue has been resolved. Can I ask if you've encountered this issue more recently?

cianmcgrath commented 1 year ago

Closing due to inactivity