databricks / databricks-vscode

VS Code extension for Databricks

[BUG] SSL Handshake fails on Zscaler #1192

Open vpacik opened 7 months ago

vpacik commented 7 months ago

Describe the bug
SSL handshake fails when running PySpark code locally via Databricks Connect on a machine running WSL2 behind a corporate VPN (Zscaler).

To Reproduce
Steps to reproduce the behavior:

  1. Create a simple Python script that starts a Spark session from DatabricksSession:
    from databricks.connect import DatabricksSession
    spark = DatabricksSession.builder.getOrCreate()
    spark.range(10).show()
  2. Click on 'Run Python File'
  3. See error:
    E0417 10:14:56.342496763  337339 ssl_transport_security.cc:1519]       Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.
    Traceback (most recent call last):
    File "/home/vpacik/Codes/db-connect-test/spark-test.py", line 5, in <module>
    spark.range(10).show()
    File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/dataframe.py", line 996, in show
    print(self._show_string(n, truncate, vertical))
    File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/dataframe.py", line 753, in _show_string
    ).toPandas()
    File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/dataframe.py", line 1655, in toPandas
    return self._session.client.to_pandas(query)
    File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 798, in to_pandas
    table, schema, metrics, observed_metrics, _ = self._execute_and_fetch(req)
    File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1172, in _execute_and_fetch
    for response in self._execute_and_fetch_as_iterator(req):
    File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1153, in _execute_and_fetch_as_iterator
    self._handle_error(error)
    File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1308, in _handle_error
    self._handle_rpc_error(error)
    File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1348, in _handle_rpc_error
    raise SparkConnectGrpcException(str(rpc_error)) from None
    pyspark.errors.exceptions.connect.SparkConnectGrpcException: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:20.42.4.211:443: Ssl handshake failed: SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:20.42.4.211:443: Ssl handshake failed: SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED", grpc_status:14, created_time:"2024-04-17T10:14:56.343025663+02:00"}"

System information:
Version: 1.88.1 (user setup)
Commit: e170252f762678dec6ca2cc69aba1570769a5d39
Date: 2024-04-10T17:41:02.734Z
Electron: 28.2.8
ElectronBuildId: 27744544
Chromium: 120.0.6099.291
Node.js: 18.18.2
V8: 12.0.267.19-electron.0
OS: Windows_NT x64 10.0.22621

Additional context
We are using WSL2 on a machine behind a corporate VPN (Zscaler), with the exported Zscaler root CA installed. Connecting to the domain with this certificate via openssl works fine (e.g. openssl s_client -connect {servername}:443). The Databricks CLI works fine on the same machine, and file synchronization via Databricks Connect also works as expected.
EDIT: Authentication is done via a Databricks PAT.
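For reference, a rough Python equivalent of that openssl check (the hostname and CA path below are placeholders, not values from this report); it only verifies the TLS handshake against the exported root CA:

    import socket
    import ssl

    # Placeholder values -- substitute your workspace hostname and the path
    # to the exported Zscaler root CA on your machine.
    host = "adb-1234567890123456.7.azuredatabricks.net"
    ca_file = "/usr/local/share/ca-certificates/zscaler-root-ca.crt"

    # Mirrors `openssl s_client -connect {servername}:443` with the corporate
    # root CA supplied explicitly.
    context = ssl.create_default_context(cafile=ca_file)
    with socket.create_connection((host, 443)) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            print("Handshake OK:", tls.version())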

stevenayers-bge commented 1 month ago

@vpacik I've had the same issue. It happens because the Spark Connect client uses gRPC rather than HTTP, so to resolve the SSL error you need to set GRPC_DEFAULT_SSL_ROOTS_FILE_PATH to point at your corporate root CA bundle.
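A minimal sketch of how that could look, assuming the exported Zscaler root CA lives at /usr/local/share/ca-certificates/zscaler-root-ca.crt (a placeholder path); the variable has to be set before the Spark Connect gRPC channel is created, so either export it in the shell or set it at the very top of the script:

    import os

    # Point gRPC at the corporate root CA before any Spark Connect channel
    # is created. The path is a placeholder -- use wherever the exported
    # Zscaler root CA actually lives on your machine.
    os.environ["GRPC_DEFAULT_SSL_ROOTS_FILE_PATH"] = (
        "/usr/local/share/ca-certificates/zscaler-root-ca.crt"
    )

    from databricks.connect import DatabricksSession

    spark = DatabricksSession.builder.getOrCreate()
    spark.range(10).show()

Alternatively, export GRPC_DEFAULT_SSL_ROOTS_FILE_PATH in your shell profile inside WSL2 so every Python process picks it up.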

You may run into more issues though, details here: https://community.databricks.com/t5/administration-architecture/proxy-zscaler-amp-databricks-spark-connect-quot-cannot-check/m-p/94737#M2115