voltrondata / spark-substrait-gateway

Implements a gateway that speaks the SparkConnect protocol and drives a backend using Substrait (over ADBC Flight SQL).
Apache License 2.0

feat: add click, auth, TLS, Dockerfile, and helm-chart #57

Closed prmoore77 closed 3 weeks ago

prmoore77 commented 1 month ago

This PR implements the following changes:

  1. Creates script for running the server: spark-substrait-gateway-server
  2. Creates script for running the client demo: spark-substrait-client-demo
  3. Makes the port # configurable
  4. Adds click to facilitate arguments for running the server/client demo
  5. Adds a helm chart - used to deploy the solution to Kubernetes
  6. Adds a Dockerfile to build a container image
  7. Adds TLS for encrypting traffic to/from the server
  8. Adds JWT (token) authentication option for the server/client
  9. Adds utility to create JWT token:
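
The command that followed item 9 is not preserved in this thread, so the PR's actual token utility is not shown here. As a rough illustration of what creating and checking a JWT involves, here is a minimal standard-library sketch, assuming HS256 (shared-secret) signing. The function names `create_jwt` and `verify_jwt`, the claim set, and the choice of HS256 are all assumptions made for illustration; the gateway may well use a JWT library and a different algorithm.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, the encoding JWT segments use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(secret: str, signing_input: str) -> str:
    """HMAC-SHA256 over 'header.payload', returned as a base64url string."""
    digest = hmac.new(
        secret.encode("utf-8"), signing_input.encode("utf-8"), hashlib.sha256
    ).digest()
    return _b64url(digest)


def create_jwt(
    secret: str, subject: str, ttl_seconds: int = 3600, now: Optional[int] = None
) -> str:
    """Build an HS256-signed token (hypothetical helper, not the PR's utility)."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(secret, signing_input)}"


def verify_jwt(token: str, secret: str, now: Optional[int] = None) -> dict:
    """Check the HS256 signature and expiry; return the claims or raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("token must have three dot-separated parts")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(secret, f"{header_b64}.{payload_b64}")
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), signature_b64.encode("utf-8")):
        raise ValueError("bad signature")
    # Restore the padding that _b64url stripped before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    current = int(time.time()) if now is None else now
    if claims["exp"] <= current:
        raise ValueError("token expired")
    return claims
```

A server-side check would call `verify_jwt` on the bearer token it receives and reject the request on `ValueError`. The sketch always verifies with HS256 and ignores the `alg` header, which sidesteps algorithm-confusion problems but also means it cannot validate tokens signed any other way.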

github-actions[bot] commented 1 month ago

ACTION NEEDED

Substrait follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

prmoore77 commented 1 month ago

Converted to draft as tests are failing

prmoore77 commented 1 month ago

I believe the PR is ready for review. Thank you!

EpsilonPrime commented 1 month ago

Oh, and this is awesome work. Lots of good stuff here. Thanks!

amoeba commented 1 month ago

Nice work! I tried out the Helm chart just to see if it worked out of the box and I found two things:

  1. I needed to create a TLS keypair first and copy that into helm-chart/secrets/tls before the main pod would start without crashing.
  2. Once I did (1) and got the deployment to succeed, I ran

    SparkSession.builder.remote("sc://localhost:50051").getOrCreate()

    which stalls with this being printed in the server,

    E0807 19:26:21.123973695      22 ssl_transport_security.cc:1519]       Handshake failed with fatal error SSL_ERROR_SSL: error:0A00010B:SSL routines::wrong version number.
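
For readers who hit step 1, a self-signed keypair can be produced with the `openssl` command-line tool. The sketch below only assembles (and optionally runs) that command from Python. The file names `tls.key` and `tls.crt` are assumptions, since the thread does not say which names the chart expects under helm-chart/secrets/tls, and the `-addext` flag assumes OpenSSL 1.1.1 or newer.

```python
import subprocess
from pathlib import Path
from typing import List


def self_signed_cert_command(
    out_dir: Path, common_name: str = "localhost", days: int = 365
) -> List[str]:
    """Return an openssl invocation that writes a self-signed cert and key.

    The output file names are hypothetical; check which names the chart
    actually mounts before relying on them.
    """
    return [
        "openssl", "req", "-x509",
        "-newkey", "rsa:4096",      # generate a fresh 4096-bit RSA key
        "-nodes",                   # leave the private key unencrypted
        "-days", str(days),
        "-subj", f"/CN={common_name}",
        # Modern TLS clients match the host name against subjectAltName.
        "-addext", f"subjectAltName=DNS:{common_name}",
        "-keyout", str(out_dir / "tls.key"),
        "-out", str(out_dir / "tls.crt"),
    ]


def generate_self_signed_cert(out_dir: Path, common_name: str = "localhost") -> None:
    """Create out_dir if needed and run openssl; raises if openssl fails."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(self_signed_cert_command(out_dir, common_name), check=True)
```

Calling `generate_self_signed_cert(Path("helm-chart/secrets/tls"))` would write both files into the directory mentioned above. A client connecting to a server that presents this certificate still has to be told to trust it, which this sketch does not cover.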

prmoore77 commented 1 month ago

> Oh, and this is awesome work. Lots of good stuff here. Thanks!

Thank you, @EpsilonPrime ! You did great work on this neat project.

prmoore77 commented 1 month ago

> Nice work! I tried out the Helm chart just to see if it worked out of the box and I found two things:
>
>   1. I needed to create a TLS keypair first and copy that into helm-chart/secrets/tls before the main pod would start without crashing.
>   2. Once I did (1) and got the deployment to succeed, I ran
>
>     SparkSession.builder.remote("sc://localhost:50051").getOrCreate()
>
>     which stalls with this being printed in the server,
>
>     E0807 19:26:21.123973695      22 ssl_transport_security.cc:1519]       Handshake failed with fatal error SSL_ERROR_SSL: error:0A00010B:SSL routines::wrong version number.

I too had issues with self-signed certificates. A certificate signed by Let's Encrypt seems to work fine. I am not sure yet why self-signed certs are failing, but I'll keep investigating... thanks for your help!
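
One thing that may be worth ruling out while investigating: `wrong version number` is the error OpenSSL commonly reports when the peer sends bytes that are not a TLS handshake, and the connection string quoted above, `sc://localhost:50051`, does not ask the client to use TLS. Spark Connect connection strings accept `;`-separated parameters after the path, including `use_ssl` and `token`. The helper below is a small sketch that builds such a string. The function name `spark_connect_url` is hypothetical, and whether a plaintext client is the actual cause of the failure in this thread is not established.

```python
from typing import Optional


def spark_connect_url(
    host: str, port: int, use_ssl: bool = False, token: Optional[str] = None
) -> str:
    """Build a Spark Connect connection string (hypothetical helper).

    Parameters follow the sc://host:port/;key=value;key=value layout.
    """
    params = []
    if use_ssl:
        params.append("use_ssl=true")   # ask the client to open a TLS channel
    if token is not None:
        params.append(f"token={token}")  # bearer token sent with each request
    url = f"sc://{host}:{port}"
    if params:
        url += "/;" + ";".join(params)
    return url
```

With PySpark installed, the result can be passed to `SparkSession.builder.remote(...)`, for example `spark_connect_url("localhost", 50051, use_ssl=True)`, which yields `sc://localhost:50051/;use_ssl=true`. If the server presents a self-signed certificate, the client may then fail certificate verification instead, which would be a different error from the one logged above.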