narioinc / cube-helm

A helm chart for running cubejs in a scalable and reliable manner
Apache License 2.0

schema #1

Open kodeine opened 1 month ago

kodeine commented 1 month ago

hello,

Can you please tell me where the config to mount the model/schema folder is? Do I need to create a ConfigMap, or do you have a better solution?

thank you

narioinc commented 1 month ago

hi Kodeine,

Thanks for trying out the helm chart. Your question is surely going to benefit a lot of folks out there.

Here is my rationale: I intentionally left out the section on ways to mount your schema files, as I have seen a few implementations out in the wild:

1) Write a script inside an init container that runs a "git pull" of your schema files into the container's /cube/conf folder. This folder is already the mount point (see the volumeMount section of the API deployment).
2) Treat your schema files as config files and mount them as files under the /cube/conf folder.
3) Give up volume mounts completely and build your cube container with the schema files as part of the image itself (cube:latest being your Docker container's base image).

Benefits of each method

I see 1) as a bit more flexible, since it lets you adjust the schema dynamically per environment (you could pass env vars during deployment, and your init container can use them as variables, for example as the git branch name to pull).
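A minimal sketch of option 1), assuming a hypothetical schema repository whose layout matches what cube expects under /cube/conf. The names cube-api, fetch-schema, SCHEMA_REPO_URL and SCHEMA_BRANCH, the example.com URL, and the choice of the alpine/git image are illustrative assumptions, not values taken from this chart:

```yaml
# Sketch only: init container that clones schema files into /cube/conf.
# All names, the repo URL and the branch are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cube-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cube-api
  template:
    metadata:
      labels:
        app: cube-api
    spec:
      volumes:
        - name: cube-conf
          emptyDir: {}              # filled by the init container on every pod start
      initContainers:
        - name: fetch-schema
          image: alpine/git:latest
          command: ["sh", "-c"]
          args:
            - git clone --depth 1 --branch "$SCHEMA_BRANCH" "$SCHEMA_REPO_URL" /cube/conf
          env:
            - name: SCHEMA_REPO_URL
              value: https://example.com/your-org/cube-schema.git
            - name: SCHEMA_BRANCH   # set per environment at deploy time
              value: main
          volumeMounts:
            - name: cube-conf
              mountPath: /cube/conf
      containers:
        - name: cube-api
          image: cubejs/cube:latest
          ports:
            - containerPort: 4000
          volumeMounts:
            - name: cube-conf
              mountPath: /cube/conf
```

Because the clone runs only when a pod starts, picking up a schema change means restarting the pods. A private repository would also need credentials (for example from a Secret), which this sketch leaves out.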

Number 2) is a simpler setup but starts creating issues if your schema files are too large to fit inside your k8s cluster as a ConfigMap: https://kubernetes.io/docs/concepts/configuration/configmap/#:~:text=A%20ConfigMap%20is%20not%20designed,separate%20database%20or%20file%20service. It works for smaller schemas, but in enterprise setups they can quickly get out of hand (especially if you use cube dbt and tools like dagster to build your schema files).
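For small schemas, option 2) could look roughly like the sketch below. The ConfigMap name, the orders cube and the /cube/conf/model mount path are assumptions for illustration; Kubernetes caps the data in a single ConfigMap at 1 MiB, which is where the size concern comes from:

```yaml
# Sketch only: a small schema file delivered through a ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cube-schema              # hypothetical name
data:
  orders.yml: |
    cubes:
      - name: orders
        sql_table: public.orders
        measures:
          - name: count
            type: count
---
apiVersion: v1
kind: Pod
metadata:
  name: cube-api
spec:
  volumes:
    - name: cube-schema
      configMap:
        name: cube-schema
  containers:
    - name: cube-api
      image: cubejs/cube:latest
      volumeMounts:
        - name: cube-schema
          mountPath: /cube/conf/model   # assumes cube reads models from this folder
          readOnly: true
```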

Number 3) creates a container that is immutable and self-contained. Immutability gives some resilience to supply chain attacks, since you can verify the full SHA hash of your container and schema files as "one unit" before deployment. However, it means that you need to keep building images for each environment.
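With option 3), one way to get the "one unit" verification is to reference the image by digest instead of by tag. A sketch, where the registry and repository names are hypothetical and the digest is deliberately left as a placeholder:

```yaml
# Sketch only: pin the schema-bearing image by digest so the exact content is verifiable.
apiVersion: v1
kind: Pod
metadata:
  name: cube-api
spec:
  containers:
    - name: cube-api
      # Hypothetical registry/repository; replace <digest> with the real sha256 value.
      image: registry.example.com/cube-with-schema@sha256:<digest>
```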

My humble preference has always been 1). However, given these ideas, I would love to know if you see a benefit in any of the three options.

I will be happy to share concrete YAMLs for the approach you feel works for you. I will also make sure to add these as suggestions in the README for others to benefit from.

Cheers.

narioinc commented 1 month ago

Will close the issue once you have a satisfactory answer/approach. Thanks.

kodeine commented 1 month ago

@narioinc thank you for your detailed answer. In my opinion, the schema can change occasionally, and it's better if we can mount it via git pull. A ConfigMap is not viable for larger projects, as you mentioned.

The following is the error I'm getting. I think it's because values.yaml does not have a service object; did you mean to use {{- $svcPort := .Values.cubeApi.service.port -}}?

helm template . --name-template cubejs --namespace cubejs --values values.yaml  
Error: template: cubejs/templates/ingress.yaml:3:23: executing "cubejs/templates/ingress.yaml" at <.Values.service.port>: nil pointer evaluating interface {}.port
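The nil pointer appears because .Values.service is not defined, so .port is evaluated on a missing map. A sketch of the values shape that the .Values.cubeApi.service.port lookup would need; the key layout is assumed from that proposed lookup rather than confirmed against the chart, and 4000 is cube's default API port:

```yaml
# values.yaml (sketch): key layout assumed from the lookup proposed above
cubeApi:
  service:
    port: 4000   # cube's default API port; adjust to match the Service

# templates/ingress.yaml would then read the port with:
#   {{- $svcPort := .Values.cubeApi.service.port -}}
```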

also another error

error validating data: [ValidationError(StatefulSet.spec.template.spec): unknown field "serviceName" in io.k8s.api.core.v1.PodSpec, ValidationError(StatefulSet.spec): missing required field "serviceName" in io.k8s.api.apps.v1.StatefulSetSpec]
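The validation error says serviceName was placed in the pod spec (spec.template.spec) while it is required one level up, in the StatefulSet spec. A skeleton showing the correct placement; the resource name, labels and image are hypothetical, since the thread does not say which StatefulSet of the chart is affected:

```yaml
# Sketch only: serviceName is a field of the StatefulSet spec, not of the pod spec.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cubestore-worker          # hypothetical name
spec:
  serviceName: cubestore-worker   # governing headless Service; belongs here
  replicas: 1
  selector:
    matchLabels:
      app: cubestore-worker
  template:
    metadata:
      labels:
        app: cubestore-worker
    spec:
      # no serviceName at this level: PodSpec has no such field
      containers:
        - name: cubestore
          image: cubejs/cubestore:latest
```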

narioinc commented 1 month ago

Hi @kodeine

My apologies for the delay in responding. I got held up with office stuff.

While the chart template includes an ingress, my original intention was to create a chart where cube is deployed completely internal to a cluster and never exposed to the public network via ingress/LB.

Because of that, I never really intended to allow the ingress object to get created. I could go ahead and make changes to allow ingress to operate, but I am pretty sure that in your use case you would still need to edit the ingress definition based on the host, domain, path params, etc.

For now, can you quickly check a few things:
1) Do not enable ingress (ingress: enabled: true). Keep it false.
2) See if your cube services come up, and whether you can port forward the cubeAPI services to your local machine and see the cubejs UI come up.
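As a values override, the first check could be written as below. This assumes the chart gates its Ingress template on the ingress.enabled key, as the text above implies:

```yaml
# values override (sketch): keep the Ingress template from rendering
ingress:
  enabled: false
```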

A quick side note, and a humble suggestion. If you are using this helm chart for an enterprise setup, I would suggest keeping the cube API layer internal and exposing a layer on top of it (a microservice that you can write) to abstract away some of the cubejs-specific data models from your end clients. Even though cube is a "read-only" medium, providing any kind of direct access to cube APIs has always been a heated debate. However, if you are using it personally, you can have enough safeguards to ensure your DB is protected against any unwanted access. In the meantime, I'll try to get an Ingress resource for you, mapping cubeapi to the root (/) of your host domain.
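An Ingress mapping the cube API to the root path might look roughly like this sketch. The host, Service name and ingress class are hypothetical and would have to match your cluster; 4000 is cube's default API port:

```yaml
# Sketch only: route everything under / on one host to the cube API Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cube-api
spec:
  ingressClassName: nginx          # assumption: an NGINX ingress controller is installed
  rules:
    - host: cube.example.com       # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: cube-api     # hypothetical Service name
                port:
                  number: 4000
```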

Also, a side note: Nginx itself is nowadays asking users to drop the older "Ingress resource" and use the more modern CRDs (VirtualServer and VirtualServerRoute), so I may drop support for Ingress altogether in favor of VS/VSR or a Kubernetes Gateway resource.
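For comparison, the same root mapping expressed with the Kubernetes Gateway API could be sketched as an HTTPRoute. The Gateway name, hostname and Service name are hypothetical, and a Gateway API implementation would need to be installed in the cluster:

```yaml
# Sketch only: Gateway API equivalent of a root-path Ingress rule.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: cube-api
spec:
  parentRefs:
    - name: cluster-gateway        # hypothetical Gateway
  hostnames:
    - cube.example.com             # hypothetical host
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: cube-api           # hypothetical Service name
          port: 4000
```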

Your thoughts on this?

kodeine commented 1 month ago

Hello,

So I was able to update the chart to fix the ingress and initContainer. The ingress is exposed as a ClusterIP to a load balancer, and I am using oauth2 for exposing it to the load balancer. Other than that, I do use a microservice which uses the internal host of the svc to utilize the API.

I will submit the PR to the repo shortly; it fixes the bug in the ingress and service, along with the new init container feature.

narioinc commented 1 month ago

Thanks

Will review your changes and pull it in. 👍👍
