sassoftware / viya4-deployment

This project contains Ansible code that creates a baseline in an existing Kubernetes environment for use with the SAS Viya Platform, generates the manifest for an order, and then can also deploy that order into the Kubernetes environment specified.
Apache License 2.0

Issue with Downloading Entitlements File #548

Closed SASCloudLearner closed 2 months ago

SASCloudLearner commented 2 months ago

Viya4 Deployment Version Details

2023.10 LTS on an on-prem Kubernetes cluster

Ansible Variable File Details

SAS API Access

V4_CFG_SAS_API_KEY: ""
V4_CFG_SAS_API_SECRET: ""
V4_CFG_ORDER_NUMBER: ""
V4_CFG_CADENCE_VERSION: "2023.10"
V4_CFG_CADENCE_NAME: "lts"

CR Access

V4_CFG_CR_USER:
V4_CFG_CR_PASSWORD:
V4_CFG_CR_URL: jfrog

Steps to Reproduce

Deploy the infrastructure through the Kubernetes IaC project on vSphere, then deploy Viya through the SAS deployment project. A proxy exists.

Expected Behavior

sas pods should be starting

Actual Behavior

No pods are created in the SAS namespace. When I run kubectl describe sasdeployment -n viyaontw, I see the error below:

Messages: Error loading entitlements file: "https://ses.sas.download/ses/entitlements.json" Failed to get 'https://ses.sas.download/ses/entitlements.json' Get "https://ses.sas.download/ses/entitlements.json": dial tcp 149.173.160.82:443: connect: connection timed
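The "dial tcp 149.173.160.82:443" part of the error suggests a direct TCP connection attempt that never completed. A minimal Python sketch of the same kind of check is shown here; the function name `can_connect` and the 5-second default timeout are illustrative assumptions, not part of viya4-deployment or the deployment operator. It does not use http_proxy or https_proxy, so it only shows whether a direct connection is possible from wherever it runs.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a direct TCP connection to host:port succeeds in time."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False
```

Calling `can_connect("ses.sas.download", 443)` from a node or pod inside the cluster, rather than from the jump server, would be more representative of what the operator pod sees.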

Additional Context

No response

References

No response

SASCloudLearner commented 2 months ago

Does anyone have similar experience with why this could happen? The Kubernetes cluster was installed through IaC and we are deploying SAS through this GitHub project. We have the SAS images in a private JFrog repo and we are using cert-manager. We are really stuck on this issue. Unfortunately, the ingress also does not pick up an external IP and stays in pending state. Any advice would help, especially from someone who has used a similar deployment.

miaeyg commented 2 months ago

Hi @SASCloudLearner ,

I have no experience with this specific configuration, but it seems like SAS is trying to reach out to https://ses.sas.download/ses/entitlements.json and is blocked (by the proxy?). Not sure why it is reaching out to the SAS CR if you are using JFrog. On the other hand, the parameters you posted here seem to be incomplete. Not sure if this is just for masking sensitive data or this is really how you entered the values in ansible-vars.yaml. For example:

V4_CFG_SAS_API_KEY: ""
V4_CFG_SAS_API_SECRET: ""
V4_CFG_ORDER_NUMBER: ""
V4_CFG_CR_USER:
V4_CFG_CR_PASSWORD:

In addition, you wrote V4_CFG_CR_URL: jfrog. Is "jfrog" the true URL for accessing the container registry in JFrog? Does it not require authentication? Perhaps you should attach the full ansible-vars.yaml you used.

SASCloudLearner commented 2 months ago

Hello Eyal, Thank you for your response. Yes, I removed the values as they are sensitive data. We have a proxy, and on the jump server we set http_proxy/https_proxy as well, though we still get an SSL cert error when we run curl, even though we installed the SAS CA cert there. I believe the Viya deployment is not expected to reach out to the SAS download site when we have an internal JFrog container repo.

Here are the details from the ansible-vars file. We tried various options but with the same results. Also, any idea why the ingress is not able to pick up an external IP? We are using kube-vip.

environment:
  https_proxy: http://10.0.x.x:8080
  http_proxy: http://10.0.x.x:8080
  no_proxy: "{{no_proxy}}"

Cluster

PROVIDER: custom
CLUSTER_NAME: k8cluster
NAMESPACE: viyans

Requirements, extra added

V4_CFG_CADENCE_VERSION: "2023.10"
V4_CFG_CADENCE_NAME: "lts"
V4_CFG_DEPLOYMENT_ASSETS: "xxxxxxx4_deploymentAssets_2024-04-26T080944.tgz"
V4_CFG_CERTS: "/sas-deployment/deploy/sandbox/assets/SASViyaV4_xxxx_certs.zip"
V4_CFG_ORDER_NUMBER: "xxxx"

MISC

DEPLOY: true # Set to false to stop at generating the manifest
LOADBALANCER_SOURCE_RANGES: ['xxx.x/32','xxxx/32']

Jump

JUMP_SVR_HOST: 10.0.x.x
JUMP_SVR_USER: myuserid

RWX Filestore

V4_CFG_RWX_FILESTORE_ENDPOINT: 10.0.x.x
V4_CFG_RWX_FILESTORE_PATH: /export

Storage

V4_CFG_MANAGE_STORAGE: true

SAS API Access

V4_CFG_SAS_API_KEY: "xxxxxxxxxx"
V4_CFG_SAS_API_SECRET: "xxxxxxxxxxxxxxx"

CR Access

V4_CFG_CR_USER:docker-user
V4_CFG_CR_PASSWORD: xxxxxxxxx
V4_CFG_CR_URL: https://internalcontainerjfrogrepo

Ingress

V4_CFG_INGRESS_TYPE: "ingress"
V4_CFG_INGRESS_FQDN: "viyaserver.dqdn"

V4_CFG_TLS_MODE: full-stack # [full-stack|front-door|ingress-only|disabled]

V4_CFG_INGRESS_MODE: public

INGRESS_NGINX_CONFIG:

controller:

service:

externalTrafficPolicy: Cluster

V4_CFG_POSTGRES_SERVERS:
  default:
    internal: false
    admin: postgres
    password: "mypasswd"
    fqdn: exteralpostrges
    server_port: 5432
    ssl_enforcement_enabled: false
    database: SharedServices

LDAP

V4_CFG_EMBEDDED_LDAP_ENABLE: true

Consul UI

V4_CFG_CONSUL_ENABLE_LOADBALANCER: false

SAS/CONNECT

V4_CFG_CONNECT_ENABLE_LOADBALANCER: false

TLS

V4_CFG_TLS_GENERATOR: "cert-manager"
V4_CFG_TLS_CERT: "/sas-deployment/deploy/sandbox/viya4-deployment/ingresscert.crt"
V4_CFG_TLS_KEY: "/sas-deployment/deploy/sandbox/viya4-deployment/ingresscert.key"
V4_CFG_TLS_TRUSTED_CA_CERTS: "/sas-deployment/deploy/sandbox/assets/D0_RootCA-_G01.pem"

CAS MPP Settings

V4_CFG_CAS_WORKER_COUNT: 3

Monitoring and Logging

uncomment and update the below values when deploying the viya4-monitoring-kubernetes stack

V4M_BASE_DOMAIN:

Viya Start and Stop Schedule

uncomment and update the values below with CronJob schedule expressions if you would

like to start and stop your Viya Deployment on a schedule

V4_CFG_VIYA_START_SCHEDULE: "0 7 1-5"

V4_CFG_VIYA_STOP_SCHEDULE: "0 19 1-5"

miaeyg commented 2 months ago

Hi @SASCloudLearner ,

Here's what I think is happening. I am not 100% sure; perhaps others can chime in.

Since you provided values for the V4_CFG_DEPLOYMENT_ASSETS and V4_CFG_CERTS parameters, you evidently downloaded them yourself before running this project. But you did not provide a value for the V4_CFG_LICENSE parameter, so the project tries to download the license file by itself (using the V4_CFG_SAS_API_KEY and V4_CFG_SAS_API_SECRET that you provided), as explained here: https://github.com/sassoftware/viya4-deployment/blob/main/docs/CONFIG-VARS.md#sas-software-order and the proxy error probably happens then.

Can you try to manually download the license file as well and point to it using the V4_CFG_LICENSE parameter like you pointed to the assets file and the certs file? Note that if you run the project using Docker, you need to read this as well: https://github.com/sassoftware/viya4-deployment/blob/main/docs/user/DockerVolumeMounts.md

Update 1: Examine the contents of this file in the deployment folder: "site-config/cr_access.json". Does it point to JFrog correctly? You should have this file generated for you. Do you also have in "kustomization.yaml" a reference to a file named "sas-image-pull-secrets.yaml"?

Update 2: Verify that your "kustomization.yaml" has a reference to a file named "mirror.yaml" and that this file contains correct references to your JFrog CR. I assume you mirrored SAS LTS 2023.10 to JFrog using the SAS Mirror Manager utility, right?
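The two kustomization.yaml checks in the updates above could be scripted roughly as follows. This is only a sketch: `missing_references` is a hypothetical helper, and it does a plain text search over uncommented lines rather than parsing the YAML, so it makes no assumption about which section of kustomization.yaml the files are listed under.

```python
from __future__ import annotations


def missing_references(kustomization_text: str, expected: list[str]) -> list[str]:
    """Return the expected file names that no uncommented line mentions."""
    # Drop blank lines and whole-line YAML comments before searching.
    active = [
        line
        for line in kustomization_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    # Keep only the names that do not appear on any remaining line.
    return [name for name in expected if not any(name in line for line in active)]
```

For example, `missing_references(open("kustomization.yaml").read(), ["mirror.yaml", "sas-image-pull-secrets.yaml"])` returns an empty list when both files are referenced. It says nothing about whether the contents of those files point at the right registry.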

jarpat commented 2 months ago

Hey @SASCloudLearner,

Just to verify, you stated your cluster is behind a proxy, and it looks like connections to external hosts are blocked? You can verify whether this is true by checking whether the baseline,install tasks ran successfully, since those actions have tasks that install some Helm charts.

The deployment operator pod will download the entitlements from ses.sas.download during its execution unless the "repositoryWarehouse" URL is specified in the sasdeployment.yaml file to get it from elsewhere. https://documentation.sas.com/?cdcId=itopscdc&cdcVersion=default&docsetId=dplyml0phy0dkr&docsetTarget=p0uytz3rj1l4ysn1e8n73u95stqu.htm#p0x9m01w44xo9cn13xsxqttwmtj2

We have an open feature request, #372, to add support for this repositoryWarehouse option to better support these dark-site/air-gapped deployments.

Also thanks @miaeyg for providing some additional areas to look into to debug this issue.

miaeyg commented 2 months ago

Hi @jarpat I believe you are correct, I forgot about this... I recall the workaround is to avoid using the deployment operator by setting:

V4_DEPLOYMENT_OPERATOR_ENABLED: false

I also used:

DEPLOY: false

And then I deployed manually using "kubectl". This was done with the advice of Josh Coburn!

SASCloudLearner commented 2 months ago

Hi Eyal, I already included V4_CFG_LICENSE yesterday but still had the same issue. Also, mirror.yaml and sas-image-pull-secrets.yaml are included in kustomization.yaml, and the cr_access.json details look right as well. Last week, I did try to run the kubectl commands manually, as the kustomization.yaml looked complete and the deployment only fails during reconciling, and I saw that pods, services, etc. were created. We had some other issues with ingress (not able to pick up an external IP) and other things, so we could not validate that deployment successfully, and I was also not sure whether anything would be missing. So from your experience, does running the kubectl commands against the cluster-wide/cluster-local/namespace levels work fine? Or are there any additional things to note or perform?

Thanks again for your response.

Thanks, Raghu

SASCloudLearner commented 2 months ago

Hi @jarpat, that's right. We have a proxy so the cluster can access the internet. Setting http_proxy and https_proxy seems to help when downloading the SAS images from the SAS container registry and pushing them to JFrog using mirrormgr. But we could not figure out how to include the proxy details in ansible-vars.yaml or other YAML files so that the deployment operator could pick up those values during the reconciling process. And as the deployment.yaml is only created after we do the Viya installation through Ansible, I believe it replaces the file if we make changes and run ansible-playbook again, unless we use kubectl commands.

Thank you for the reference to the open feature request. I will keep an eye on it in case there is any update, as this option would be really helpful when deploying at most customer locations, where access to the internet is limited.

Thanks, Raghu

miaeyg commented 2 months ago

Hi @SASCloudLearner

Have you tried to add/modify your ansible-vars.yaml to include these two lines?

V4_DEPLOYMENT_OPERATOR_ENABLED: false
DEPLOY: false

If not, please try and re-run the "viya,install" again pointing to a new folder so it will create new kustomization.yaml and then try to install this kustomization.yaml manually with kubectl commands using the instructions here for LTS 2023.10: https://documentation.sas.com/doc/en/itopscdc/v_045/dplyml0phy0dkr/p127f6y30iimr6n17x2xe9vlt54q.htm#p0n0x0jvog312an1wggpgnam1jsw .

I suggest you also delete all created objects in the K8S cluster and also the SAS Viya namespace so you start fresh. This worked for me in an air gapped deploy in AWS with ECR used as the Container Registry (equivalent of your JFrog).

I hope this helps.

SASCloudLearner commented 2 months ago

Hi @miaeyg, I tried it this way and it seems to be working. But I cannot confirm, as the firewall exception for ses.sas.download has been made. I will have to ask them to turn it off, but for now they are busy with our other issue on the cluster. This is the new error. It seems like an issue with kube-vip, CoreDNS, the Kubernetes API server, Calico, etc., and we are trying to fix it. Thanks again for the SAS variables, though, as that seems to work.

Error from server (InternalError): error when creating "/work/deploy/manifest.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=30s": Unknown Host
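The "Unknown Host" in this error points at name resolution for cert-manager-webhook.cert-manager.svc, which is a cluster-internal Service name. A small Python sketch for testing whether a name resolves from a given machine or pod is below; `resolves` is a hypothetical helper name, and it only exercises the local resolver of wherever it runs, not the API server's own lookup path.

```python
import socket


def resolves(hostname: str) -> bool:
    """Return True if the local resolver can turn hostname into an address."""
    try:
        # getaddrinfo raises socket.gaierror when the lookup fails.
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False
```

Running `resolves("cert-manager-webhook.cert-manager.svc")` from a pod inside the cluster is the meaningful test, since the name is not expected to resolve from outside it. A failure there would hint at a cluster DNS problem; a success would not rule out a problem on the API server side.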

jarpat commented 2 months ago

Closing as duplicate of #372, repositoryWarehouse work will be tracked there.