It's telling you it cannot find nfs-homes or nfs-data. Do you have an NFS server set up in your cluster? If so, did the IAC code base set it up, or did you bring it yourself? If you have no NFS server, what is your storage type and how did you map that information in your ansible vars file?
Hi @thpang
We are using EFS in this environment and it was provisioned separately, not using the IAC code. We map it by setting V4_CFG_MANAGE_STORAGE: false in our ansible vars yaml file.
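For reference, a minimal sketch of what the relevant entries might look like in an ansible vars yaml when storage is provisioned outside the DAC code base (the V4_CFG_RWX_FILESTORE_* names and the placeholder values should be checked against the viya4-deployment documentation for your release):
V4_CFG_MANAGE_STORAGE: false
V4_CFG_RWX_FILESTORE_ENDPOINT: "<your EFS/NFS server DNS name or IP>"
V4_CFG_RWX_FILESTORE_PATH: "/viya-share"
The endpoint/path pair is what ends up on the pod volume mounts, so a mismatch here shows up as the mount errors described above.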
To add, one of the things we are wondering is where this path could be defined in the scripts:
Our storage path is /viya-share/non-prod-dev-viya-ns, however it seems the deployment is reading it as //non-prod-non-prod-dev-viya-ns/viya-share/non-prod-dev-viya-ns
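If it helps to trace where that value is assembled, one approach (assuming a local checkout of the viya4-deployment repository; the search strings are just examples) is to grep the code base for the filestore path variables, run from the root of the checkout:
grep -rniE "rwx_filestore_path|viya-share" .
If the roles or templates reference those strings, this should surface where the mount path is built.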
If you look in the terraform output, you'll see these entries for your filestore:
rwx_filestore_endpoint
rwx_filestore_path
These are the values that are used by the pods to mount your storage.
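For example, assuming you run this from the directory holding the terraform state for the IAC project, you can print just those two values with:
terraform output rwx_filestore_endpoint
terraform output rwx_filestore_path
Whatever these show should line up with the server and path the pods are attempting to mount.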
@thpang
These are the values from our terraform output for the two entries.
It seems rwx_filestore_path is configured as the "/" directory, but that still differs from the path the deployment reads, which is //non-prod-non-prod-dev-viya-ns
Is it possible we have missed something?
Thanks,
Hello, I have the exact same problem, but I haven't changed anything in the terraform vars, nor did I bring up a custom NFS storage server. The terraform output for rwx_filestore_endpoint is an IP inside the subnet of our VNet, and rwx_filestore_path is /export. What could the error be here?
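One quick check (a sketch, assuming you can SSH into a VM inside the VNet, for example the jump VM, and substituting the endpoint IP from your terraform output) is to mount the export by hand to confirm the NFS server is reachable:
sudo mkdir -p /mnt/nfstest
sudo mount -t nfs <rwx_filestore_endpoint>:/export /mnt/nfstest
If that mount fails from inside the VNet, the pods will fail to mount for the same reason.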
@cceeci99 May I see the output of the command:
kubectl get pods -n
Yeah, I know; with that command I got the exact same description. I think the problem is that the pods can't find the NFS storage to mount the volumes.
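To see exactly which server and path a pod is trying to mount (a hypothetical example; substitute your namespace, and the CAS controller pod name may differ in your deployment), you can pull any in-line NFS volume definitions out of the pod spec:
kubectl -n <namespace> get pod sas-cas-server-default-controller -o jsonpath='{.spec.volumes[*].nfs}'
If the volumes come from PVCs instead, check the backing PVs with kubectl get pv to see the NFS server/path they point at.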
The // comes from the fact that a variable which would include /viya-share is not set, so I'm not quite sure what's amiss here. We never create paths with the // prefix.
I found the problem: if you configure terraform to create its own network security group, you must specify vm_public_access_cidrs so it opens an SSH port to the jump user. Then, when running the deployment with ansible, it is able to connect to the jump user and create the directories that are needed. If it can't connect, ansible somehow skips that error and doesn't create the directories, but still creates all the deployments, and that's why sas-cas-controller could not mount the volumes.
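For anyone hitting the same thing, a minimal sketch of the relevant terraform .tfvars entry (the CIDR is just a placeholder for the address your ansible/jump host connects from):
vm_public_access_cidrs = ["203.0.113.10/32"]
With SSH open to that address, the DAC playbooks can reach the jump VM and create the export directories (e.g. the /viya-share paths mentioned above) before the pods try to mount them.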
That is correct: if there is no access to the VMs from ansible, it will simply skip those steps without erroring. It assumes you've handled those directories outside of the DAC code base. Did you set the default_public_access_cidr value, or did you leave it empty?
@cceeci99 @thpang
The default_public_access_cidr in our IAC .tfvars file is empty. Is it required for this to have a value?
Also, we did not configure the IAC to create a jumpbox, as this was already provided separately by the customer.
@lorenzk1213 I have this at the default, meaning empty, because I do not want my resources to be accessible from any other IPs; I have a VPN in my resource group, and everything in the VNet will have access to them.
Marking as stale/inactive. If there are further questions, please open a new GitHub issue.
After running the deployment to install Viya 4, all pods seem to be running fine except for sas-cas-server-default, which is stuck in Init status.
This is seen in the describe output:
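The describe output referred to here can be pulled with something like the following (the pod name is an example and may differ; substitute your namespace):
kubectl -n <namespace> describe pod sas-cas-server-default-controller
The Events section at the bottom is where the failed NFS volume mounts show up.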
Appreciate any suggestions.
Thanks,