ForgeRock / forgeops

ForgeRock platform assets for Kubernetes deployment. Contains the files you need to build your own Docker images and to deploy the ForgeRock Identity Platform on Kubernetes clusters.

installing cdk on aws eks #679

Open sandeepk24 opened 3 months ago

sandeepk24 commented 3 months ago

I cloned the repo and checked out the 7.3 branch as described in the Backstage documentation (git checkout release/7.3-20240131) (https://backstage.forgerock.com/docs/forgeops/7.3/forgeops.html). After the checkout, most of the directories, such as charts and helm, disappear. But when I clone only master, I see all the files. Could you please suggest which branch I should clone? Also, when I try to install the ingress in the EKS cluster, the ingress pods don't come up. When I install DS using forgeops, it complains that the PV and PVC are not available. And when I try to install IG using forgeops, the pod does not come up either. Could you please help? I can provide whatever is needed for debugging.

stenolan1 commented 3 months ago

Hello Sandeep,

Are you following the guide for AWS EKS here? https://backstage.forgerock.com/docs/forgeops/7.3/cdk/cloud/setup/eks/forgeops.html

The clone command should be git clone https://github.com/ForgeRock/forgeops.git regardless of which branch you check out. The branch to check out for 7.3 is release/7.3-20240131 (git checkout release/7.3-20240131).

Make sure you obtain details about your EKS cluster as described here: https://backstage.forgerock.com/docs/forgeops/7.3/cdk/cloud/setup/eks/clusterinfo.html

Steve Nolan

sandeepk24 commented 3 months ago

Thank you Steve for getting back so promptly! I am following the Backstage links, the same ones you mentioned. When I clone the repo from master I get all the files, but when I run git checkout release/7.3-20240131 to switch to this branch, I lose all the data, as you can see in the screenshot below. Let me know if I am doing anything wrong, and whether I can schedule a call to explain it more clearly. I am also facing a few other issues, such as not being able to bring up the DS and IG pods.

Thanks, Sandeep

[Screenshot: forgerock_master]

lee-baines commented 3 months ago

Hi @sandeepk24. Ignore the difference in the commits. That's just because master is equivalent to 7.5 (unreleased), not 7.3, so there are significant differences between the two. Also, Helm charts were introduced in 7.4, which is why the charts folder isn't there in 7.3. You just need to strictly follow the docs for 7.3 only. I've checked that branch and all the directories are correct.
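One way to confirm that the directory difference comes from the branches themselves, and not from a broken checkout, is to list the top-level directories of each ref without switching branches. This is a minimal sketch: the helper name list_top_level_dirs is made up for illustration, and the usage lines assume a clone whose remote is named origin.

```shell
# Print the top-level directories tracked on a given git ref,
# without checking that ref out. (Hypothetical helper name.)
list_top_level_dirs() {
    git ls-tree -d --name-only "$1"
}

# Example usage from inside the forgeops clone (assumes remote "origin"):
#   list_top_level_dirs origin/release/7.3-20240131
#   list_top_level_dirs origin/master
```

If charts shows up only in the second listing, the checkout is behaving as expected.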

lee-baines commented 3 months ago

Regarding DS, you need to ensure that you have the correct storage class available so the PVC can be correctly provisioned. So for EKS you need to apply the following:

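# Create the StorageClasses needed for PVC provisioning on EKS.
createStorageClasses() {
    kubectl create -f - <<EOF
# "fast": provisioned by the EBS CSI driver; binding waits for the first consumer.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
    name: fast
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
---
# "standard": gp2 volumes from the in-tree AWS EBS provisioner.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
    name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
    type: gp2
EOF
}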
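
Because kubectl create fails on resources that already exist, it can help to check first which of the two classes are missing. The following is a minimal sketch; the function name missing_storage_classes is hypothetical and not part of forgeops.

```shell
# Print the name of every StorageClass passed as an argument that is
# not present in the cluster. (Hypothetical helper, not part of forgeops.)
missing_storage_classes() {
    local sc
    for sc in "$@"; do
        kubectl get storageclass "$sc" >/dev/null 2>&1 || echo "$sc"
    done
}

# Example usage:
#   missing_storage_classes fast standard
```

An empty result means both classes exist and the PVC problem probably lies elsewhere, for example in the EBS CSI driver installation.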

sandeepk24 commented 3 months ago

Thank you @lee-baines for getting back. That information is helpful. We tried to install the ingress but were not able to, because the shell script uses Helm to build it. So we are planning to use the AWS ALB controller instead of the nginx ingress controller. Do you have any solution for ingress?

lee-baines commented 3 months ago

Hi @sandeepk24. Why can't you use Helm? Regardless, if you use an AWS ALB, then you'll need to set the correct annotation on the ingress.yaml:

annotations:
    kubernetes.io/ingress.class: alb

Beyond that, I haven't configured an ALB in 7 years :), so you'll have to look at the docs. The key consideration is that nginx offloads SSL inside the cluster. With an ALB, you'll offload SSL at the load balancer, so traffic between the load balancer and the cluster will be unencrypted. We do have some ongoing work to address this, but it's still in progress.
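
As a rough illustration of what TLS offload at the ALB can look like, the sketch below annotates an existing Ingress with annotations understood by the AWS Load Balancer Controller. Everything passed in is a placeholder: the namespace, the Ingress name, and the ACM certificate ARN are assumptions, the function name is made up, and target-type ip assumes pods are directly routable in the VPC. It has not been validated against a ForgeOps deployment.

```shell
# Annotate an existing Ingress so the AWS Load Balancer Controller
# provisions an internet-facing ALB that terminates TLS on port 443.
# All three arguments are placeholders supplied by the caller.
annotate_ingress_for_alb() {
    local ns ingress cert_arn
    ns=$1
    ingress=$2
    cert_arn=$3
    kubectl -n "$ns" annotate ingress "$ingress" --overwrite \
        "kubernetes.io/ingress.class=alb" \
        "alb.ingress.kubernetes.io/scheme=internet-facing" \
        "alb.ingress.kubernetes.io/target-type=ip" \
        "alb.ingress.kubernetes.io/listen-ports=[{\"HTTPS\":443}]" \
        "alb.ingress.kubernetes.io/certificate-arn=$cert_arn"
}

# Example usage (names are illustrative only):
#   annotate_ingress_for_alb iam-test my-ingress arn:aws:acm:REGION:ACCOUNT:certificate/ID
```

Newer Kubernetes versions prefer spec.ingressClassName over the kubernetes.io/ingress.class annotation, so check which form your controller version expects.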

@paulbsch any more considerations for ALBs?

sandeepk24 commented 3 months ago

Thank you @lee-baines, that helps. However, when I run the forgeops install command I am facing an issue:

./forgeops install ig --mini --deploy-env test --config-profile test -n iam-test --fqdn removed-this --debug
Flag --short has been deprecated, and will be removed in the future.
deployment manifest path: kustomize/"/data/forgeops/bin/../kustomize/deploy-test"
[DEBUG] Running: "kubectl version --client=true -o json"
[DEBUG] Running: "kubectl version -o json"
[DEBUG] Running: "kustomize version --short"
Flag --short has been deprecated, and will be removed in the future.
Checking cert-manager and related CRDs:
[DEBUG] Running: "kubectl get crd certificaterequests.cert-manager.io"
[DEBUG] Running: "kubectl get crd certificates.cert-manager.io"
[DEBUG] Running: "kubectl get crd clusterissuers.cert-manager.io"
cert-manager CRD found in cluster.
[DEBUG] Running: "kubectl -n cert-manager get deployment cert-manager -o jsonpath={.spec.template.spec.containers[0].image}"
Checking secret-agent operator and related CRDs:
[DEBUG] Running: "kubectl get crd secretagentconfigurations.secret-agent.secrets.forgerock.io"
secret-agent CRD found in cluster.

Checking secret-agent operator is running...
[DEBUG] Running: "kubectl wait --for=condition=Established crd secretagentconfigurations.secret-agent.secrets.forgerock.io --timeout=30s"
customresourcedefinition.apiextensions.k8s.io/secretagentconfigurations.secret-agent.secrets.forgerock.io condition met
[DEBUG] Running: "kubectl -n secret-agent-system wait --for=condition=available deployment --all --timeout=120s"
deployment.apps/secret-agent-controller-manager condition met
[DEBUG] Running: "kubectl -n secret-agent-system get pod -l app.kubernetes.io/name=secret-agent-manager --field-selector=status.phase==Running"
NAME                                               READY   STATUS    RESTARTS      AGE
secret-agent-controller-manager-74f6b575b8-tkztc   2/2     Running   2 (11d ago)   12d
secret-agent operator is running
[DEBUG] Running: "kubectl -n secret-agent-system get deployment secret-agent-controller-manager -o jsonpath={.spec.template.spec.containers[0].image}"
Checking ds-operator and related CRDs:
[DEBUG] Running: "kubectl get crd directoryservices.directory.forgerock.io"
ds-operator CRD found in cluster.
[DEBUG] Running: "kubectl -n fr-system get deployment ds-operator-ds-operator -o jsonpath={.spec.template.spec.containers[0].image}"
Traceback (most recent call last):
  File "./forgeops", line 431, in <module>
    main()
  File "./forgeops", line 410, in main
    utils.install_dependencies(args.legacy)
  File "/data/forgeops/bin/utils.py", line 675, in install_dependencies
    _, img, _ = run('kubectl', f'-n fr-system get deployment ds-operator-ds-operator -o jsonpath={{.spec.template.spec.containers[0].image}}',
  File "/data/forgeops/bin/utils.py", line 229, in run
    raise(e)
  File "/data/forgeops/bin/utils.py", line 223, in run
    _r = subprocess.run(shlex.split(runcmd), stdout=stdo_pipe, stderr=stde_pipe,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['kubectl', '-n', 'fr-system', 'get', 'deployment', 'ds-operator-ds-operator', '-o', 'jsonpath={.spec.template.spec.containers[0].image}']' returned non-zero exit status 1.
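
The traceback ends on a kubectl get deployment call that returned non-zero, which suggests the ds-operator Deployment is absent even though its CRD exists. A quick pre-check of the three operator Deployments that the debug output queries can narrow this down before rerunning forgeops. This is a minimal sketch: the namespace and Deployment names are taken from the log above, while the function name is made up.

```shell
# Print "namespace/name" for each operator Deployment that forgeops
# queries (per the debug log) but that is missing from the cluster.
# (Hypothetical helper, not part of forgeops.)
missing_operator_deployments() {
    local ns name
    while read -r ns name; do
        kubectl -n "$ns" get deployment "$name" >/dev/null 2>&1 </dev/null \
            || echo "$ns/$name"
    done <<EOF
cert-manager cert-manager
secret-agent-system secret-agent-controller-manager
fr-system ds-operator-ds-operator
EOF
}

# Example usage:
#   missing_operator_deployments
```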

lee-baines commented 3 months ago

Are you trying to just install IG?

sandeepk24 commented 3 months ago

Yes for now only IG. Running this on an AWS EKS cluster and trying to install mini for now. Tried adding all the roles and permissions mentioned in the document.

sandeepk24 commented 3 months ago

@lee-baines I ran the install with all components and hit a failure fetching from the cert-manager GitHub repo, so cert-manager could not be installed. utils.py is also failing.

./forgeops install --mini --deploy-env test --config-profile test -n iam-test --fqdn --debug
Could not verify Kubernetes server version. Continuing for now.
Flag --short has been deprecated, and will be removed in the future.
deployment manifest path: kustomize/"/data/forgeops/bin/../kustomize/deploy-test"
[DEBUG] Running: "kubectl version --client=true -o json"
[DEBUG] Running: "kubectl version -o json"
Could not verify Kubernetes server version. Continuing for now.
[DEBUG] Running: "kustomize version --short"
Flag --short has been deprecated, and will be removed in the future.
Checking cert-manager and related CRDs:
[DEBUG] Running: "kubectl get crd certificaterequests.cert-manager.io"
cert-manager CRD not found. Installing cert-manager.
[DEBUG] Running: "kubectl apply -f https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.crds.yaml "
error: error validating "https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.crds.yaml": error validating data: failed to download openapi: the server has asked for the client to provide credentials; if you choose to ignore these errors, turn validation off with --validate=false
Traceback (most recent call last):
  File "/data/forgeops/bin/utils.py", line 623, in install_dependencies
    run('kubectl', 'get crd certificaterequests.cert-manager.io',
  File "/data/forgeops/bin/utils.py", line 229, in run
    raise(e)
  File "/data/forgeops/bin/utils.py", line 223, in run
    _r = subprocess.run(shlex.split(runcmd), stdout=stdo_pipe, stderr=stde_pipe,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['kubectl', 'get', 'crd', 'certificaterequests.cert-manager.io']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./forgeops", line 431, in <module>
    main()
  File "./forgeops", line 410, in main
    utils.install_dependencies(args.legacy)
  File "/data/forgeops/bin/utils.py", line 631, in install_dependencies
    certmanager('apply', tag=REQ_VERSIONS['cert-manager']['DEFAULT'])
  File "/data/forgeops/bin/utils.py", line 746, in certmanager
    run('kubectl',
  File "/data/forgeops/bin/utils.py", line 229, in run
    raise(e)
  File "/data/forgeops/bin/utils.py", line 223, in run
    _r = subprocess.run(shlex.split(runcmd), stdout=stdo_pipe, stderr=stde_pipe,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['kubectl', 'apply', '-f', 'https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.crds.yaml']' returned non-zero exit status 1.

lee-baines commented 3 months ago

I see there is a related support ticket for this? I think this is related to your Kubernetes versions. Can you check your versions with kubectl version? Your local (client) version should be close to your cluster version.
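
To make "close" concrete: the Kubernetes version skew policy supports kubectl within one minor version of the API server. The sketch below computes the minor-version gap from the two minor fields reported by kubectl version -o json. The function name is hypothetical, and the usage lines assume jq is installed. Managed clusters such as EKS may report a minor like "27+", so non-digits are stripped first.

```shell
# Print the absolute difference between a client and a server minor
# version, ignoring suffixes such as the "+" in "27+".
# (Hypothetical helper name.)
minor_skew() {
    local client server diff
    client=$(printf '%s' "$1" | tr -cd '0-9')
    server=$(printf '%s' "$2" | tr -cd '0-9')
    diff=$((client - server))
    if [ "$diff" -lt 0 ]; then
        diff=$((-diff))
    fi
    echo "$diff"
}

# Example usage (assumes jq is available):
#   c=$(kubectl version -o json | jq -r '.clientVersion.minor')
#   s=$(kubectl version -o json | jq -r '.serverVersion.minor')
#   minor_skew "$c" "$s"
```

A result greater than 1 means the client is outside the supported skew and is worth aligning before debugging further.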

lee-baines commented 3 months ago

I found this thread https://stackoverflow.com/questions/56803534/failed-to-download-openapi-error-with-kubernetes-deployment

sandeepk24 commented 3 months ago

./certmanager-deploy.sh
"jetstack" already exists with the same configuration, skipping
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "ingress-nginx" chart repository
...Successfully got an update from the "jetstack" chart repository
Update Complete. ⎈Happy Helming!⎈
customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io unchanged
Error: INSTALLATION FAILED: Unable to continue with install: ClusterRole "cert-manager-controller-certificates" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "cert-manager"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "cert-manager"
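
The Helm error lists the exact label and annotations it expects on a pre-existing resource before it will adopt that resource into a release. One possible way forward, sketched below under the assumption that the existing cert-manager objects should be kept, is to add that metadata with kubectl. The function name is made up for illustration. A cert-manager install owns many cluster-scoped objects, so removing the earlier non-Helm install may be simpler than adopting each one.

```shell
# Add the ownership label and annotations that Helm requires before it
# will import an existing resource into a release.
# Arguments: kind, resource name, Helm release name, release namespace.
# (Hypothetical helper name.)
adopt_into_helm_release() {
    local kind name release release_ns
    kind=$1
    name=$2
    release=$3
    release_ns=$4
    kubectl label "$kind" "$name" --overwrite \
        "app.kubernetes.io/managed-by=Helm" &&
    kubectl annotate "$kind" "$name" --overwrite \
        "meta.helm.sh/release-name=$release" \
        "meta.helm.sh/release-namespace=$release_ns"
}

# Example usage, with values taken from the error message:
#   adopt_into_helm_release clusterrole cert-manager-controller-certificates cert-manager cert-manager
```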

sandeepk24 commented 3 months ago

Hey @lee-baines seeing this error now:

Traceback (most recent call last):
  File "/data/forgeops/bin/./forgeops", line 431, in <module>
    main()
  File "/data/forgeops/bin/./forgeops", line 410, in main
    utils.install_dependencies(args.legacy)
  File "/data/forgeops/bin/utils.py", line 675, in install_dependencies
    _, img, _ = run('kubectl', f'-n fr-system get deployment ds-operator-ds-operator -o jsonpath={{.spec.template.spec.containers[0].image}}',
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/forgeops/bin/utils.py", line 229, in run
    raise(e)
  File "/data/forgeops/bin/utils.py", line 223, in run
    _r = subprocess.run(shlex.split(runcmd), stdout=stdo_pipe, stderr=stde_pipe,
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['kubectl', '-n', 'fr-system', 'get', 'deployment', 'ds-operator-ds-operator', '-o', 'jsonpath={.spec.template.spec.containers[0].image}']' returned non-zero exit status 1.

sandeepk24 commented 3 months ago

This got fixed once we ran ds-operator.sh from the bin directory. But now we are seeing a bunch of application failure errors in the DS, AM, and IG apps. I sent you the logs in the frg community.