cookiecutter-openedx / cookiecutter-openedx-devops

Open edX Tutor on Kubernetes implemented with Terraform
GNU Affero General Public License v3.0

resource "kubectl_manifest" #21

Closed melsu90 closed 1 year ago

melsu90 commented 2 years ago

Describe the bug

This Terraform code doesn't work for me no matter what I try. The issue is around resource "kubectl_manifest".

Workflow

I have set up the environment with Ubuntu 20.04, Python 3.8, Terraform 1.2.8, Terragrunt 0.38 and cookiecutter 1.17.0, in a venv. I have been able to initialize all the modules and apply the VPC successfully. However, when I run apply for kubernetes I get stuck (I tried 6 times).

Additional context

Please find the error log below:

╷
│ Error: default failed to create kubernetes rest client for update of resource: Get "https://localhost/api?timeout=32s": remote error: tls: internal error
│
│   with kubectl_manifest.karpenter_provisioner,
│   on addon_karpenter.tf line 103, in resource "kubectl_manifest" "karpenter_provisioner":
│  103: resource "kubectl_manifest" "karpenter_provisioner" {
│
╵
╷
│ Error: karpenter/vpa-recommender-karpenter failed to create kubernetes rest client for update of resource: Get "https://localhost/api?timeout=32s": remote error: tls: internal error
│
│   with kubectl_manifest.vpa-karpenter,
│   on addon_karpenter.tf line 167, in resource "kubectl_manifest" "vpa-karpenter":
│  167: resource "kubectl_manifest" "vpa-karpenter" {
│
╵
╷
│ Error: metrics-server/vpa-recommender-metrics-server failed to create kubernetes rest client for update of resource: Get "https://localhost/api?timeout=32s": remote error: tls: internal error
│
│   with kubectl_manifest.vpa-metrics-server,
│   on addon_metrics-server.tf line 40, in resource "kubectl_manifest" "vpa-metrics-server":
│   40: resource "kubectl_manifest" "vpa-metrics-server" {
│
╵
╷
│ Error: monitoring/vpa-recommender-prometheus-kube-state-metrics failed to create kubernetes rest client for update of resource: Get "https://localhost/api?timeout=32s": remote error: tls: internal error
│
│   with kubectl_manifest.vpa-prometheus-kube-state-metrics,
│   on addon_prometheus.tf line 49, in resource "kubectl_manifest" "vpa-prometheus-kube-state-metrics":
│   49: resource "kubectl_manifest" "vpa-prometheus-kube-state-metrics" {
│
╵
╷
│ Error: monitoring/vpa-recommender-prometheus-grafana failed to create kubernetes rest client for update of resource: Get "https://localhost/api?timeout=32s": remote error: tls: internal error
│
│   with kubectl_manifest.vpa-prometheus-grafana,
│   on addon_prometheus.tf line 58, in resource "kubectl_manifest" "vpa-prometheus-grafana":
│   58: resource "kubectl_manifest" "vpa-prometheus-grafana" {
│
╵
╷
│ Error: monitoring/vpa-recommender-prometheus-operator failed to create kubernetes rest client for update of resource: Get "https://localhost/api?timeout=32s": remote error: tls: internal error
│
│   with kubectl_manifest.vpa-prometheus-operator,
│   on addon_prometheus.tf line 68, in resource "kubectl_manifest" "vpa-prometheus-operator":
│   68: resource "kubectl_manifest" "vpa-prometheus-operator" {
│
╵
ERRO[0035] Terraform invocation failed in /home/ubuntu/openedx-devops/terraform/stacks/live/kubernetes/.terragrunt-cache/x70M823Sr1V43YpPRhegqWqdxgo/1jDdKQp5TPKosylO883p-p1Fh3o/kubernetes prefix=[/home/ubuntu/openedx-devops/terraform/stacks/live/kubernetes]
ERRO[0035] Module /home/ubuntu/openedx-devops/terraform/stacks/live/kubernetes has finished with an error: 1 error occurred:

Could you please help me figure out where I am going wrong?

lpm0073 commented 2 years ago

Each of the errors is coming from a call to kubectl. At a glance, it appears that kubectl might not be installed on the machine on which you're executing the Terraform scripts. I'd suggest that you first verify that kubectl is installed and correctly configured to connect to your Kubernetes cluster. If it is, you'll be able to run a command such as kubectl get pods from your command line and see results. Once kubectl is working you should be good to go.
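As a rough illustration of that check, here is a minimal pre-flight sketch in POSIX shell. The function names `have_cmd` and `preflight` are hypothetical helpers made up for this example and are not part of this repo; the only external command used is `kubectl get pods`, mentioned above.

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: print one status line per check and
# return non-zero at the first check that fails.
preflight() {
  if ! have_cmd kubectl; then
    echo "kubectl: not installed"
    return 1
  fi
  echo "kubectl: installed"
  # `kubectl get pods` only succeeds when the kubeconfig
  # points at a cluster that can actually be reached.
  if ! kubectl get pods >/dev/null 2>&1; then
    echo "kubectl: cannot reach the cluster (check kubeconfig)"
    return 2
  fi
  echo "kubectl: connected"
}
```

Calling `preflight` before the apply step gives a quick yes/no answer on whether the machine can talk to the cluster at all.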

lpm0073 commented 2 years ago

Also, just in case you didn't already know this: you can automatically configure kubectl using an aws-cli helper command, as follows:

aws eks --region us-east-1 update-kubeconfig --name THE-NAME-OF-YOUR-CLUSTER

Change the --region flag, of course, to match the AWS data center region in which your AWS EKS cluster is implemented.

melsu90 commented 1 year ago

Thanks for the heads up. That was the exact problem, and it was solved by updating the kubeconfig file.

After resolving this issue I was finally able to run all of the terraform/terragrunt modules and spin up my environments.

In the process I noticed the following warning while running the apply command for redis:

╷
│ Warning: Attribute Deprecated
│
│   with module.redis.random_string.id,
│   on modules/elasticache/main.tf line 19, in resource "random_string" "id":
│   19:   number = false
│
│ NOTE: This is deprecated, use numeric instead.
│
│ (and one more similar warning elsewhere)
╵

Since it's a warning and not an error, and we have used number for multiple resources, I am assuming it's safe to ignore. However, on subsequent plan/apply commands I noticed:

module.redis.module.elasticache_parameter_group.aws_elasticache_parameter_group.this[0] has changed

~ resource "aws_elasticache_parameter_group" "this" {
    id   = "abcxyz"
    name = "abcxyz"

This is also the case with one more resource, as follows:

Terraform will perform the following actions:

module.eks.aws_security_group_rule.node["port_8443"] will be updated in-place

~ resource "aws_security_group_rule" "node" {

Is this normal/expected behaviour?

Further, I built openedx and the options without Credentials or License Manager. While deploying, it fails at MongoDB.

If I deploy Credentials, the deployment fails, and License Manager also fails to build. Also, License Manager has stepwisemath hardcoded in one place.

How can we tackle all these issues and test this project?

melsu90 commented 1 year ago

Regarding the MongoDB-related issue, what happens is:

  - name: Configure MongoDB
    uses: openedx-actions/tutor-k8s-configure-mongodb@v1.0.1
    with:
      namespace: ${{ env.NAMESPACE }}
      remote-server: "false"

Let's say the namespace here is example-global-prod.

This depends on openedx-actions/tutor-k8s-configure-mongodb, which in turn depends on openedx-actions/tutor-k8s-get-secret, which requires eks-namespace. The same value (i.e. example-global-prod) is provided as namespace. However, my terraform output suggests:

data.aws_eks_cluster.eks: Read complete after 1s [id=example-global-live]

Could this be the issue?

lpm0073 commented 1 year ago

I'll try to answer all of your questions here.

  1. Terraform screen output: everything in your screenshots is "normal" output. The deprecation warning concerns a low-level module that this Terraform code references. The module is maintained by AWS, so I'm assuming that they'll eventually take care of it. The other messages are completely normal Terraform output.
  2. The Credentials and License Manager options are actively under development. You should consider both of these to be experimental add-ons, and I would avoid them if you're just trying to get a platform up and running. I can confirm that the License Manager build also failed for me last night, but I've yet to begin any troubleshooting. Keep an eye on the versions of the openedx-actions repos: any with a version 0.0.x is under development, whereas production-ready modules have a version of at least 1.x.x.
  3. MongoDB installation problems: in your example, "example-global-live" is a stack namespace whereas "example-global-prod" is an environment namespace. MongoDB (at least, the server itself) is part of your infrastructure stack and thus should exist in the "example-global-live" namespace. Having said that, keep in mind that there are also environment-specific MongoDB settings that Terraform creates, so you'll find Terraform-generated MongoDB resources in both of these namespaces. Additionally, a general comment about the remote MongoDB server option: it works well, but the Terraform code is very brittle and I personally struggle with making even the most minor modifications to it. For getting started, you might consider avoiding this option in favor of the default Tutor-created mongodb Kubernetes pod, simply because it's an easier installation.

Lastly, a general comment about this Cookiecutter project: the main branch of this repo contains bleeding-edge features (at present, for example, License Manager and Certificates) which are not assured to work as expected in production environments. The named version releases, on the other hand, contain fully tested functionality.
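On the deprecation warning in item 1: if you want to see which .tf files in your own checkout still set the deprecated number argument, a plain text search is usually enough. The sketch below is only an illustration. `find_deprecated_number` is a hypothetical helper name, and because it matches on text alone it may also flag resources other than random_string that happen to have an argument called number.

```shell
# Hypothetical helper: list every line in *.tf files under a directory
# (default: the current directory) that assigns an argument named `number`.
find_deprecated_number() {
  # /dev/null is passed as an extra file so grep always prints the
  # file name, even when find hands it a single match.
  find "${1:-.}" -name '*.tf' -exec \
    grep -nE '^[[:space:]]*number[[:space:]]*=' /dev/null {} +
}
```

Each hit is printed as file:line:text, so the output points straight at lines such as modules/elasticache/main.tf line 19 from the warning.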