CC @AlanGreene (have not tested with the latest PR, not sure the quickest way to get the API updates from HEAD without building it)
@dasxx17 there have been a few changes to external logs since v0.18. You could try the latest nightly build which includes the fix for the open/download links: https://storage.googleapis.com/tekton-releases-nightly/dashboard/latest/tekton-dashboard-release.yaml
Let me know if that resolves the issue for you. I'll most likely be releasing v0.20 on Monday so that build should be pretty close to what will be included.
There was an issue in at least one scenario where v0.19 wasn't correctly proxying requests to the external logs service that has since been fixed.
v0.20 is out now. I re-tested the external logs walkthrough in a clean environment with Dashboard v0.20 and it's working as expected. 🤞
If the new release doesn't resolve the problem for you we can reopen this issue.
@AlanGreene thanks for the reply. I applied v0.20 and deleted my entire tools namespace. I then redid the entire walkthrough verbatim (with the exception of modifying the ACCESSKEY and SECRETKEY for the S3 bucket) and am still unable to download the log files or have the logs fall back successfully. All my log pods are still healthy, as are my Tekton pods, and I am able to view the log files in the minio UI (just not their content when I download them). Please let me know if there is any other info I can provide to help with this (my k8s cluster version is 1.18, not sure which version you are testing on). Thank you!
I've been testing on IKS (1.20), OpenShift CodeReady Containers (OpenShift 4.7), and kind (Kubernetes 1.20 + 1.21).
I'll run through the walkthrough again on a new kind cluster with 1.18 and see how it goes
/reopen
@AlanGreene: Reopened this issue.
I've just run through the walkthrough with Kubernetes 1.18 and I had to make a few changes to the ingress definitions to work with that version. I'll update the docs to mention the minimum/recommended version for the walkthroughs.
Here are the steps I modified to match your versions:
kind create cluster --name walkthrough --config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.18.19@sha256:7af1492e19b3192a79f606e43c35fb741e520d195f96399284515f077b3b622c
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
EOF
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.26.0/release.yaml
kubectl apply -f https://storage.googleapis.com/tekton-releases/dashboard/previous/v0.20.0/tekton-dashboard-release.yaml
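Before creating the ingresses, it's worth making sure the Tekton pods are up. A minimal check (the timeout value is arbitrary):

# wait for all Tekton pods to report Ready
kubectl wait --for=condition=Ready pod --all -n tekton-pipelines --timeout=120s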
networking.k8s.io/v1 is not recognised in Kubernetes 1.18, revert to v1beta1:
kubectl apply -n tekton-pipelines -f - <<EOF
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: tekton-dashboard
spec:
  rules:
  - host: tekton-dashboard.127.0.0.1.nip.io
    http:
      paths:
      - backend:
          serviceName: tekton-dashboard
          servicePort: 9097
EOF
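At this point you can sanity-check the Dashboard ingress before moving on; a simple request against the nip.io host should return a response from the Dashboard (a quick sketch, assuming the ingress controller is listening on port 80 as in the walkthrough):

# should return the Dashboard UI
curl -i http://tekton-dashboard.127.0.0.1.nip.io/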
kubectl apply -n tools -f - <<EOF
kind: Service
apiVersion: v1
metadata:
  name: logs-server
  labels:
    app: logs-server
spec:
  ports:
  - port: 3000
    targetPort: 3000
  selector:
    app: logs-server
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: tekton-logs
spec:
  rules:
  - host: logs.127.0.0.1.nip.io
    http:
      paths:
      - backend:
          serviceName: logs-server
          servicePort: 3000
EOF
With these changes it worked as expected, so I would start by checking that your ingresses are healthy and connected correctly.
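For example, using the names from the walkthrough steps above:

# both ingresses should show an address and the expected hosts
kubectl get ingress -n tekton-pipelines
kubectl get ingress -n tools
# check the backend service and events for the logs ingress
kubectl describe ingress tekton-logs -n tools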
Next, open your browser dev tools, switch to the network tab, then navigate to the logs view in the Dashboard.
You should see a request for the pod logs, e.g.:
GET http://tekton-dashboard.127.0.0.1.nip.io/api/v1/namespaces/tekton-pipelines/pods/sample-tgq2c-gen-log-gdffk-pod-n9hmr/log?container=step-gen-log&follow=true
If you have deleted the TaskRun Pod this should fail with a 404 response, and you should see a subsequent request to the Dashboard external logs proxy:
GET http://tekton-dashboard.127.0.0.1.nip.io/v1/logs-proxy/tekton-pipelines/sample-tgq2c-gen-log-gdffk-pod-n9hmr/step-gen-log
This request is backed by the logs service created in the walkthrough, so you should be able to access it directly, e.g. for the request above:
http://logs.127.0.0.1.nip.io/logs/tekton-pipelines/sample-tgq2c-gen-log-gdffk-pod-n9hmr/step-gen-log
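For example, from the command line (same pod name as in the request above):

# should return the stored log content for that step
curl -i http://logs.127.0.0.1.nip.io/logs/tekton-pipelines/sample-tgq2c-gen-log-gdffk-pod-n9hmr/step-gen-log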
Thanks for the very detailed reply. I suspect there is indeed some issue with my Ingress, as I am not able to reach the Logs Server at my host name. Following your steps after deleting the pod, I received a 502 for the request to the Dashboard external logs proxy and a 404 for the pod logs request:
502:
https://my-host-name/v1/logs-proxy/tekton-pipelines/sample-rqgvk-gen-log-fls7q-pod-svh7r/step-gen-log
404:
https://my-host-name/api/v1/namespaces/tekton-pipelines/pods/sample-rqgvk-gen-log-fls7q-pod-svh7r/log?container=step-gen-log-2&follow=true
The only difference is that I am not able to access the endpoint directly.
I applied an ALB Ingress Controller for my Ingress; it is defined below:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: logging-ingress
  annotations:
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/load-balancer-attributes: routing.http2.enabled=true
    alb.ingress.kubernetes.io/success-codes: 404,200
    alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-2016-08
    alb.ingress.kubernetes.io/certificate-arn: $CERT
    alb.ingress.kubernetes.io/subnets: $SUBNETS
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    kubernetes.io/ingress.class: alb
spec:
  rules:
  - host: my-host-name
    http:
      paths:
      - path: /*
        backend:
          serviceName: logs-server
          servicePort: 3000
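A quick way to confirm the ALB was provisioned and to find its DNS name (this assumes the ingress lives in the same namespace as the logs-server service):

# the ADDRESS column should show the ALB's DNS name once it's ready
kubectl get ingress logging-ingress -n tools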
My service definition is the same as yours, except I also tried changing the type to NodePort instead of ClusterIP. Also, I am noticing that my logs-server pod keeps restarting with:
Normal Pulled 28m (x177 over 26h) kubelet Container image "node:14" already present on machine
Warning BackOff 4m15s (x861 over 26h) kubelet Back-off restarting failed container
However, the thing I am confused about is why I am unable to view the downloaded logs. The only deviation I have is that my minio server is just running locally, but I should still be able to download the logs and view them even if it's running on localhost (I think).
TLDR: It seems like I am running into some pain points with my Ingress mapping due to my corporate environment. I don't think it should matter that you are running IKS and I am running EKS, as the rest is the same.
Thank you so much for all your help. I will try to get the logs server up, but please let me know if you see some issue with me using an ALB Ingress Controller or if there is some other issue with my Ingress definition. I need to internally request a CNAME mapping with our corporate domain to get the host name mapping for the LB endpoint... however, I am getting a 404 when I curl the Load Balancer that is mapped to my logs-server service, so perhaps there is something wrong there (or with the frequent restarts). Any other information would be super helpful. Thanks!
The restarting logs-server is definitely unexpected; check the logs for that pod to see if there's any clue as to why it's continually failing. The 502 error response makes sense if the Dashboard is unable to proxy requests to the logs-server.
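For example (adjust the namespace if your logs-server lives elsewhere; <logs-server-pod> is a placeholder for the pod name returned by the first command):

# find the pod, then inspect the previous (crashed) container's logs and the pod events
kubectl -n tools get pods -l app=logs-server
kubectl -n tools logs <logs-server-pod> --previous
kubectl -n tools describe pod <logs-server-pod>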
When you say your minio server is running on localhost, do you mean on the host machine? How are you exposing it to services running in the cluster? Can you make a request directly from a pod in the cluster to your minio server and get a successful response, e.g. using curl?
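For example, a throwaway pod can be used to test connectivity from inside the cluster; <minio-host> and <minio-port> are placeholders for wherever your minio server is reachable, and /minio/health/ready is MinIO's readiness health-check endpoint:

# run a temporary curl pod and hit the minio health endpoint
kubectl run -it --rm curl-test --image=curlimages/curl --restart=Never -- \
  curl -v http://<minio-host>:<minio-port>/minio/health/ready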
I wouldn't expect the ALB to make any difference here, although it's definitely worth checking each of your ingresses to ensure you can successfully connect to their respective services.
Closing as this doesn't seem to be related to the Dashboard itself or the walkthrough, but more a config / environment issue with a custom setup. If you've managed to resolve your issue and think it might be helpful to others a PR to update the walkthrough with a short note describing the problem/fix would be welcome.
Describe the bug
Not sure if this is related to the lack of the open/download links feature for external logs in v0.19.0, but when configuring the fallback I am not seeing the logs appear again in my dashboard. I have the external-logs flag configured as per the tutorial:
But I am still seeing the Unable to fetch logs message when browsing tasks. I have verified that the logs are indeed appearing in my S3 bucket Minio server. All the pods are looking healthy (operator, server, fluentbit, fluentd), with the exception of some restarts on the server pod:
Normal  Pulled  61s  kubelet  Container image "node:14" already present on machine
I do not believe this would impact it, but I did deploy an ALB Ingress Controller to expose the logs-server deployment. Thanks for your help!
Steps to reproduce the bug
Follow https://github.com/tektoncd/dashboard/blob/main/docs/walkthrough/walkthrough-logs.md
Environment details
Kubernetes version: 1.18
Cloud-provider/provisioner: EKS
Versions:
- Tekton Dashboard: 0.18.1
- Tekton Pipelines: 0.26.0
- Tekton Triggers: 0.15.0
Install namespaces:
- Tekton Dashboard: tekton-pipelines
- Tekton Pipelines: tekton-pipelines
- Tekton Triggers: tekton-pipelines