open-cluster-management-io / community

open-cluster-management governance material.
https://open-cluster-management.io
Apache License 2.0

Introducing Open-Tracing into Kubernetes multi-cluster operators #154

Closed yue9944882 closed 4 months ago

yue9944882 commented 2 years ago

Background

About: Open-Cluster-Management

Open Cluster Management (OCM) is a community-driven project focused on multicluster and multicloud scenarios for Kubernetes apps. Open APIs are evolving within this project for cluster registration, work distribution, dynamic placement of policies and workloads, and much more.

About: Distributed-Tracing/Open-Telemetry

Distributed tracing is a method of observing requests as they propagate through distributed cloud environments. Distributed tracing follows an interaction by tagging it with a unique identifier. This identifier stays with the transaction as it interacts with microservices, containers, and infrastructure.

OpenTelemetry is a collection of tools, APIs, and SDKs. Use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) to help you analyze your software’s performance and behavior.

Project

Introducing Open-Tracing into Kubernetes multi-cluster operators

The concept of distributed tracing originated in microservice frameworks. Developers plumb a "trace-context" into the requests flowing between service instances, and each service uploads its activity, in the form of "traces" or "spans", to a centralized trace storage. As a result, an administrator can visualize not only the service topology graph but also fine-grained stats such as the time spent in each instance.
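To make the trace-context idea concrete, here is a minimal sketch using only the Python standard library. The header layout follows the W3C `traceparent` format (`00-<trace-id>-<span-id>-<flags>`); everything else (`TraceContext`, `Span`, `TraceStore`, `handle`) is a hypothetical name invented for illustration and is not the OpenTelemetry SDK API.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TraceContext:
    """Identifiers that travel with a request (modeled on W3C traceparent)."""
    trace_id: str  # 32 hex chars, shared by every span in one trace
    span_id: str   # 16 hex chars, identifies the span that sent the request

    def to_header(self) -> str:
        # version "00", trace-id, parent span-id, flags "01" (sampled)
        return f"00-{self.trace_id}-{self.span_id}-01"

    @staticmethod
    def from_header(header: str) -> "TraceContext":
        version, trace_id, span_id, _flags = header.split("-")
        if version != "00" or len(trace_id) != 32 or len(span_id) != 16:
            raise ValueError(f"malformed traceparent: {header!r}")
        return TraceContext(trace_id, span_id)


@dataclass
class Span:
    service: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    duration_ms: float


class TraceStore:
    """In-memory stand-in for the centralized trace storage (assumption)."""

    def __init__(self):
        self.spans = []

    def upload(self, span: Span) -> None:
        self.spans.append(span)

    def time_per_service(self, trace_id: str) -> dict:
        # Sum the recorded duration of each service within one trace.
        totals = {}
        for span in self.spans:
            if span.trace_id == trace_id:
                totals[span.service] = totals.get(span.service, 0.0) + span.duration_ms
        return totals


def handle(service: str, store: TraceStore, duration_ms: float,
           incoming: Optional[str] = None) -> str:
    """Record one span for `service`; return the header to send downstream."""
    parent = TraceContext.from_header(incoming) if incoming else None
    # Reuse the caller's trace ID, or start a new trace at the entry point.
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    store.upload(Span(service, trace_id, span_id,
                      parent.span_id if parent else None, duration_ms))
    return TraceContext(trace_id, span_id).to_header()
```

Calling `handle("frontend", store, 5.0)` and passing its return value as `incoming` to `handle("backend", store, 12.5, incoming=...)` yields two spans with the same trace ID, where the second span's parent is the first. The shared trace ID is what lets the store rebuild the topology and the per-service time cost.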

Likewise, in this project we will bring distributed tracing to a scenario where multiple Kubernetes clusters work in a "hub-spoke" paradigm: one "hub" Kubernetes cluster acts as the overall supervisor, sending tasks/prescriptions to the other workload "spoke" clusters. Inside each Kubernetes cluster there can be one or more "operators", i.e. controllers that subscribe to events from the cluster and keep reconciling/processing the corresponding resources upon notification. Similar to request tracing, we will inject the "trace-context" into the workflow of the Kubernetes operators across the clusters, so that each time an operator reconciles a resource, it also uploads a "trace" record to the remote storage. The overall administrator in the "hub" cluster will then be able to clearly visualize the interaction flow between the clusters.
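One possible way to carry the trace-context between operators is to store it on the resource itself, for example in an annotation. The sketch below simulates this with plain dictionaries. It is only an illustration under stated assumptions: the annotation key `example.io/traceparent`, the `ExampleWork` kind, the operator names, and the functions `reconcile` and `distribute` are all hypothetical, and the proposal does not specify how the context is actually propagated.

```python
import copy
import secrets

# Hypothetical annotation key; neither OCM nor OpenTelemetry defines it.
TRACE_ANNOTATION = "example.io/traceparent"


def new_traceparent(trace_id=None):
    """Build a W3C-style traceparent, continuing `trace_id` if one is given."""
    trace_id = trace_id or secrets.token_hex(16)  # 32 hex chars
    span_id = secrets.token_hex(8)                # 16 hex chars
    return f"00-{trace_id}-{span_id}-01"


def parse_traceparent(value):
    _version, trace_id, span_id, _flags = value.split("-")
    return trace_id, span_id


def reconcile(cluster, operator, resource, trace_store):
    """Process one resource and upload a span linked to the upstream span."""
    annotations = resource["metadata"].setdefault("annotations", {})
    upstream = annotations.get(TRACE_ANNOTATION)
    # Continue the upstream trace if the resource already carries a context.
    trace_id, parent_id = parse_traceparent(upstream) if upstream else (None, None)
    current = new_traceparent(trace_id)
    trace_id, span_id = parse_traceparent(current)
    # `trace_store` is a plain list standing in for the remote trace storage.
    trace_store.append({
        "cluster": cluster,
        "operator": operator,
        "resource": resource["metadata"]["name"],
        "trace_id": trace_id,
        "span_id": span_id,
        "parent_id": parent_id,
    })
    # Hand this span's context to whichever operator sees the resource next.
    annotations[TRACE_ANNOTATION] = current
    return resource


def distribute(resource):
    """Simulate the hub shipping a copy of the resource to a spoke cluster."""
    return copy.deepcopy(resource)


if __name__ == "__main__":
    store = []
    work = {"kind": "ExampleWork", "metadata": {"name": "deploy-app"}}
    reconcile("hub", "placement-controller", work, store)
    reconcile("spoke-1", "work-agent", distribute(work), store)
    for span in store:
        print(span["cluster"], span["operator"], span["parent_id"])
```

Because the spoke-side span records the hub-side span as its parent, the trace storage can reconstruct the hub-to-spoke interaction flow for a single resource.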

In our project, we will start by introducing distributed tracing to the core components of OCM by leveraging OpenTelemetry, which is both a specification and an implementation of the technique. The OpenTelemetry agents (or "collectors") should be automatically installed into the OCM environment as an addon, and these agents will collect the activities of the OCM components and push them to the remote storage.

yhx-coder commented 2 years ago

Can this only be done in Go?

qiujian16 commented 4 months ago

This is done with an addon in the addon-contrib repo.

/close

openshift-ci[bot] commented 4 months ago

@qiujian16: Closing this issue.
