Open MrFrankBolt opened 5 months ago
Hello, I'm very sorry. An error occurred during unit conversion of the data collected by the k8s monitor. You can temporarily use this template to monitor k8s; we will fix this issue soon.
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# The monitoring type category:service-application service monitoring db-database monitoring custom-custom monitoring os-operating system monitoring
category: cn
# The monitoring type eg: linux windows tomcat mysql aws...
app: kubernetes
# The monitoring i18n name
name:
  zh-CN: Kubernetes
  en-US: Kubernetes
# The description and help of this monitoring type
help:
  zh-CN: HertzBeat 通过查询 Kubernetes ApiServer api 来对 kubernetes 的通用性能指标(nodes、namespaces、pods、services)进行采集监控。<br><span class='help_module_span'>注意⚠️:为了监控 Kubernetes 中的信息,则需要获取到可访问 Api Server 的授权 TOKEN,让采集请求获取到对应的信息,<a class='help_module_content' href='https://hertzbeat.apache.org/zh-cn/docs/help/kubernetes'>点击查看获取步骤</a>。</span>
  en-US: HertzBeat monitoring Kubernetes general metrics such as nodes, namespaces and pods through querying data from Kubernetes ApiServer api. <br><span class='help_module_span'>Note⚠️:In order to monitor the information of Kubernetes, Hertzbeat need to obtain the authorized TOKEN that can access Api Server. <a class='help_module_content' href='https://hertzbeat.apache.org/docs/help/kubernetes'>Click here to view the specific steps.</a></span>
  zh-TW: HertzBeat 通過查詢 Kubernetes ApiServer api 來對 kubernetes 的通用性能指標(nodes、namespaces、pods、services)進行采集監控。<br><span class='help_module_span'>注意⚠️:爲了監控 Kubernetes 中的信息,則需要獲取到可訪問 Api Server 的授權 TOKEN,讓采集請求獲取到對應的信息,<a class='help_module_content' href='https://hertzbeat.apache.org/zh-cn/docs/help/kubernetes'>點擊查看獲取步驟</a>。</span>
helpLink:
  zh-CN: https://hertzbeat.apache.org/zh-cn/docs/help/kubernetes
  en-US: https://hertzbeat.apache.org/docs/help/kubernetes
# Input params define for monitoring(render web ui by the definition)
params:
  # field-param field key
  - field: host
    # name-param field display i18n name
    name:
      zh-CN: 目标Host
      en-US: Target Host
    # type-param field type(most mapping the html input type)
    type: host
    # required-true or false
    required: true
  # field-param field key
  - field: port
    # name-param field display i18n name
    name:
      zh-CN: ApiServer端口
      en-US: ApiServer Port
    # type-param field type(most mapping the html input type)
    type: number
    # when type is number, range is required
    range: '[0,65535]'
    # required-true or false
    required: true
    # default value
    defaultValue: 6443
  # field-param field key
  - field: authType
    # name-param field display i18n name
    name:
      zh-CN: 认证方式
      en-US: Auth Type
    # type-param field type(radio mapping the html radio tag)
    type: radio
    # required-true or false
    required: true
    # when type is radio checkbox, use option to show optional values {name1:value1,name2:value2}
    options:
      - label: Bearer Token
        value: Bearer Token
    defaultValue: Bearer Token
  - field: token
    name:
      zh-CN: 认证Token
      en-US: Access Token
    type: text
    required: true
# collect metrics config list
metrics:
  # metrics - nodes
  - name: nodes
    # metrics scheduling priority(0->127)->(high->low), metrics with the same priority will be scheduled in parallel
    # priority 0's metrics is availability metrics, it will be scheduled first, only availability metrics collect success will the scheduling continue
    priority: 0
    # collect metrics content
    fields:
      # field-metric name, type-metric type(0-number,1-string), unit-metric unit('%','ms','MB'), label-whether it is a metrics label field
      - field: node_name
        type: 1
        i18n:
          zh-CN: 节点名称
          en-US: Node Name
      - field: is_ready
        type: 1
        i18n:
          zh-CN: 节点就绪状态
          en-US: Node Ready Status
      - field: capacity_cpu
        type: 0
        i18n:
          zh-CN: CPU 容量
          en-US: CPU Capacity
      - field: allocatable_cpu
        type: 0
        i18n:
          zh-CN: 可分配 CPU
          en-US: Allocatable CPU
      - field: capacity_memory
        type: 1
        i18n:
          zh-CN: 内存容量
          en-US: Memory Capacity
      - field: allocatable_memory
        type: 1
        i18n:
          zh-CN: 可分配内存
          en-US: Allocatable Memory
      - field: creation_time
        type: 1
        i18n:
          zh-CN: 创建时间
          en-US: Creation Time
    # (optional)metrics field alias name, it is used as an alias field to map and convert the collected data and metrics field
    aliasFields:
      - $.metadata.name
      - $.status.conditions[?(@.type=='Ready')].status
      - $.status.capacity.cpu
      - $.status.capacity.memory
      - $.status.allocatable.cpu
      - $.status.allocatable.memory
      - $.metadata.creationTimestamp
    # (optional)mapping and conversion expressions, use these and aliasField above to calculate metrics value
    # eg: cores=core1+core2, usage=usage, waitTime=allTime-runningTime
    calculates:
      - node_name=$.metadata.name
      - is_ready=$.status.conditions[?(@.type=='Ready')].status
      - capacity_cpu=$.status.capacity.cpu
      - allocatable_cpu=$.status.allocatable.cpu
      - capacity_memory=$.status.capacity.memory
      - allocatable_memory=$.status.allocatable.memory
      - creation_time=$.metadata.creationTimestamp
    # (optional)field unit mapping and conversion expressions, origin unit -> final unit
    protocol: http
    http:
      host: ^_^host^_^
      port: ^_^port^_^
      url: /api/v1/nodes
      method: GET
      ssl: true
      authorization:
        type: ^_^authType^_^
        bearerTokenToken: ^_^token^_^
      parseType: jsonPath
      parseScript: '$.items.*'
  - name: namespaces
    priority: 1
    fields:
      - field: namespace
        type: 1
        i18n:
          zh-CN: 命名空间
          en-US: Namespace
      - field: status
        type: 1
        i18n:
          zh-CN: 状态
          en-US: Status
      - field: creation_time
        type: 1
        i18n:
          zh-CN: 创建时间
          en-US: Creation Time
    aliasFields:
      - $.metadata.name
      - $.status.phase
      - $.metadata.creationTimestamp
    calculates:
      - namespace=$.metadata.name
      - status=$.status.phase
      - creation_time=$.metadata.creationTimestamp
    protocol: http
    http:
      host: ^_^host^_^
      port: ^_^port^_^
      url: /api/v1/namespaces
      method: GET
      ssl: true
      authorization:
        type: ^_^authType^_^
        bearerTokenToken: ^_^token^_^
      parseType: jsonPath
      parseScript: '$.items.*'
  - name: pods
    priority: 1
    fields:
      - field: pod
        type: 1
        i18n:
          zh-CN: Pod名称
          en-US: Pod Name
      - field: namespace
        type: 1
        i18n:
          zh-CN: 命名空间
          en-US: Namespace
      - field: status
        type: 1
        i18n:
          zh-CN: 状态
          en-US: Status
      - field: restart
        type: 1
        i18n:
          zh-CN: 重启次数
          en-US: Restart Count
      - field: host_ip
        type: 1
        i18n:
          zh-CN: 主机IP
          en-US: Host IP
      - field: pod_ip
        type: 1
        i18n:
          zh-CN: Pod IP
          en-US: Pod IP
      - field: creation_time
        type: 1
        i18n:
          zh-CN: 创建时间
          en-US: Creation Time
      - field: start_time
        type: 1
        i18n:
          zh-CN: 启动时间
          en-US: Start Time
    aliasFields:
      - $.metadata.name
      - $.metadata.namespace
      - $.status.phase
      - $.spec.restartPolicy
      - $.status.hostIP
      - $.status.podIP
      - $.metadata.creationTimestamp
      - $.status.startTime
    calculates:
      - pod=$.metadata.name
      - namespace=$.metadata.namespace
      - status=$.status.phase
      - restart=$.spec.restartPolicy
      - host_ip=$.status.hostIP
      - pod_ip=$.status.podIP
      - creation_time=$.metadata.creationTimestamp
      - start_time=$.status.startTime
    protocol: http
    http:
      host: ^_^host^_^
      port: ^_^port^_^
      url: /api/v1/pods
      method: GET
      ssl: true
      authorization:
        type: ^_^authType^_^
        bearerTokenToken: ^_^token^_^
      parseType: jsonPath
      parseScript: '$.items.*'
  - name: services
    priority: 1
    fields:
      - field: service
        type: 1
        i18n:
          zh-CN: 服务
          en-US: Service
      - field: namespace
        type: 1
        i18n:
          zh-CN: 命名空间
          en-US: Namespace
      - field: type
        type: 1
        i18n:
          zh-CN: 类型
          en-US: Type
      - field: cluster_ip
        type: 1
        i18n:
          zh-CN: 集群IP
          en-US: Cluster IP
      - field: selector
        type: 1
        i18n:
          zh-CN: 选择器
          en-US: Selector
      - field: creation_time
        type: 1
        i18n:
          zh-CN: 创建时间
          en-US: Creation Time
    aliasFields:
      - $.metadata.name
      - $.metadata.namespace
      - $.spec.type
      - $.spec.clusterIP
      - $.spec.selector
      - $.metadata.creationTimestamp
    calculates:
      - service=$.metadata.name
      - namespace=$.metadata.namespace
      - type=$.spec.type
      - cluster_ip=$.spec.clusterIP
      - selector=$.spec.selector
      - creation_time=$.metadata.creationTimestamp
    protocol: http
    http:
      host: ^_^host^_^
      port: ^_^port^_^
      url: /api/v1/services
      method: GET
      ssl: true
      authorization:
        type: ^_^authType^_^
        bearerTokenToken: ^_^token^_^
      parseType: jsonPath
      parseScript: '$.items.*'
# Change the type of the following two fields to 1
- field: capacity_memory
  type: 1
  i18n:
    zh-CN: 内存容量
    en-US: Memory Capacity
- field: allocatable_memory
  type: 1
  i18n:
    zh-CN: 可分配内存
    en-US: Allocatable Memory
# Comment out the units node
# units:
#   - capacity_memory=Ki->Mi
#   - allocatable_memory=Ki->Mi
Is there an existing issue for this?
Current Behavior
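I generated the k8s authentication token as described in the documentation and then started monitoring the cluster, but the monitor stays in the down state. The backend log shows the following: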
Expected Behavior
k8s should be monitored normally and the corresponding metrics should be collected.
Steps To Reproduce
Environment
Debug logs
2024-06-19 17:01:09.893 [339555282159616-kubernetes-nodes-9839] ERROR org.apache.hertzbeat.collector.dispatch.WorkerPool Line:48 - Thread Name 339555282159616-kubernetes-nodes-9839 : Character K is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
java.lang.NumberFormatException: Character K is neither a decimal digit number, decimal point, nor "e" notation exponential mark.
at java.base/java.math.BigDecimal.<init>(BigDecimal.java:582)
at java.base/java.math.BigDecimal.<init>(BigDecimal.java:467)
at java.base/java.math.BigDecimal.<init>(BigDecimal.java:896)
at org.apache.hertzbeat.collector.dispatch.unit.impl.DataSizeConvert.convert(DataSizeConvert.java:46)
at org.apache.hertzbeat.collector.dispatch.MetricsCollect.calculateFields(MetricsCollect.java:310)
at org.apache.hertzbeat.collector.dispatch.MetricsCollect.run(MetricsCollect.java:176)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
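The trace suggests that a value still carrying a unit suffix reached the `BigDecimal` constructor. The Kubernetes API typically reports node memory as a quantity string such as `16374128Ki`, and `BigDecimal` rejects the `K`. The sketch below is not HertzBeat's code; `QuantityDemo`, `parsesAsNumber`, and `kiToMi` are hypothetical names used only to reproduce the failure and to show one possible way of stripping the suffix before a Ki to Mi conversion.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class QuantityDemo {

    // Returns true only if BigDecimal can parse the raw string as-is.
    public static boolean parsesAsNumber(String raw) {
        try {
            new BigDecimal(raw);
            return true;
        } catch (NumberFormatException e) {
            // A quantity such as "16374128Ki" ends up here because of the 'K'.
            return false;
        }
    }

    // Hypothetical helper: strips a trailing "Ki" and converts the value to Mi,
    // rounded to two decimal places.
    public static BigDecimal kiToMi(String raw) {
        String trimmed = raw.trim();
        if (!trimmed.endsWith("Ki")) {
            throw new IllegalArgumentException("expected a Ki quantity: " + raw);
        }
        BigDecimal ki = new BigDecimal(trimmed.substring(0, trimmed.length() - 2));
        return ki.divide(BigDecimal.valueOf(1024), 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(parsesAsNumber("16374128Ki")); // suffix still attached
        System.out.println(parsesAsNumber("16374128"));   // digits only
        System.out.println(kiToMi("16374128Ki"));         // value converted to Mi
    }
}
```

Until the collector handles the suffix itself, declaring the two memory fields as strings (type 1) and leaving out the `units` conversion, as in the template above, avoids the numeric parse altogether.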
Anything else?
No response