onnorom opened this issue 5 years ago (status: Open)
I'm having the same issue.
Profiling is enabled on all volumes and the collectors are configured for gluster-exporter, but there are still no profile metrics.
$ glusterd -V
glusterfs 4.1.8
I've had an issue with a similar symptom: #151
To check if it's the same root cause, could you please try to set
log-level = "debug"
in your gluster-exporter.toml? If you're seeing logs like
level=debug msg="Error getting profile info" error="exit status 1" volume=[volume_name]
it's probably the same issue.
Thanks! That was the same issue and it's fixed now.
Start Profiling
You must start profiling to view the File Operation information for each brick.
To start profiling, use the following command:
# gluster volume profile <VOLNAME> start
For example, to start profiling on test-volume:
# gluster volume profile test-volume start
Profiling started on test-volume
When profiling on the volume is started, the following additional options are displayed in the Volume Info:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
enable "gluster_volume_profile" in config see config template https://github.com/gluster/gluster-prometheus/blob/master/extras/conf/gluster-exporter.toml.sample
[collectors.gluster_volume_profile]
name = "gluster_volume_profile"
sync-interval = 5
disabled = false
This happens because only the leader node is selected to collect profile data: the exporter treats the peer with the maximum UUID (lexicographically) as the leader.
With a peer list like the one below, profiling data is only collected on the node with the maximum UUID (b2157fd6-4d7a-485e-b21d-1c3785ab3fbd), because it is the leader:
[root]# gluster pool list
UUID Hostname State
91acc359-eee7-4faf-b47b-692351bd3fd9 192.63.1.19 Connected
b2157fd6-4d7a-485e-b21d-1c3785ab3fbd 192.63.1.18 Connected
5a8c3b1f-21e2-4657-baf7-48fe272fcbfc 192.63.1.110 Connected
57f9c5fa-2dfa-4fc7-912c-619cfb047170 192.63.1.16 Connected
a0a13141-b402-46ca-97a2-5d3703283626 10.63.1.17 Connected
13b99272-b7e4-4aee-b3bf-ec8d456c04e8 localhost Connected
In the code: https://github.com/gluster/gluster-prometheus/blob/master/pkg/glusterutils/exporterd.go
// IsLeader returns true or false based on whether the node is the leader of the cluster or not
func (g *GD1) IsLeader() (bool, error) {
	setDefaultConfig(g.config)
	peerList, err := g.Peers()
	if err != nil {
		return false, err
	}
	peerID, err := g.LocalPeerID()
	if err != nil {
		return false, err
	}
	var maxPeerID string
	// This for loop iterates among all the peers and finds the peer with the maximum UUID (lexicographically)
	for i, pr := range peerList {
		if pr.Online {
			if peerList[i].ID > maxPeerID {
				maxPeerID = peerList[i].ID
			}
		}
	}
	// Checks and returns true if maximum peerID is equal to the local peerID
	if maxPeerID == peerID {
		return true, nil
	}
	return false, nil
}
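To see that selection play out on the pool list above, here is a small standalone Go sketch (an illustration of the UUID comparison only, assuming all peers are online):

package main

import "fmt"

func main() {
	// UUIDs from the `gluster pool list` output above
	peerIDs := []string{
		"91acc359-eee7-4faf-b47b-692351bd3fd9",
		"b2157fd6-4d7a-485e-b21d-1c3785ab3fbd",
		"5a8c3b1f-21e2-4657-baf7-48fe272fcbfc",
		"57f9c5fa-2dfa-4fc7-912c-619cfb047170",
		"a0a13141-b402-46ca-97a2-5d3703283626",
		"13b99272-b7e4-4aee-b3bf-ec8d456c04e8",
	}
	// Same rule as IsLeader: keep the lexicographically largest UUID
	var maxPeerID string
	for _, id := range peerIDs {
		if id > maxPeerID {
			maxPeerID = id
		}
	}
	fmt.Println("leader:", maxPeerID)
	// Prints: leader: b2157fd6-4d7a-485e-b21d-1c3785ab3fbd
	// so only the exporter on 192.63.1.18 reports profile metrics
}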
@khalid151 @Neraud @onnorom @csabahenk There's no problem with the collector itself; if you deploy the exporter on all nodes, you can get the volume profile metrics: https://github.com/gluster/gluster-prometheus/issues/147#issuecomment-743010344
Hello, I followed your method but it did not solve the problem.
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
auth.allow: *
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
gluster-exporter.toml
[globals]
gluster-cluster-id = ""
gluster-mgmt = "glusterd"
glusterd-dir = "/var/lib/glusterd"
gluster-binary-path = "gluster"
# If you want to connect to a remote gd1 host, set the variable gd1-remote-host
# However, using a remote host restrict the gluster cli to read-only commands
# The following collectors won't work in remote mode : gluster_volume_counts, gluster_volume_profile
#gd1-remote-host = "localhost"
gd2-rest-endpoint = "http://localhost:24007"
port = 9713
metrics-path = "/metrics"
log-dir = "/var/log/gluster-exporter"
log-file = "exporter.log"
log-level = "info"
# cache-ttl-in-sec = 0, disables caching
cache-ttl-in-sec = 30
# by default caching is turned off
# to enable caching, add the function-name to 'cache-enabled-funcs' list
# supported functions are,
# 'IsLeader', 'LocalPeerID', 'VolumeInfo'
# 'EnableVolumeProfiling', 'HealInfo', 'Peers',
# 'Snapshots', 'VolumeBrickStatus', 'VolumeProfileInfo'
cache-enabled-funcs = [ 'IsLeader', 'LocalPeerID', 'VolumeInfo' ]
[collectors.gluster_ps]
name = "gluster_ps"
sync-interval = 5
disabled = false
[collectors.gluster_peer_counts]
name = "gluster_peer_counts"
sync-interval = 5
disabled = false
[collectors.gluster_peer_info]
name = "gluster_peer_info"
sync-interval = 5
disabled = false
[collectors.gluster_brick]
name = "gluster_brick"
sync-interval = 5
disabled = false
[collectors.gluster_brick_status]
name = "gluster_brick_status"
sync-interval = 15
disabled = false
[collectors.gluster_volume_counts]
name = "gluster_volume_counts"
sync-interval = 5
disabled = false
[collectors.gluster_volume_status]
name = "gluster_volume_status"
sync-interval = 5
disabled = false
[collectors.gluster_volume_heal]
name = "gluster_volume_heal"
sync-interval = 5
disabled = false
[collectors.gluster_volume_profile]
name = "gluster_volume_profile"
sync-interval = 5
disabled = false
I don't have gluster_thinpool_metadata_* either. @limiao2008
Hi, I am wondering if there's anything I may be missing that needs to be enabled in order for me to get gluster_volume_profile_* measurements. I have set the necessary collectors in the /etc/gluster-exporter/gluster-exporter.toml file as below:
[collectors.gluster_volume_profile]
name = "gluster_volume_profile"
sync-interval = 5
disabled = false
[collectors.gluster_volume_counts]
name = "gluster_volume_counts"
sync-interval = 5
disabled = false
[collectors.gluster_volume_heal]
name = "gluster_volume_heal"
sync-interval = 5
disabled = false
However, I don't see any measurements collected with those names.
I do see the following measurements:
gluster_brick_capacity_bytes_total
gluster_brick_capacity_free_bytes
gluster_brick_capacity_used_bytes
gluster_brick_inodes_free
gluster_brick_inodes_total
gluster_brick_inodes_used
gluster_brick_lv_metadata_percent
gluster_brick_lv_metadata_size_bytes
gluster_brick_lv_percent
gluster_brick_lv_size_bytes
gluster_brick_up
gluster_cpu_percentage
gluster_elapsed_time_seconds
gluster_memory_percentage
gluster_process:gluster_cpu_percentage:avg1h
gluster_process:gluster_elapsed_time_seconds:rate5m
gluster_process:gluster_memory_percentage:avg1h
gluster_resident_memory_bytes
gluster_subvol_capacity_total_bytes
gluster_subvol_capacity_used_bytes
gluster_vg_extent_alloc_count
gluster_vg_extent_total_count
gluster_virtual_memory_bytes
gluster_volume_heal_count
gluster_volume_split_brain_heal_count