Open fearful-symmetry opened 5 months ago
Pinging @elastic/fleet-qasource-external (Team:Fleet-QA)
Hi @fearful-symmetry
We tested this feature on the latest 8.15.0 SNAPSHOT Kibana cloud environment and have the following observations:
Observation Table:
S.no. | Host OS | Data under metricbeat-* | Data under metricbeat-* without --cgroupns=host | Non fatal error fetching PID some info | Error fetching PID info for | GetInfoForPid: |
---|---|---|---|---|---|---|
1 | Ubuntu 16.04 | Available | Available | No Errors observed | No Errors observed | No Errors observed |
2 | Ubuntu 20.04 | Available | Available | No Errors observed | No Errors observed | No Errors observed |
3 | Ubuntu 24.04 | Available | Available | No Errors observed | No Errors observed | No Errors observed |
4 | Rhel 7 | AWS Template not Working | AWS Template not Working | NA | NA | NA |
5 | Rhel 8 | Available | Available | No Errors observed | No Errors observed | No Errors observed |
6 | Rhel 9 | Available | Available | No Errors observed | No Errors observed | No Errors observed |
Artifact used: docker.elastic.co/beats/metricbeat:8.15.0-ee48b214-SNAPSHOT metricbeat
We were also getting authentication errors, so we added credentials to the install command:
sudo docker run --label co.elastic.metrics/module=system \
--mount type=bind,source=/proc,target=/hostfs/proc,readonly \
--mount type=bind,source=/sys/fs/cgroup,target=/hostfs/sys/fs/cgroup,readonly \
--mount type=bind,source=/,target=/hostfs,readonly \
--mount type=bind,source=/var/run/dbus/system_bus_socket,target=/hostfs/var/run/dbus/system_bus_socket,readonly \
--env DBUS_SYSTEM_BUS_ADDRESS='unix:path=/hostfs/var/run/dbus/system_bus_socket' \
--net=host --cgroupns=host docker.elastic.co/beats/metricbeat:8.15.0-ee48b214-SNAPSHOT metricbeat -e -E output.elasticsearch.hosts='https://host-url:443' \
-E output.elasticsearch.username='elastic' \
-E output.elasticsearch.password='password' \
-d '*'
- In elasticsearch, ensure that there are documents with metricset.name matching process
For this we have tested metricbeat-* under Discover tab.
- in the debug logs, ensure that there are no log lines that contain the strings:
For this we have searched the CLI logs where metricbeat is running
Logs with cgroups: with cg.txt
Logs without cgroups: without cg.txt
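The log search described above could also be scripted. The following is a minimal sketch, not the procedure QA actually used: `check_pid_errors` is a hypothetical helper name, and it assumes the metricbeat debug output was saved to a plain-text file.

```shell
#!/bin/sh
# Sketch: search a saved metricbeat debug log for the three strings
# listed in the test steps. check_pid_errors is a hypothetical helper.
# Returns 0 if the log is clean, 1 if any string was found (matching
# lines are printed with line numbers), 2 if grep itself failed.
check_pid_errors() {
    log_file="$1"
    status=0
    # -F: treat patterns as fixed strings; -n: prefix matches with line numbers.
    grep -n -F \
        -e 'Non fatal error fetching PID some info' \
        -e 'Error fetching PID info for' \
        -e 'GetInfoForPid:' \
        "$log_file" || status=$?
    case "$status" in
        0) return 1 ;;  # grep found at least one offending line
        1) return 0 ;;  # grep found nothing: log is clean
        *) return 2 ;;  # grep error, e.g. unreadable file
    esac
}

# Example usage (file names as attached above):
#   check_pid_errors "with cg.txt" && echo "no PID errors with --cgroupns=host"
#   check_pid_errors "without cg.txt" && echo "no PID errors without --cgroupns=host"
```

Mapping grep's exit status explicitly keeps a missing or unreadable log file from being mistaken for a clean run.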
Further, could you please share a working AWS RHEL 7 template? With the AWS RHEL 7 templates we are using, the errors below are observed when running any install command.
Please let us know if we are missing anything here.
cc: @pierrehilbert
Thanks!!
Looked at the logs, nothing seems suspicious.
@amolnater-qasource
> Further, could you please share a working AWS RHEL 7 template? With the AWS RHEL 7 templates we are using, the errors below are observed when running any install command.
I've tested this myself entirely in local VMs, so I can't comment on any AWS-specific configs needed.
Based on the screenshot above, it looks like the elasticsearch check may be incorrect. We need to check for the presence of documents with metricset.name = process, but the screenshot above appears to show process.name=metricbeat.
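For reference, the same check can be made outside Discover with the Elasticsearch `_count` API. This is only a sketch under assumptions: `process_count_url` and `count_process_docs` are hypothetical helper names, `ES_USER` and `ES_PASS` are assumed environment variables holding the credentials, and the host is whatever was passed to output.elasticsearch.hosts.

```shell
#!/bin/sh
# Sketch: count documents under metricbeat-* whose metricset.name is "process".
# Both function names and the ES_USER / ES_PASS variables are assumptions.

# Build the _count URL for a given Elasticsearch host (trailing slash tolerated).
process_count_url() {
    es_host="$1"
    printf '%s/metricbeat-*/_count?q=metricset.name:process\n' "${es_host%/}"
}

# Run the query; the JSON response contains a "count" field.
count_process_docs() {
    curl -s -u "${ES_USER}:${ES_PASS}" "$(process_count_url "$1")"
}

# Example usage:
#   ES_USER=elastic ES_PASS=password count_process_docs 'https://host-url:443'
```

A count greater than zero in the response would indicate that process metricset documents are present.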
Hi @fearful-symmetry
Thank you for the update. We have applied metricset.name : "process" under the Discover tab.
Screen Captures:
https://github.com/elastic/beats/assets/77374876/e73653c6-e7dd-4870-af17-dd36f9c74f33
Please let us know if we are still missing anything here.
Thanks!
The data looks correct, I guess we can make do without RHEL 7 for now.
@amolnater-qasource purely out of curiosity, can you run docker version | grep "Version" on all of the different hosts you've tested and return the result? I'd like to know if we're getting an even spread of different docker versions across the VMs. I suspect we're not, but I want to be sure.
Hi @fearful-symmetry
Thank you for the update.
We had docker version 20.10.7 on Ubuntu 16, and docker version 26.1.4 on all other OSes (Rhel 8, Rhel 9, Ubuntu 20, Ubuntu 24).
Please let us know if anything else is required from our end.
Further, for the regression testcases, could you please confirm whether we should create one testcase for any one Linux version, or testcases for all 5 OSes tested?
Thanks!
Yeah, I'm kind of tangentially worried about the versions of docker being used, since different docker engines could impact namespace settings, etc
> Further, for the regression testcases, could you please confirm whether we should create one testcase for any one Linux version, or testcases for all 5 OSes tested?
@amolnater-qasource not sure what you mean? By "regression testcases" do you mean running this test in future?
> Yeah, I'm kind of tangentially worried about the versions of docker being used, since different docker engines could impact namespace settings, etc
Do you want us to test it with different docker versions or any specific versions?
> @amolnater-qasource not sure what you mean? By "regression testcases" do you mean running this test in future?
Yes, as noted in the description, the tests need to be run in the future, so we need to convert them into testcases in the Fleet test suite. We just wanted to confirm whether we should create testcases for just 1 platform or for all platforms tested above.
Thanks!
> Yes, as noted in the description, the tests need to be run in the future, so we need to convert them into testcases in the Fleet test suite. We just wanted to confirm whether we should create testcases for just 1 platform or for all platforms tested above.
ah, alright. Yes, this should be run under multiple linuxes for future tests.
As far as Docker versions, I'm going to have to do some research and figure out if there's some particular docker engine changes we'd be interested in.
Hi @fearful-symmetry
We have created 05 testcases under Testmo for this feature under Fleet test suite at links:
Please let us know if any other scenario needs to be added from our end.
Thanks!
Hi Team,
We have executed 05 testcases under the Feature test run for the 8.15.0 release at the link:
Status:
PASS: 05
Build details: VERSION: 8.15.0 BC4 BUILD: 76261 COMMIT: 9d62937675e62265342e86d8f0db601dc75498b8 Artifact Link: docker.elastic.co/staging/metricbeat:8.15.0-a7432175
As the testing is completed on this feature, we are marking this as QA:Validated.
Please let us know if anything else is required from our end. Thanks
In the past months, we've run into a considerable number of bugs when it comes to monitoring host metrics while running under docker. I'm making these test steps in the hope that this can be a regular set of tests that are run with every release.
Steps to test
1) Run metricbeat via docker with the following:
2) In elasticsearch, ensure that there are documents with metricset.name matching process
3) In the debug logs, ensure that there are no log lines that contain the strings:
   - Non fatal error fetching PID some info
   - Error fetching PID info for
   - GetInfoForPid:
4) Repeat steps 1-3, but omit the --cgroupns=host config line:
Test Targets
This should be run under docker on linux, and preferably tested across a range of linux distros from our support matrix, at least: