Open ghost opened 2 years ago
Pinging @elastic/fleet (Team:Fleet)
Secondary Review is done.
Can you check the Kibana logs to see whether the error reason was that the agent is not upgradeable? I added a fix for that use case today, but if the error comes from the backend (agent or Fleet Server), the errors should already be reported correctly in the activity.
Hi @juliaElastic,
Thank you for looking into this.
However, this agent upgrade issue is caused by issue #139174.
Further, please find the Kibana logs for the above issue: Kibana_logs.txt
Please let us know if we are missing anything.
Thanks!
I see. This should be fixed in the latest Kibana version; previously, the error action results were not reported correctly. This is how it looks now with the latest changes:
Hi @juliaElastic,
Thank you for looking into this.
We will re-validate this issue on the latest Kibana version.
Thanks!
Hi @juliaElastic,
We have re-validated this issue on the latest 8.5.0 BC2 Kibana Staging environment and found that the issue is still reproducible.
Build details:
Version: 8.5.0 BC2
Build: 56806
Commit: dc769f45a5a6dafb0a8c8f0c0cabcced4df45e11
Below are the observations:
- `one` OR `more than one agent` after adding an incorrect URL in Agent Binary: no appropriate agent upgrade failed message (i.e. `X1 of X agents upgraded`, `A Problem occurred during this operation`) is available under the Today section in the Agent activity flyout if agents fail to upgrade to the latest version. Screen Recording and Screenshot:
- `less than` OR `equal to` Kibana version (8.5.0): an appropriate agent upgrade failed message (i.e. `X1 of X agents upgraded`, `A Problem occurred during this operation`) is available under the Today section in the Agent activity flyout if agents fail to upgrade to the latest version. Screenshot:

Hence, we are re-opening this issue.
Please let us know if we are missing anything.
Thanks!
@prachigupta-qasource I can't reproduce this locally, could you share the logs from agent, fleet server and kibana?
Hi @juliaElastic,
Please find the steps to reproduce the above issue:
- Add the incorrect URL `https://test.elastic.co/downloads/` in Agent Binary under Fleet > Settings.
- Upgrade `one` OR `more than one` agents.
- Click the `Agent activity` link.
- Observe the `X agent/agents upgraded` text on the Agent activity flyout.

Agent Logs:
elastic-agent-diagnostics-2022-10-04T09-54-34Z-00.zip
Fleet Server Logs:
We are unable to fetch the Fleet Server logs because this is a hosted cloud environment.
Kibana Logs:
Please let us know if we are missing anything.
Thanks!
@prachigupta-qasource Please share the cloud link, so I can look at the instance in cloud admin to check the logs.
At step 2, did you update the Elastic Artifacts Host or did you add a new entry? If a new one, did you set it to default?
I am asking because I don't see any matches for https://test.elastic.co/downloads/ in the Elastic Agent logs.
I still can't reproduce this. When I try the steps, I see an error result.
I saw this error in the logs that you shared:
[elastic_agent][error] 2022-10-04T05:23:28-04:00 - message: Application: [16a94f9c-4165-477c-a210-64b8da0174a4]: State changed to FAILED: failed upgrade of agent binary: 2 errors occurred:
* package '/opt/Elastic/Agent/data/elastic-agent-d3eb3e/downloads/elastic-agent-8.5.0-linux-x86_64.tar.gz' not found: open /opt/Elastic/Agent/data/elastic-agent-d3eb3e/downloads/elastic-agent-8.5.0-linux-x86_64.tar.gz: no such file or directory
* call to 'https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.5.0-linux-x86_64.tar.gz' returned unsuccessful status code: 404
- type: 'ERROR' - sub_type: 'FAILED'
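To make the host mismatch easier to check, here is a minimal sketch that lists the hosts mentioned in a log excerpt and tests whether the configured Agent Binary URL appears among them. The function names `hostsInLog` and `logMentionsHost` are hypothetical helpers written for this comment, not part of Kibana or Elastic Agent. The sample log line is copied from the excerpt above.

```typescript
// Hypothetical helper: collect the distinct hosts of every http(s) URL in a text.
function hostsInLog(logText: string): string[] {
  // Capture everything between "://" and the next slash, space, or quote.
  const pattern = /https?:\/\/([^\/\s'"]+)/g;
  const hosts: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(logText)) !== null) {
    if (hosts.indexOf(match[1]) === -1) {
      hosts.push(match[1]);
    }
  }
  return hosts;
}

// Hypothetical helper: true if the host of the configured URL shows up in the log.
function logMentionsHost(logText: string, configuredUrl: string): boolean {
  const configured = hostsInLog(configuredUrl);
  if (configured.length === 0) {
    return false; // configuredUrl did not contain an http(s) URL
  }
  return hostsInLog(logText).indexOf(configured[0]) !== -1;
}

// Log line taken from the excerpt above.
const sampleLog =
  "* call to 'https://artifacts.elastic.co/downloads/beats/elastic-agent/" +
  "elastic-agent-8.5.0-linux-x86_64.tar.gz' returned unsuccessful status code: 404";

const mentionsConfigured = logMentionsHost(sampleLog, "https://test.elastic.co/downloads/");
```

For this log line, `hostsInLog` finds only `artifacts.elastic.co`, so `mentionsConfigured` is `false`. That matches the observation that the configured test URL does not appear in the agent logs.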
@michel-laterman Could you have a look at this issue? There seems to be an error on the Elastic Agent side during upgrade that does not appear to be reported correctly in the agent action results.
I found another issue that has similar logging errors, can we verify on BC3 if the issue is still reproducible?
@juliaElastic, just so I understand; the error message appears in the logs and is expected to appear in the UI, correct?
IIRC at the moment the elastic-agent sends a generic ack for most actions it receives that does not indicate a result (the application action that osquery uses is an exception to this).
@michel-laterman there is an Error field in ActionResult that indicates if something went wrong in the action. We use that field on the UI to indicate whether the action failed or not.
I suspect that the error field is not set, which is why the action looks successful on the UI. However, I can't reproduce this, so I can't verify the theory.
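To illustrate the theory, here is a minimal sketch of how a summary line could be derived from per-agent results that carry an optional error field. The types and the function `summarizeUpgrade` are hypothetical and written only for this comment; they are not the actual Kibana or Fleet Server code, and only the idea of an error field on the action result comes from the discussion above.

```typescript
// Hypothetical shape of one agent's result for an upgrade action.
interface ActionResultSketch {
  agentId: string;
  error?: string; // assumed to be set only when the agent reports a failure
}

// Hypothetical summary that a flyout could render.
interface ActionSummarySketch {
  total: number;
  succeeded: number;
  failed: number;
  title: string;
  hasProblem: boolean;
}

function summarizeUpgrade(results: ActionResultSketch[]): ActionSummarySketch {
  // A result counts as failed only if it carries a non-empty error.
  const failed = results.filter(
    (r) => r.error !== undefined && r.error !== ""
  ).length;
  const total = results.length;
  const succeeded = total - failed;
  const noun = total === 1 ? "agent" : "agents";
  const title =
    failed === 0
      ? `${total} ${noun} upgraded`
      : `${succeeded} of ${total} ${noun} upgraded`;
  return { total, succeeded, failed, title, hasProblem: failed > 0 };
}

// One agent reports the 404 from the log; the summary reflects the failure.
const withError = summarizeUpgrade([
  { agentId: "a" },
  { agentId: "b", error: "returned unsuccessful status code: 404" },
  { agentId: "c" },
]);

// Same upgrade, but the failing agent acks without an error field.
const withoutError = summarizeUpgrade([
  { agentId: "a" },
  { agentId: "b" },
  { agentId: "c" },
]);
```

Under these assumptions, `withError.title` is `2 of 3 agents upgraded`, while `withoutError.title` is `3 agents upgraded`, which is the symptom reported in this issue: a failed upgrade that is acked without an error looks identical to a successful one.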
Kibana version: 8.5 Kibana Staging environment
Host OS and Browser version: All, All
Build Details:
Preconditions:
Steps to reproduce:
- `Upgrade 3 agents` pop-up is shown.
- Click the `Agent activity` link.
- `Agent activity` flyout gets opened.
- `3 agents upgraded` is shown on the flyout.

Actual Result:
Expected Result:
Mock UI from Figma:
Screen Recording:
https://user-images.githubusercontent.com/97870262/191010887-2e11ce7f-ac08-473a-ac5b-ba35af01b766.mp4