Closed jobimrobinsantos-drizly closed 2 years ago
Hi @jobimrobinsantos-drizly
Could you post what version of the agent you are trying to install? I assume latest, but want to make sure.
I'm setting threatstack_pkg: threatstack-agent, so it should be installing threatstack-agent=2*.
It looks like it's actually installing 3.0.0!
$ tsagent --version
tsagent version 3.0.0
$ apt list | grep threatstack
threatstack-agent/bionic,now 3.0.0.0ubuntu18.105 amd64 [installed]
threatstack-agent-support/bionic 1.6.0 all
I have changed my playbook to say threatstack_pkg: threatstack-agent=2* so that it will not install v3. I think that this part of the role is not doing what it is intended to do: https://github.com/threatstack/threatstack-ansible/blob/0e3c51e6a27c8d9b8d0325d0638fc4ab1d40e09c/tasks/apt_install.yml#L23-L27
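For readers following along, the pin described above could be written in a playbook roughly as follows. This is a minimal sketch, not the reporter's actual playbook: the `hosts` pattern and the role name `threatstack-ansible` are assumptions, and only `threatstack_pkg` and its value come from this thread.

```yaml
# Sketch only: host pattern and role name are assumptions.
- hosts: all
  become: true
  roles:
    - role: threatstack-ansible
      vars:
        # Pin to the 2.x series so apt does not pick up 3.0.0.
        threatstack_pkg: "threatstack-agent=2*"
```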
Will be looking into it, @jobimrobinsantos-drizly. Will report back what I find. Thanks again for the report!
So locally, running the tests I have with 18.04, I see it installing 2.5.0, and succeeding. Changing that line in tasks/apt_install.yml to threatstack-agent=3* also succeeds in installing the agent.
So a couple of follow-ups:
Is threatstack_pkg set to something for 3.0.0? Can you verify that?
What is systemctl status threatstack outputting?
Thanks again for the report; hopefully we can get to the bottom of this.
So looking some more on my side, I am getting a changed setup checksum file:
TASK [threatstack-ansible : Create file to track checksum of setup string] *****
changed: [localhost]
Whereas your runs appear to already have a file (it returns ok for you, not changed). This appears to lead your runs to skip checking whether the agent is stopped and to skip running the actual setup command that registers the agent. It also means the restart of the agent service is skipped.
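The gating behavior described above can be sketched as a few Ansible tasks. This is an illustration of the general pattern, not the role's actual implementation: `threatstack_setup_string` and `threatstack_setup_command` are hypothetical variable names, while the task name, checksum path, and service name are taken from this thread.

```yaml
# Illustrative sketch; variable names are hypothetical.
- name: Create file to track checksum of setup string
  ansible.builtin.copy:
    content: "{{ threatstack_setup_string | hash('sha256') }}"
    dest: /opt/threatstack/etc/.setup_checksum
  register: setup_checksum_file

# Runs only when the checksum file was created or its content changed.
- name: Register the agent
  ansible.builtin.command: "{{ threatstack_setup_command }}"
  when: setup_checksum_file is changed

# Also skipped when the checksum task reports "ok".
- name: Restart the agent service
  ansible.builtin.service:
    name: threatstack
    state: restarted
  when: setup_checksum_file is changed
```

With this pattern, a checksum file left over from an earlier deployment makes the first task report ok, so neither the setup command nor the restart runs, even if the agent itself has since been replaced or deregistered.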
Could you check for a /opt/threatstack/etc/.setup_checksum file and, assuming it is there, the creation/last-modified date on it? I am guessing it will be a while ago. If I am correct, then deleting the file and rerunning will likely fix the immediate issue.
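The check and cleanup suggested above could also be done from Ansible rather than by hand. The tasks below are a sketch under assumptions: `threatstack_force_setup` is a hypothetical toggle introduced only for this example, and the path is the one named in this thread.

```yaml
- name: Inspect the setup checksum file
  ansible.builtin.stat:
    path: /opt/threatstack/etc/.setup_checksum
  register: setup_checksum

- name: Show when the checksum file was last modified
  ansible.builtin.debug:
    msg: "Last modified: {{ '%Y-%m-%d %H:%M:%S' | strftime(setup_checksum.stat.mtime) }}"
  when: setup_checksum.stat.exists

# Hypothetical toggle: set threatstack_force_setup=true to drop the
# stale file so the next run of the role performs setup again.
- name: Remove the stale checksum file
  ansible.builtin.file:
    path: /opt/threatstack/etc/.setup_checksum
    state: absent
  when: threatstack_force_setup | default(false) | bool
```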
Unfortunately I already charged ahead and redeployed to install tsagent 2.5.0. It should be noted that I had to uninstall it first since the apt task does not have allow_downgrade: true on it. That redeployment resulted in a different checksum.
The output above was from my attempt to reinstall threatstack to fix the failure, so I would not expect the checksum to have changed.
Side note: I've noticed a pattern of threatstack not starting back up after a server has been powered down for >24 hours. This is what I was investigating when I discovered that we had 3.0.0 installed. I'll have more info on this issue soon.
Hi @jobimrobinsantos-drizly
Regarding the side note you mentioned, that is expected behavior. The agent periodically communicates with our platform in the form of a "heartbeat" message. If the platform no longer receives these messages, the agent is revoked from the platform and requires a re-registration.
Powering down servers for more than 24 hours would lead to this.
Details regarding re-registration can be found here: https://threatstack.zendesk.com/hc/en-us/articles/205868529-Re-register-the-Threat-Stack-Linux-Agent
Happy to help further if you have any other questions/concerns.
To follow up on the issues you noted about this role, I think a flag to allow a downgrade install is definitely worth adding. And the role should definitely be deploying 3.0.0 as latest, not the latest 2.X, so that can be fixed too.
I'll leave this ticket open for the fixes.
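As a rough sketch of what such a downgrade flag might look like in an apt task: `threatstack_allow_downgrade` is a hypothetical variable name, not necessarily what the role uses, and the `allow_downgrade` option of the apt module is only available in newer Ansible releases (ansible-core 2.12 onward, as far as I know).

```yaml
# Sketch only: threatstack_allow_downgrade is a hypothetical variable.
- name: Install the Threat Stack agent, optionally permitting a downgrade
  ansible.builtin.apt:
    name: "{{ threatstack_pkg }}"
    state: present
    update_cache: true
    # Lets apt move from 3.0.0 back to a pinned 2.x without an uninstall first.
    allow_downgrade: "{{ threatstack_allow_downgrade | default(false) }}"
```

Defaulting the flag to false keeps the existing behavior for anyone who does not opt in.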
These issues should now be fixed in #89.
I noticed that some servers where we had previously installed threatstack via this role were not running the service, so I tried restarting the service manually. When that failed, I tried re-running the playbook, which had the same result as my manual attempt:
Unable to start service threatstack: Job for threatstack.service canceled.
All servers are running Ubuntu 18.04. I am using v5.0.0 of this role.
Here is my playbook (with some redactions):
Here is the output of the play: