Open fish-not-phish opened 4 months ago
I was looking through other issues and I noticed Issue 23 which talks about the Fleet server not coming back after reboot. I noticed the user mentioned version 8.5 still works. I went ahead and rolled back to that version within the repository history and that version does indeed work without any errors. A temporary fix, but obviously it would be preferred to run the most recent version.
Do you have this issue with 8.13?
If you want to test, you can do a ./elastic-container.sh destroy and then start fresh.
I'll also look into this.
I was using STACK_VERSION=8.12.2; I have not tried 8.13. I might go ahead and try to see if that works. I will let you know if 8.13 works or not.
Sadly, I have already destroyed and started fresh and I get the same result each time.
Doesn't appear to work with 8.13.
I tried changing STACK_VERSION=8.13.0 within .env.
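For anyone repeating this test across several versions, the .env edit can be scripted. This is a minimal sketch, not part of the repository: set_stack_version is a hypothetical helper name, and it assumes .env holds a plain STACK_VERSION=<version> line as quoted above.

```bash
# Hypothetical helper (not part of elastic-container.sh): pin STACK_VERSION
# in an env file, replacing the existing line or appending one if missing.
set_stack_version() {
  local env_file="$1" version="$2"
  local tmp
  if grep -q '^STACK_VERSION=' "$env_file"; then
    tmp="$(mktemp)"
    # Rewrite only the STACK_VERSION line; every other line passes through unchanged.
    sed "s/^STACK_VERSION=.*/STACK_VERSION=${version}/" "$env_file" > "$tmp"
    # Copy the contents back so the original file keeps its permissions.
    cat "$tmp" > "$env_file"
    rm -f "$tmp"
  else
    printf 'STACK_VERSION=%s\n' "$version" >> "$env_file"
  fi
}
```

Usage would be along the lines of set_stack_version .env 8.13.0, followed by ./elastic-container.sh destroy and ./elastic-container.sh start as suggested earlier in the thread.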
Tried 8.4.3 and it worked. However, the portal responds very slowly.
Tried 8.13.4, 8.12.2 and 8.12.0 and all of them failed.
Thanks for your patience. I will open an Issue upstream.
When I deployed, the Fleet server wasn't healthy ever. I ran sh elastic-container.sh restart and then everything was healthy and Fleet was available. That's not a good solution, but it can work as a temporary solution. I wonder if there is some race condition where if one of the other containers isn't up and healthy, Fleet chokes and doesn't self-heal.
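If the race-condition theory holds, one way to make the temporary workaround less manual is to wait until the other containers report healthy and only then restart Fleet. The following is a rough sketch under that assumption, not a confirmed fix. wait_for, container_is_healthy and restart_fleet_when_ready are hypothetical names, and the container names are the ones shown by the status command in this thread.

```bash
# wait_for ATTEMPTS DELAY CMD...: retry CMD until it succeeds or attempts run out.
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# True when Docker's own health check reports the container as healthy.
# Containers without a health check never pass this test.
container_is_healthy() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}

# Assumption: restarting only the Fleet container is enough once
# Elasticsearch and Kibana are healthy.
restart_fleet_when_ready() {
  wait_for 60 5 container_is_healthy ecp-elasticsearch &&
    wait_for 60 5 container_is_healthy ecp-kibana &&
    docker restart ecp-fleet-server
}
```

Calling restart_fleet_when_ready after ./elastic-container.sh start would wait up to about five minutes per container before giving up. It only automates the manual restart and does not address the underlying cause.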
I'll try a few "relies on" options.
Even when it is healthy in Kibana, it never shows healthy in Docker.
./elastic-container.sh status
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
ecp-elasticsearch docker.elastic.co/elasticsearch/elasticsearch:8.14.1 "/bin/tini -- /usr/l…" elasticsearch 5 minutes ago Up 5 minutes (healthy) 0.0.0.0:9200->9200/tcp, 9300/tcp
ecp-fleet-server docker.elastic.co/beats/elastic-agent:8.14.1 "/usr/bin/tini -- /u…" fleet-server 5 minutes ago Up 44 seconds 0.0.0.0:8220->8220/tcp
ecp-kibana docker.elastic.co/kibana/kibana:8.14.1 "/bin/tini -- /usr/l…" kibana 5 minutes ago Up 4 minutes (healthy) 0.0.0.0:5601->5601/tcp
But it wasn't healthy in Kibana until I did a restart. I tried just restarting the Fleet container and the whole stack. Both brought Fleet online.
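To pick out the containers that Docker does not mark as healthy without reading the table by eye, the status output can be filtered. This is a small sketch that assumes the status subcommand prints a table shaped like the one above, with a NAME header row and "(healthy)" in the STATUS column. not_healthy is a hypothetical helper name.

```bash
# Read a status table on stdin and print the name (first column) of every
# container whose row does not contain "(healthy)". Blank lines and the
# header row are skipped.
not_healthy() {
  awk 'NF > 0 && $1 != "NAME" && $0 !~ /\(healthy\)/ { print $1 }'
}
```

Usage would be ./elastic-container.sh status | not_healthy. As noted above, this reflects only Docker's view, so ecp-fleet-server can still be listed while Kibana shows Fleet as healthy.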
I'll follow up here with the Elastic Issue for tracking.
Thanks. Does that mean that with 8.14.1 the above issue can be fixed by manually restarting Kibana or all of the containers? Am I correct?
Your comment is appreciated.
I believe I tried it both ways and both worked.
I tried, but the Fleet server is not displayed, as shown in Screen 1. However, Screen 2 shows that the Fleet server is running.
Screen 1
Screen 2
Tried 8.14.1 once more, but the situation is the same.
I tested previous versions, and only version 8.8.2 could run elastic-container.sh and build everything successfully.
Any update on this? I'm still facing this issue with 8.14.3 as well.
Tried with 8.15.0 and it is still not working, as shown in the following.
Finally, after 2 hrs of troubleshooting, I found a workaround with 8.14.0. Use the code from this commit: https://github.com/peasead/elastic-container/tree/0ef92f1e7bce33ca5c42bbe545630fe18c5bf028. Copy the code from each file and replace it in your local files, then recheck that the .env file has STACK_VERSION=8.14.0. Try this and it should work. If you have more questions about deployment, reach out to me on LinkedIn and I can help: https://www.linkedin.com/in/saibatchu/
This actually seemed to work for me as well. I will update in 1-2 weeks if there are any health concerns regarding the fleet. I have a VM with a large amount of resources allocated to it, so there should not be any resource-related issues.
Hey @fish-not-phish I'm jumping in to get this issue fixed. I just pushed a change to main in the shell script that fixes an issue with Fleet settings being properly populated. I just tested on macOS standing up a fresh stack and everything works as advertised. Can you pull main again and try standing up a stack then letting me know if you still experience a problem?
You also should not have to change the LOCAL_KBN_URL value
I tested on Ubuntu with no luck, as shown in the following.
I'll do a test on Ubuntu today and see if I can't figure out what's going on.
Recently I used this repo to deploy ELK for testing and studying. It worked like a charm on my computer, so I decided to put it to work on a server, and there Fleet did not work.
After hours of debugging I realized it has deployment problems on current Debian Stable (bookworm) but works perfectly on current Debian Testing (trixie).
Maybe some of the cases above can be solved just by using another Docker host OS. Hope this will be useful to someone.
I don't have enough ELK skills to work out where the problem with this change lies.
Hello,
I am having some problems deploying this stack as it appears the script is not running as expected. I am running on Ubuntu 22.04 LTS, fresh install. The virtual machine has 8 CPU cores, 16GB of RAM, and 500GB of storage, so I don't suspect a resource issue.
I edited the .env file, changing these 4 items:
LOCAL_KBN_URL=https://192.168.0.X:5601
ELASTIC_PASSWORD=<redacted_password>
KIBANA_PASSWORD=<redacted_password>
WindowsDR=1
No other modifications were made.
When I run the script sudo ./elastic-container.sh start, it runs and appears to set up the necessary containers. However, the output does not match what would be expected. The output I get is this:
However, I never get this included in the output. It's simply missing:
When I try to go to the URL, it isn't up, and is only accessible if I run the restart option for the script: sudo ./elastic-container.sh restart. Then the URL becomes accessible, but the fleet settings are not configured.
When I run sudo ./elastic-container.sh status, this is the output:
I noticed that the ecp-fleet-server is online but is not denoted as "healthy". I checked the docker logs and observed connection refused errors:
Reading this, I do understand that there may be a connectivity issue; however, I am not running UFW and I have not altered iptables. The Proxmox firewall for this virtual machine is off, so that shouldn't have any impact either.
Any help would be greatly appreciated to get this working.