[BUG] No shards indexed after 25th Feb 23:59:990

edmitchellVS commented 2 years ago

Describe the issue

I recently ran through the upgrade process to address the Log4j (L4J) version issue, and everything was running fine: I could see data with no problem. I updated Ubuntu on Wed 23rd and the server was restarted on the morning of Friday 25th. All good so far. When I checked LME the following Wednesday (2nd March), no data was showing in the Discover section for the last 15 minutes. I widened the time filter until it reached 25th Feb and could see that the last log was from 25th Feb @ 23:59:990. I had a look at the shards and the most recent index was indeed from 25th Feb. However, Winlogbeat on the event forwarder still seems to be sending logs to LME error free, and I am not sure how to check the logs on the LME server. Could this be caused by my not following the upgrade process from v0.3 to v0.4 correctly, by the L4J upgrade fix, or by something else?

To Reproduce Steps to reproduce the behavior:

Go to 'Analytics / Discover'
Click on 'refresh'
Scroll down to '....'
See error: "No results match your search criteria. Expand your time range. Try searching over a longer period of time."
Expand the time range back to 25th Feb and data appears

Expected behavior: events from the last 15 minutes should be shown.

Windows Event Collector (please complete the following information):

OS: Windows Server 2019
WEC Config: V0.1
Winlogbeat Config: Unsure, but it is the latest version released in response to L4J. Build hash = 1907c246c8b0d23ae4027699c44bf3fbef57f4a4
Winlogbeat.exe version: [winlogbeat-7.13.4-windows-x86_64]
sysmon config: [052754edd202e1380657c8d9d5d6ec49]
sysmon executable: 13.10

Linux Server (please complete the following information):

Docker: [e.g. Docker version 18.09.3] UNSURE
Docker compose stack file version: [e.g. version 0.1] UNSURE
Linux: Ubuntu 20.04.4 LTS
Logstash Version: [e.g. #LME logstash config V0.1] UNSURE

adam-ncc commented 2 years ago

Hey @edmitchellVS, if you were getting data in with no issues up to this point, it seems unlikely to be an issue with the v0.4 upgrade, and I believe the minor updates shouldn't cause any compatibility issues. You may have run out of disk space, as discussed here; would you be able to check whether that is the issue?

It would also be useful if you could post the logs (minus any sensitive information) from logstash/kibana/elastic using the following commands, which may help us to diagnose the problem:

sudo docker service logs lme_elasticsearch --tail 20 --timestamps
sudo docker service logs lme_kibana --tail 20 --timestamps
sudo docker service logs lme_logstash --tail 20 --timestamps

Thanks

edmitchellVS commented 2 years ago

Hi Adam,

Many thanks for this. I ran the commands above and I can see where the error is.

:response=>{"index"=>{"_index"=>"winlogbeat-09.03.2022", "_type"=>"_doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [4] shards, but this cluster currently has [1000]/[1000] maximum normal shards open;"}}}}
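A rejection like the one above can be easy to miss in a long log. As a minimal sketch, the function below reads log lines on standard input and prints the index name, the number of shards the request would add, the number currently open, and the limit. The function name `shard_limit_errors` is hypothetical, and the pattern assumes the message wording shown in the error above.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints, for each
# shard-limit rejection, "<index> <requested> <open> <max>".
# The pattern assumes the message wording quoted in this thread.
shard_limit_errors() {
  sed -n 's/.*"_index"=>"\([^"]*\)".*would add \[\([0-9]*\)\] shards.*has \[\([0-9]*\)\]\/\[\([0-9]*\)\] maximum.*/\1 \2 \3 \4/p'
}

# Usage (assumes the lme_logstash service name used earlier in this thread):
#   sudo docker service logs lme_logstash --tail 200 2>&1 | shard_limit_errors
```

Lines that do not contain the rejection message produce no output, so an empty result means no such error was found in the lines that were scanned.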

I have changed the data retention to 180 days and this now seems to have fixed the issue... Many, many thanks for this!
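A rough back-of-the-envelope sketch of why a shorter retention period helps: the limit of 1000 shards and the figure of 4 shards per new index are taken from the error message above, while treating every daily winlogbeat index as exactly 4 shards and ignoring all other indices are simplifying assumptions, so real counts will differ. The helper name `shards_for_retention` is hypothetical.

```shell
#!/bin/sh
# Illustrative figures: 1000 and 4 come from the error message in this thread.
# Assumption: every daily index uses 4 shards and no other indices exist.
max_shards=1000
shards_per_index=4

# Hypothetical helper: shards held by N days of daily indices.
shards_for_retention() {
  echo $(( $1 * shards_per_index ))
}

# Number of daily indices that fit under the limit.
days_until_full=$(( max_shards / shards_per_index ))

echo "limit reached after about ${days_until_full} daily indices"
echo "180-day retention holds about $(shards_for_retention 180) shards"
```

Under these assumptions the limit is reached after about 250 daily indices, while 180 days of retention holds about 720 shards, which leaves room for new daily indices to be created.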

Sorry, I also forgot to mention another error message I was getting...

"4 of 708 shards failed The data you are seeing might be incomplete or wrong."

This happens when going to the user investigator tab in the security dashboard. Do you think the retention change will fix that too, or do I need to do something else?

Thanks again

Ed

edmitchellVS commented 2 years ago

Issue now completely resolved. Thanks again for your help with this one 👍

ukncsc / lme

[BUG] No shards indexed after 25th Feb 23:59:990 #132