ukncsc / lme

Logging Made Easy
Apache License 2.0

[BUG] Missing indices when opening dashboards #77

Closed MichaelGibsonAltrad closed 3 years ago

MichaelGibsonAltrad commented 4 years ago

Describe the issue: Brand new install of LME 0.3, for a user who doesn't have much Linux knowledge, using your convenience scripts. The install went fine without any issues or errors, and I can access the Kibana portal and see data from the test batch of machines which are checking in.

When opening various dashboards, we are seeing "Could not locate that index-pattern-field (id:". The config file imported correctly and I did reimport it successfully but this made no difference.

To Reproduce Steps to reproduce the behavior:

  1. Go to Dashboard
  2. Click on 'Security Dashboard'
  3. Click on 'Threats'
  4. Error in section 'Temporary Files in Downloads Folder' -> 'Could not locate that index-pattern-field (id: winlog.event_data.TargetFilename.keyword)'
  5. Error in section 'DNS Overview' -> 'Could not locate that index-pattern-field (id: winlog.event_data.QueryName.keyword)'
  6. Error in section 'Non Microsoft processes running in as admin' -> 'Could not locate that index-pattern-field (id: winlog.event_data.Image.keyword)'

There are errors on different dashboards with different index names as well.

Expected behavior Data to appear in the dashboard section.


MichaelGibsonAltrad commented 3 years ago

Does anybody have any advice on how to fix this issue?

Thanks Michael

MichaelGibsonAltrad commented 3 years ago

I've been looking at this error and I have noticed that some of the missing indices are only in the "dashboards v0.2.0.ndjson" file (I haven't checked them all yet), so when you import the v0.3.0 file it overwrites the winlogbeat index settings and wipes these out. Is there a way to merge-import both files into the config, or does it need copying and pasting from the v0.2.0 file into the v0.3.0 file?

a-d-a-m-b commented 3 years ago

Hi @MichaelGibsonAltrad. You only need to import the 0.3.0 file, as it holds the latest dashboards with the field references.

Regarding the other issues you are having, are they the same as #76?

MichaelGibsonAltrad commented 3 years ago

Hi Adam B. That's what I thought when first setting this up, but if you examine the v0.3.0 file in a text editor, the named indices are not there, e.g. "winlog.event_data.TargetFilename.keyword", yet they are in the v0.2.0 file. Any ideas?

Thanks Michael

a-d-a-m-b commented 3 years ago

There are 3 instances of winlog.event_data.TargetFilename.keyword in the dashboards v0.3.0.ndjson file, and more without the .keyword suffix.

https://raw.githubusercontent.com/ukncsc/lme/master/Chapter%204%20Files/dashboards%20v0.3.0.ndjson

Can you remove all objects and re-install them using a fresh v0.3.0 file (and also find/replace your domain or IP)?
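(For reference, one way to reproduce the occurrence counts mentioned above locally, assuming both ndjson files have been downloaded to the current directory under their original names:)

# count every occurrence of the field name, not just the number of matching lines
grep -o 'winlog.event_data.TargetFilename.keyword' 'dashboards v0.3.0.ndjson' | wc -l
grep -o 'winlog.event_data.TargetFilename.keyword' 'dashboards v0.2.0.ndjson' | wc -l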

MichaelGibsonAltrad commented 3 years ago

I've downloaded the file again, changed the relevant entries and deleted everything from under Saved Objects. I've imported the new file (354 objects imported, which matches the number at the bottom of the file) then went into Index Patterns -> winlogbeat-* and searched for the missing indices the dashboards are complaining about and winlog.event_data.TargetFilename.keyword is still not there.

If I do the above and import the v0.2.0 file it shows up the missing indices in Index Patterns, so it definitely looks like there's something missing in the v0.3.0 file.
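(One way to double-check what actually ended up in the imported index pattern, without clicking through the UI, is to export the saved objects over the Kibana API and grep the result; the host, port and credentials below are assumptions about a typical install rather than LME-specific values:)

curl -sk -u elastic:<password> -X POST 'https://<kibana-host>/api/saved_objects/_export' \
  -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
  -d '{"type":"index-pattern"}' \
  | grep -c 'winlog.event_data.TargetFilename.keyword'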

MichaelGibsonAltrad commented 3 years ago

I've been comparing the v0.2.0 ndjson file to the v0.3.0 file, and the definitions are missing. In v0.3.0, on the first line beginning:

{"attributes":{"fieldFormatMap

it's missing the definition for some of the indices which I'm seeing in the dashboard errors. If I find the entry on that line (see below), copy everything between the entry's curly braces {}, paste it into the v0.3.0 file, change the updated_at date/time right at the end, and then reimport it, the index is now in the indices list:

{\"name\":\"winlog.event_data.TargetFilename.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.TargetFilename\",\"subType\":\"multi\"}

MichaelGibsonAltrad commented 3 years ago

I've been comparing the v0.2.0 ndjson file to the v0.3.0 file, and from my extremely limited understanding of this file, the definitions seem to be missing. In v0.3.0 on line 1 beginning:

{"attributes":{"fieldFormatMap

it's missing the definitions for some of the indices which I'm seeing in the dashboard errors. If I find the same entry in the v0.2.0 file (line 3), copy everything between the entry's curly braces {} related to the missing indices, paste it into the v0.3.0 file, delete all the imported objects and the winlogbeat index patterns and index, then reimport the file, the reports now work and I don't get the errors (a rough scripted version of this merge is sketched after this comment). The below index entries are missing:

winlog.event_data.TargetFilename.keyword
winlog.event_data.QueryName.keyword
winlog.event_data.Image.keyword
winlog.event_data.Status.keyword
winlog.event_data.DestinationIpGeo.country_iso_code.keyword
winlog.event_data.User.keyword
winlog.event_data.CommandLine.keyword
winlog.event_data.ParentImage.keyword
winlog.event_data.ImageLoaded.keyword

and adding the below lines from v0.2.0 fixes it:

{\"name\":\"winlog.event_data.TargetFilename.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.TargetFilename\",\"subType\":\"multi\"} {\"name\":\"winlog.event_data.QueryName.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.QueryName\",\"subType\":\"multi\"} {\"name\":\"winlog.event_data.Image.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.Image\",\"subType\":\"multi\"} {\"name\":\"winlog.event_data.Status.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.Status\",\"subType\":\"multi\"} {\"name\":\"winlog.event_data.DestinationIpGeo.country_iso_code.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.DestinationIpGeo.country_iso_code\",\"subType\":\"multi\"} {\"name\":\"winlog.event_data.User.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.User\",\"subType\":\"multi\"}, {\"name\":\"winlog.event_data.CommandLine.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.CommandLine\",\"subType\":\"multi\"} {\"name\":\"winlog.event_data.ParentImage.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.ParentImage\",\"subType\":\"multi\"} {\"name\":\"winlog.event_data.ImageLoaded.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"parent\":\"winlog.event_data.ImageLoaded\",\"subType\":\"multi\"}

I still have one error, in "New - User Security -> Network connections by country", about the missing index listed below, and this one is not in the v0.2.0 file either:

destination.ip_geo.country_name.keyword

Any ideas on this missing index?

Thanks Michael
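(For reference, a rough scripted version of the manual merge Michael describes above, using jq; it assumes the winlogbeat-* index-pattern object sits on line 1 of the v0.3.0 export and line 3 of the v0.2.0 export as noted in the thread, and that the field list lives in attributes.fields as a JSON-encoded string. This is only a sketch, not an official LME fix:)

old=$(sed -n '3p' 'dashboards v0.2.0.ndjson')   # index-pattern line in the v0.2.0 export
new=$(sed -n '1p' 'dashboards v0.3.0.ndjson')   # index-pattern line in the v0.3.0 export

# attributes.fields is a JSON-encoded string: decode both copies, keep the v0.3.0
# entries, append any field definitions that only exist in v0.2.0, then re-encode
merged=$(jq -cn --argjson o "$old" --argjson n "$new" '
  ($o.attributes.fields | fromjson) as $of
  | ($n.attributes.fields | fromjson) as $nf
  | ($nf + [$of[] | select(.name as $x | ($nf | map(.name) | index($x)) | not)]) as $mf
  | $n | .attributes.fields = ($mf | tojson)')

# write the patched index-pattern line back into a copy of the v0.3.0 file
{ echo "$merged"; tail -n +2 'dashboards v0.3.0.ndjson'; } > 'dashboards v0.3.0.merged.ndjson'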

ipswichschool commented 3 years ago

Having the same issues here with my install, so it's definitely a global problem.

splurggy commented 3 years ago

Yes, having the same issues here with version 3 of the dashboard file - opening some of the dashboards shows missing indices - so, fresh install of LME - there is a mixture of indices that are missing - is anybody looking into this?

a-d-a-m-b commented 3 years ago

Hello. We are aware of the issue and the root cause, and we have a mechanism to fix it - we are looking at the easiest way of providing that solution to everyone.

ipswichschool commented 3 years ago

Excellent.

splurggy commented 3 years ago

Has there been any movement on this? The issue seems to be a few months old. On most of the dashboards we have a lot of missing data.

a-d-a-m-b commented 3 years ago

Hi Splurggy - we're just planning the release of the fix at the moment. There are three actions required, but they need to be coordinated so that the fix 1) doesn't break more things, 2) is suitable for brand new installs, and 3) provides a clear mechanism for fixing existing installs (such as your case).

Unfortunately we can't automate all of the fixes, as that would take even more time, so some actions will need to be done manually. That means documentation needs to be written, and we need to ensure that it is accessible to all.

Hoping to have some news/fix next week.

splurggy commented 3 years ago

Any update? It's been 20 days since you responded. LME is being pushed to be used, but in this state it's showing a fair amount of errors, making it unreliable. It would be great if you could update us.

a-d-a-m-b commented 3 years ago

Hi @splurggy - we have a bunch of fixes planned for a new release that'll be out soon, but I'll see if we can get the indexing issue shared under its own branch sooner.

Elastic released v7.11.0 yesterday, so we would like to make sure that the new version also keeps pace with the latest releases. This has delayed the release due to testing etc., but I'm sure you'll agree that it's for the best.

I'll comment here and tag you when we have the fix for the index (either in its own branch or when master is updated).

splurggy commented 3 years ago

Any update on the new version ???

a-d-a-m-b commented 3 years ago

Hi @splurggy, @ipswichschool @MichaelGibsonAltrad (and everyone else who might be reading)

A new branch was pushed to the repo last night (https://github.com/ukncsc/lme/tree/0.4-pre-release). It is a pre-release of the next version which includes a fix for this issue.

Details of how to upgrade are in the https://github.com/ukncsc/lme/blob/0.4-pre-release/docs/upgrading.md document, which provides some background to the issue, and how to resolve.

If you are willing to test and report back if things are resolved, that would be helpful. Whilst the code changes have passed our testing, we would still recommend making a snapshot of your logs if you are relying on them.
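(If you do want a snapshot first, the Elasticsearch snapshot API is a single call once a snapshot repository has been registered; the repository name, host and credentials below are placeholders - the backups documentation linked later in this thread covers the LME-specific steps:)

curl -sk -u elastic:<password> -X PUT \
  'https://localhost:9200/_snapshot/<repository-name>/pre-0.4-upgrade?wait_for_completion=true'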

A full changelog includes:

MichaelGibsonAltrad commented 3 years ago

I'm more than happy to give it a try, if you can advise how I can get the new files without wiping out my current config. I'm not an expert on Linux or Git.

Thanks Michael

MichaelGibsonAltrad commented 3 years ago

I don't know if I'm doing this correctly, but I renamed the existing lme folder to lme.old then ran the below to clone the 0.4 branch to the lme folder.

sudo git clone --branch "0.4-pre-release" https://github.com/ukncsc/lme.git /opt/lme/

When I go into the lme folder and run ./deploy.sh upgrade it asks for the elastic password then errors:

Enter the password for the existing elastic user: [!] The password you have entered was invalid or the elastic service did not respond, please try again.

I know this password is correct, as I can log into elastic using elastic and the password I enter into the deploy script.

adam-ncc commented 3 years ago

Ah, apologies Michael - we could have made the process for migrating to the new branch a bit clearer from a git perspective. We will update the documentation shortly with this change.

In the meantime, would you be able to clarify whether the existing LME instance you're trying to upgrade is installed and still working on the current version? (Running sudo docker stack ps lme should confirm this for you.) If so, you should be able to delete the new /opt/lme folder you made as part of your git clone and rename the lme.old folder back to its original location.

Once the original files have been restored you should be able to update the folder contents and move into the correct branch with the following commands:

sudo git -C /opt/lme/ pull
cd /opt/lme
sudo git checkout 0.4-pre-release

This will update the files in /opt/lme to the correct version. From there you should be able to call the upgrade script as you tried above and it should all work for you, keeping in mind that you'll still need to follow the rest of the upgrade instructions (including re-indexing) documented in the upgrade documentation (https://github.com/ukncsc/lme/blob/0.4-pre-release/docs/upgrading.md) if you wish to keep your existing data.
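(For illustration only - upgrading.md remains the authoritative procedure - an Elasticsearch re-index is a single API call of roughly this shape, with placeholder index names and credentials:)

curl -sk -u elastic:<password> -X POST 'https://localhost:9200/_reindex' \
  -H 'Content-Type: application/json' \
  -d '{"source":{"index":"<old-winlogbeat-index>"},"dest":{"index":"<new-winlogbeat-index>"}}'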

I'll keep an eye on this issue so if you run into any trouble with the above feel free to let us know and we can take another look. As this is technically pre-release it might also be worth taking a backup of your existing data if it's of high value and you're concerned about losing any of it, which can be done following the instructions here: https://github.com/ukncsc/lme/blob/master/docs/backups.md

As a further note, as this is a pre-release branch, at some point in the future we will likely fold these changes into the main release once we're ready for them. When this happens, to continue receiving automatic updates you'll need to change back to the master branch, but we'll make sure it's clear when this is required, and it will simply be a case of repeating the commands I've posted above and changing the checkout command to reference the "master" branch instead. Just to give you a heads up.

MichaelGibsonAltrad commented 3 years ago

Thanks Adam, that allowed me to run through most of the upgrade process, and I checked it was running using your sudo docker stack ps lme command, which showed the new version number. After 30 minutes, though, I still couldn't connect to Kibana ("Kibana server is not ready yet") to verify it was working before I tried to reindex the data, so I restarted the server.

ID                  NAME                  IMAGE                                                  NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
rimsekbiufow        lme_kibana.1          docker.elastic.co/kibana/kibana:7.11.2                 xxxxxxx          Running             Running 25 minutes ago
f78ofjrw2srf        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxxx           Running             Running 25 minutes ago
10d6tx9ds5ti        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxxx           Running             Running 25 minutes ago

When checking the status again, I now have six container entries (I only had the three new ones before the restart):

ID                  NAME                  IMAGE                                                  NODE                DESIRED STATE       CURRENT STATE           ERROR                              PORTS
dh777k2ohlgs        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxx           Running             Running 2 minutes ago
w5xj5da0yg8r        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxx           Running             Running 2 minutes ago
rv5yu9nfz8i5        lme_kibana.1          docker.elastic.co/kibana/kibana:7.11.2                 xxxxxxx           Running             Running 2 minutes ago
rimsekbiufow         \_ lme_kibana.1      docker.elastic.co/kibana/kibana:7.11.2                 xxxxxxx           Shutdown            Failed 2 minutes ago    "No such container: lme_kibana…"
f78ofjrw2srf        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxxx           Shutdown            Failed 2 minutes ago    "No such container: lme_elasti…"
10d6tx9ds5ti        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxxx          Shutdown            Failed 2 minutes ago    "No such container: lme_logsta…"

Any ideas?

Thanks Michael

adam-ncc commented 3 years ago

It doesn't look as though the three containers at the bottom of the list are running; they're just the old containers from before the reboot, so that doesn't seem like anything to worry about (it's a bit hard to see in the formatting, but only three show "Running"). Has rebooting resolved the issue with Kibana beyond this, or are you still stuck at the "Kibana server is not ready yet" message?
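(If the old task history makes the output hard to read, it can be filtered down to just the live tasks; this uses a standard Docker Swarm filter rather than anything LME-specific:)

sudo docker stack ps lme --filter "desired-state=running"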

If you're still stuck, would you be able to take a copy of the Kibana logs and post them here, with any sensitive information redacted, and I'll take a look? You can grab this with the following command: sudo docker service logs lme_kibana --tail 20 --timestamps. Also, any additional information on the exact steps you took during the upgrade and where you got up to would be helpful, if you're happy to share it.

MichaelGibsonAltrad commented 3 years ago

Hi Adam,

Yes, only three are running, as the other three were in the failed state "No such container". I've restarted again, and we now have six "no such container" entries (bottom six in bold), so it appears that a restart creates new Docker entries with new IDs.

ID                  NAME                 
8rdri1lps0zn        lme_elasticsearch.1 
4fb1unckfubo        lme_kibana.1        
jkpo0c8rjvpa        lme_logstash.1      
**dh777k2ohlgs         \_ lme_logstash.1  
w5xj5da0yg8r        lme_elasticsearch.1 
rv5yu9nfz8i5        lme_kibana.1        
rimsekbiufow         \_ lme_kibana.1    
f78ofjrw2srf        lme_elasticsearch.1 
10d6tx9ds5ti        lme_logstash.1**    

We're still stuck with the error "Kibana server is not ready yet", and below are the log entries as requested. The update process was exactly as documented in your guide; we got to just above "So, how do I fix my current data?" and stopped, as I need to log into Kibana to update the SIEM rules, then run the re-index from within it, etc. I had no errors reported during the update/upgrade process.

2021-03-30T11:19:28.545426332Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:28+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:30.059347241Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:30+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:31.564753614Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:31+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:33.071856753Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:33+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:34.577197149Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:34+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:36.084257111Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:36+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:37.590204949Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:37+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:39.100106117Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:39+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:40.605606869Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:40+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:42.112627282Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:42+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:43.619688005Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:43+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:45.136420214Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:45+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:46.642719643Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:46+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:48.151730866Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:48+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:49.657455609Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:49+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:51.163268273Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:51+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:52.667869315Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:52+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:54.175027246Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:54+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:55.680351240Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:55+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T11:19:57.188665534Z lme_kibana.1.4fb1unckfubo@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T11:19:57+00:00","tags":["info","savedobjects-service"],"pid":7,"message":"Detected mapping change in \"properties.originId\""}

Thanks Michael

a-d-a-m-b commented 3 years ago

Hi Michael - could you run sudo docker service logs lme_kibana --tail 200 --timestamps please?

Tailing the last 20 lines isn't much help here as all the lines are the same :)

MichaelGibsonAltrad commented 3 years ago

Hi Adam,

I've run that, and all 200 lines are identical except for the date/time column.

Regards Michael

a-d-a-m-b commented 3 years ago

sudo docker service logs lme_kibana --tail 2000 --timestamps ?

I suspect that the line before the ones we have seen in this GitHub issue is akin to Another Kibana instance appears to be migrating the index. ... Just looking for confirmation of what the root issue seems to be.
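(For anyone following along, one way to check whether a previous migration attempt left partially-created .kibana indices behind is to list them directly from Elasticsearch; the host and credentials are assumptions about a typical single-node install:)

curl -sk -u elastic:<password> 'https://localhost:9200/_cat/indices/.kibana*?v'
curl -sk -u elastic:<password> 'https://localhost:9200/_cat/aliases/.kibana*?v'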

MichaelGibsonAltrad commented 3 years ago

Even with 2000 it's exactly the same. Would a restart and then a log query help, as I would expect that to give us something different as it starts up?

a-d-a-m-b commented 3 years ago

Yes please 👍

Running lme_update.sh and then, immediately after, tailing the logs should provide some more helpful log entries.
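(i.e. something along these lines, run from /opt/lme, chaining the two commands already used earlier in this thread:)

sudo ./lme_update.sh && sudo docker service logs lme_kibana --follow --timestamps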

MichaelGibsonAltrad commented 3 years ago

There's definitely a bug in the script, as after every restart I see another three broken Docker lines when getting the status.

I've restarted, and I can see this gives other log entries:

xxxxxx@xxxxxx:~$ sudo docker stack ps lme
ID                  NAME                  IMAGE                                                  NODE                DESIRED STATE       CURRENT STATE                ERROR                              PORTS
iftgn726fttw        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxx           Running             Running about a minute ago
sufw2gerr37e        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxx           Running             Running about a minute ago
dcvg532ovobm        lme_kibana.1          docker.elastic.co/kibana/kibana:7.11.2                 xxxxxx           Running             Running about a minute ago
8rdri1lps0zn        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxx           Shutdown            Failed about a minute ago    "No such container: lme_elasti…"
4fb1unckfubo        lme_kibana.1          docker.elastic.co/kibana/kibana:7.11.2                 xxxxxx           Shutdown            Failed about a minute ago    "No such container: lme_kibana…"
jkpo0c8rjvpa        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxx           Shutdown            Failed about a minute ago    "task: non-zero exit (137)"
dh777k2ohlgs         \_ lme_logstash.1    docker.elastic.co/logstash/logstash:7.11.2             xxxxxx           Shutdown            Failed 2 hours ago           "No such container: lme_logsta…"
w5xj5da0yg8r        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxx           Shutdown            Failed 2 hours ago           "No such container: lme_elasti…"
rv5yu9nfz8i5        lme_kibana.1          docker.elastic.co/kibana/kibana:7.11.2                 xxxxxx           Shutdown            Failed 2 hours ago           "No such container: lme_kibana…"
rimsekbiufow         \_ lme_kibana.1      docker.elastic.co/kibana/kibana:7.11.2                 xxxxxx           Shutdown            Failed 4 hours ago           "No such container: lme_kibana…"
f78ofjrw2srf        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxx           Shutdown            Failed 4 hours ago           "No such container: lme_elasti…"
10d6tx9ds5ti        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxx           Shutdown            Failed 4 hours ago           "No such container: lme_logsta…"
xxxxxx@xxxxxx:~$ sudo docker service logs lme_kibana --tail 200 --timestamps
2021-03-30T12:45:14.017146445Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:14+00:00","tags":["info","plugins-service"],"pid":8,"message":"Plugin \"visTypeXy\" is disabled."}
2021-03-30T12:45:14.140010939Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:14+00:00","tags":["warning","config","deprecation"],"pid":8,"message":"Setting [elasticsearch.username] to \"kibana\" is deprecated. You should use the \"kibana_system\" user instead."}
2021-03-30T12:45:14.140417248Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:14+00:00","tags":["warning","config","deprecation"],"pid":8,"message":"Config key [monitoring.cluster_alerts.email_notifications.email_address] will be required for email notifications to work in 8.0.\""}
2021-03-30T12:45:14.140730555Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:14+00:00","tags":["warning","config","deprecation"],"pid":8,"message":"Setting [monitoring.username] to \"kibana\" is deprecated. You should use the \"kibana_system\" user instead."}
2021-03-30T12:45:14.582644045Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:14+00:00","tags":["info","plugins-system"],"pid":8,"message":"Setting up [101] plugins: [licensing,globalSearch,globalSearchProviders,taskManager,code,usageCollection,xpackLegacy,telemetryCollectionManager,telemetry,telemetryCollectionXpack,kibanaUsageCollection,securityOss,newsfeed,mapsLegacy,kibanaLegacy,translations,legacyExport,share,esUiShared,expressions,charts,embeddable,uiActionsEnhanced,bfetch,data,home,observability,console,consoleExtensions,apmOss,searchprofiler,painlessLab,grokdebugger,management,indexPatternManagement,advancedSettings,fileUpload,savedObjects,visualizations,visTypeTimelion,timelion,features,licenseManagement,graph,dataEnhanced,watcher,canvas,visTypeVislib,visTypeTimeseries,visTypeTimeseriesEnhanced,visTypeVega,visTypeMetric,visTypeTable,visTypeTagcloud,visTypeMarkdown,tileMap,regionMap,lensOss,inputControlVis,mapsOss,dashboard,dashboardEnhanced,visualize,discover,discoverEnhanced,savedObjectsManagement,spaces,security,savedObjectsTagging,maps,lens,reporting,lists,encryptedSavedObjects,dashboardMode,cloud,upgradeAssistant,snapshotRestore,fleet,indexManagement,rollup,remoteClusters,crossClusterReplication,indexLifecycleManagement,enterpriseSearch,ml,beatsManagement,transform,ingestPipelines,eventLog,actions,alerts,triggersActionsUi,securitySolution,case,stackAlerts,infra,monitoring,logstash,apm,uptime]"}
2021-03-30T12:45:14.595753732Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:14+00:00","tags":["info","plugins","taskManager"],"pid":8,"message":"TaskManager is identified by the Kibana UUID: 6222042e-7c74-4eab-a42e-cd16f56772bf"}
2021-03-30T12:45:15.054754794Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","plugins","security","config"],"pid":8,"message":"Generating a random key for xpack.security.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.security.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
2021-03-30T12:45:15.131946484Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","plugins","reporting","config"],"pid":8,"message":"Generating a random key for xpack.reporting.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.reporting.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
2021-03-30T12:45:15.132933205Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","plugins","reporting","config"],"pid":8,"message":"Found 'server.host: \"0\"' in Kibana configuration. This is incompatible with Reporting. To enable Reporting to work, 'xpack.reporting.kibanaServer.hostname: 0.0.0.0' is being automatically to the configuration. You can change the setting to 'server.host: 0.0.0.0' or add 'xpack.reporting.kibanaServer.hostname: 0.0.0.0' in kibana.yml to prevent this message."}
2021-03-30T12:45:15.133698322Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","plugins","reporting","config"],"pid":8,"message":"Chromium sandbox provides an additional layer of protection, but is not supported for Linux CentOS 8.3.2011\n OS. Automatically setting 'xpack.reporting.capture.browser.chromium.disableSandbox: true'."}
2021-03-30T12:45:15.332566474Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["info","plugins","monitoring","monitoring"],"pid":8,"message":"config sourced from: production cluster"}
2021-03-30T12:45:15.668627229Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations..."}
2021-03-30T12:45:15.705354133Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["error","elasticsearch","monitoring"],"pid":8,"message":"Request error, retrying\nGET https://elasticsearch:9200/_xpack?accept_enterprise=true => connect ECONNREFUSED 10.0.2.5:9200"}
2021-03-30T12:45:15.711841575Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","elasticsearch","monitoring"],"pid":8,"message":"Unable to revive connection: https://elasticsearch:9200/"}
2021-03-30T12:45:15.717190992Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","elasticsearch","monitoring"],"pid":8,"message":"No living connections"}
2021-03-30T12:45:15.717920108Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: No Living connections error"}
2021-03-30T12:45:15.719653346Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","plugins","monitoring","monitoring"],"pid":8,"message":"X-Pack Monitoring Cluster Alerts will not be available: No Living connections"}
2021-03-30T12:45:15.763310802Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["error","savedobjects-service"],"pid":8,"message":"Unable to retrieve version information from Elasticsearch nodes."}
2021-03-30T12:45:45.954796229Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:45+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Starting saved objects migrations"}
2021-03-30T12:45:46.055711715Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:46+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: search_phase_execution_exception"}
2021-03-30T12:45:46.057743257Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:46+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: search_phase_execution_exception"}
2021-03-30T12:45:48.596252299Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T12:45:48.596847211Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Creating index .kibana_task_manager_2."}
2021-03-30T12:45:48.601773212Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: cluster_block_exception"}
2021-03-30T12:45:48.606237204Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["fatal","root"],"pid":8,"message":"ResponseError: cluster_block_exception\n    at onBody (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:333:23)\n    at IncomingMessage.onEnd (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:260:11)\n    at IncomingMessage.emit (events.js:327:22)\n    at endReadableNT (internal/streams/readable.js:1327:12)\n    at processTicksAndRejections (internal/process/task_queues.js:80:21) {\n  meta: {\n    body: { error: [Object], status: 429 },\n    statusCode: 429,\n    headers: {\n      'content-type': 'application/json; charset=UTF-8',\n      'content-length': '427'\n    },\n    meta: {\n      context: null,\n      request: [Object],\n      name: 'elasticsearch-js',\n      connection: [Object],\n      attempts: 0,\n      aborted: false\n    }\n  }\n}"}
2021-03-30T12:45:48.606798516Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","plugins-system"],"pid":8,"message":"Stopping all plugins."}
2021-03-30T12:45:48.611750117Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","plugins","monitoring","monitoring","kibana-monitoring"],"pid":8,"message":"Monitoring stats collection is stopped"}
2021-03-30T12:45:48.622727643Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Creating index .kibana_2."}
2021-03-30T12:45:48.634128278Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: cluster_block_exception"}
2021-03-30T12:46:15.690586720Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:46:15+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: Cluster client cannot be used after it has been closed. error"}
2021-03-30T12:46:45.690576357Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:46:45+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: Cluster client cannot be used after it has been closed. error"}
2021-03-30T12:47:15.692175276Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:47:15+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: Cluster client cannot be used after it has been closed. error"}

MichaelGibsonAltrad commented 3 years ago

Sorry, I pasted before seeing your last post. Here's the info after running the script again:

2021-03-30T12:45:15.332566474Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["info","plugins","monitoring","monitoring"],"pid":8,"message":"config sourced from: production cluster"}
2021-03-30T12:45:15.668627229Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations..."}
2021-03-30T12:45:15.705354133Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["error","elasticsearch","monitoring"],"pid":8,"message":"Request error, retrying\nGET https://elasticsearch:9200/_xpack?accept_enterprise=true => connect ECONNREFUSED 10.0.2.5:9200"}
2021-03-30T12:45:15.711841575Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","elasticsearch","monitoring"],"pid":8,"message":"Unable to revive connection: https://elasticsearch:9200/"}
2021-03-30T12:45:15.717190992Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","elasticsearch","monitoring"],"pid":8,"message":"No living connections"}
2021-03-30T12:45:15.717920108Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: No Living connections error"}
2021-03-30T12:45:15.719653346Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["warning","plugins","monitoring","monitoring"],"pid":8,"message":"X-Pack Monitoring Cluster Alerts will not be available: No Living connections"}
2021-03-30T12:45:15.763310802Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:15+00:00","tags":["error","savedobjects-service"],"pid":8,"message":"Unable to retrieve version information from Elasticsearch nodes."}
2021-03-30T12:45:45.954796229Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:45+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Starting saved objects migrations"}
2021-03-30T12:45:46.055711715Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:46+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: search_phase_execution_exception"}
2021-03-30T12:45:46.057743257Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:46+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: search_phase_execution_exception"}
2021-03-30T12:45:48.596252299Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T12:45:48.596847211Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Creating index .kibana_task_manager_2."}
2021-03-30T12:45:48.601773212Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: cluster_block_exception"}
2021-03-30T12:45:48.606237204Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["fatal","root"],"pid":8,"message":"ResponseError: cluster_block_exception\n    at onBody (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:333:23)\n    at IncomingMessage.onEnd (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:260:11)\n    at IncomingMessage.emit (events.js:327:22)\n    at endReadableNT (internal/streams/readable.js:1327:12)\n    at processTicksAndRejections (internal/process/task_queues.js:80:21) {\n  meta: {\n    body: { error: [Object], status: 429 },\n    statusCode: 429,\n    headers: {\n      'content-type': 'application/json; charset=UTF-8',\n      'content-length': '427'\n    },\n    meta: {\n      context: null,\n      request: [Object],\n      name: 'elasticsearch-js',\n      connection: [Object],\n      attempts: 0,\n      aborted: false\n    }\n  }\n}"}
2021-03-30T12:45:48.606798516Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","plugins-system"],"pid":8,"message":"Stopping all plugins."}
2021-03-30T12:45:48.611750117Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","plugins","monitoring","monitoring","kibana-monitoring"],"pid":8,"message":"Monitoring stats collection is stopped"}
2021-03-30T12:45:48.622727643Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Creating index .kibana_2."}
2021-03-30T12:45:48.634128278Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:45:48+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: cluster_block_exception"}
2021-03-30T12:46:15.690586720Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:46:15+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: Cluster client cannot be used after it has been closed. error"}
2021-03-30T12:46:45.690576357Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:46:45+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: Cluster client cannot be used after it has been closed. error"}
2021-03-30T12:47:15.692175276Z lme_kibana.1.dcvg532ovobm@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:47:15+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: Cluster client cannot be used after it has been closed. error"}
xxxxxx@xxxxxx:~$ cd /opt/lme
xxxxxx@xxxxxx:/opt/lme$ ls
 backups  'Chapter 1 Files'  'Chapter 2 Files'  'Chapter 3 Files'  'Chapter 4 Files'   dashboard_update.sh   docs   files_for_windows.zip   LICENSE   lme.conf   lme_update.sh   README.md
xxxxxx@xxxxxx:/opt/lme$ lme_update.sh
lme_update.sh: command not found
xxxxxx@xxxxxx:/opt/lme$ sudo ./lme_update.sh
[x] Updating from git repo
Already up to date.
[x] Removing existing docker stack
Removing service lme_elasticsearch
Removing service lme_kibana
Removing service lme_logstash
Removing network lme_esnet
logstash.conf
logstash_custom.conf
[x] Attempting to remove legacy LME files (this will cause expected errors if these no longer exist)
Error: No such config: osmap.csv
[x] Sleeping for one minute to allow docker actions to complete...
[x] Updating current configuration files
[x] Custom logstash config exists, Not creating
[x] Recreating docker stack
wnqtnnny271o120uaps0gttg1
u84v7joplisxf1ofovxrgi143
Creating network lme_esnet
Creating service lme_logstash
Creating service lme_elasticsearch
Creating service lme_kibana
xxxxxx@xxxxxx:/opt/lme$ clear
xxxxxx@xxxxxx:/opt/lme$ sudo docker stack ps lme
ID                  NAME                  IMAGE                                                  NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
x3ct4t5zu18s        lme_kibana.1          docker.elastic.co/kibana/kibana:7.11.2                 xxxxxx           Running             Running 9 seconds ago
4y83jww8jg1l        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxx           Running             Running 11 seconds ago
zt8ivzw5rnqj        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxx           Running             Running 12 seconds ago
xxxxxx@xxxxxx:/opt/lme$ sudo docker service logs lme_kibana --tail 200 --timestamps
2021-03-30T12:52:39.210308613Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:39+00:00","tags":["info","plugins-service"],"pid":9,"message":"Plugin \"visTypeXy\" is disabled."}
2021-03-30T12:52:39.401203638Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:39+00:00","tags":["warning","config","deprecation"],"pid":9,"message":"Setting [elasticsearch.username] to \"kibana\" is deprecated. You should use the \"kibana_system\" user instead."}
2021-03-30T12:52:39.401727950Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:39+00:00","tags":["warning","config","deprecation"],"pid":9,"message":"Config key [monitoring.cluster_alerts.email_notifications.email_address] will be required for email notifications to work in 8.0.\""}
2021-03-30T12:52:39.402134059Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:39+00:00","tags":["warning","config","deprecation"],"pid":9,"message":"Setting [monitoring.username] to \"kibana\" is deprecated. You should use the \"kibana_system\" user instead."}
2021-03-30T12:52:39.738671260Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:39+00:00","tags":["info","plugins-system"],"pid":9,"message":"Setting up [101] plugins: [taskManager,licensing,globalSearch,globalSearchProviders,code,usageCollection,xpackLegacy,telemetryCollectionManager,telemetry,telemetryCollectionXpack,kibanaUsageCollection,securityOss,newsfeed,mapsLegacy,kibanaLegacy,translations,share,legacyExport,embeddable,uiActionsEnhanced,expressions,charts,esUiShared,bfetch,data,home,observability,console,consoleExtensions,apmOss,searchprofiler,painlessLab,grokdebugger,management,indexPatternManagement,advancedSettings,fileUpload,savedObjects,visualizations,visTypeVislib,visTypeTagcloud,visTypeVega,visTypeTimeseries,visTypeTimeseriesEnhanced,visTypeTimelion,features,licenseManagement,dataEnhanced,watcher,canvas,visTypeTable,visTypeMetric,visTypeMarkdown,tileMap,regionMap,mapsOss,lensOss,inputControlVis,graph,timelion,dashboard,dashboardEnhanced,visualize,discover,discoverEnhanced,savedObjectsManagement,spaces,security,savedObjectsTagging,maps,lens,reporting,lists,encryptedSavedObjects,dashboardMode,cloud,upgradeAssistant,snapshotRestore,fleet,indexManagement,rollup,remoteClusters,crossClusterReplication,indexLifecycleManagement,enterpriseSearch,ml,beatsManagement,transform,ingestPipelines,eventLog,actions,alerts,triggersActionsUi,stackAlerts,securitySolution,case,infra,monitoring,logstash,apm,uptime]"}
2021-03-30T12:52:39.745611821Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:39+00:00","tags":["info","plugins","taskManager"],"pid":9,"message":"TaskManager is identified by the Kibana UUID: 0ce9a986-0d09-4007-b033-95b67adf7585"}
2021-03-30T12:52:40.123999786Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["warning","plugins","security","config"],"pid":9,"message":"Generating a random key for xpack.security.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.security.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
2021-03-30T12:52:40.194195010Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["warning","plugins","reporting","config"],"pid":9,"message":"Generating a random key for xpack.reporting.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.reporting.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
2021-03-30T12:52:40.194889226Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["warning","plugins","reporting","config"],"pid":9,"message":"Found 'server.host: \"0\"' in Kibana configuration. This is incompatible with Reporting. To enable Reporting to work, 'xpack.reporting.kibanaServer.hostname: 0.0.0.0' is being automatically to the configuration. You can change the setting to 'server.host: 0.0.0.0' or add 'xpack.reporting.kibanaServer.hostname: 0.0.0.0' in kibana.yml to prevent this message."}
2021-03-30T12:52:40.196821471Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["warning","plugins","reporting","config"],"pid":9,"message":"Chromium sandbox provides an additional layer of protection, but is not supported for Linux CentOS 8.3.2011\n OS. Automatically setting 'xpack.reporting.capture.browser.chromium.disableSandbox: true'."}
2021-03-30T12:52:40.419756029Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["info","plugins","monitoring","monitoring"],"pid":9,"message":"config sourced from: production cluster"}
2021-03-30T12:52:40.749073049Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["info","savedobjects-service"],"pid":9,"message":"Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations..."}
2021-03-30T12:52:40.772904200Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["error","elasticsearch","monitoring"],"pid":9,"message":"Request error, retrying\nGET https://elasticsearch:9200/_xpack?accept_enterprise=true => connect ECONNREFUSED 10.0.1.5:9200"}
2021-03-30T12:52:40.779886361Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["warning","elasticsearch","monitoring"],"pid":9,"message":"Unable to revive connection: https://elasticsearch:9200/"}
2021-03-30T12:52:40.780453375Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["warning","elasticsearch","monitoring"],"pid":9,"message":"No living connections"}
2021-03-30T12:52:40.781213892Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["warning","plugins","licensing"],"pid":9,"message":"License information could not be obtained from Elasticsearch due to Error: No Living connections error"}
2021-03-30T12:52:40.783011734Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["warning","plugins","monitoring","monitoring"],"pid":9,"message":"X-Pack Monitoring Cluster Alerts will not be available: No Living connections"}
2021-03-30T12:52:40.810946980Z lme_kibana.1.x3ct4t5zu18s@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T12:52:40+00:00","tags":["error","savedobjects-service"],"pid":9,"message":"Unable to retrieve version information from Elasticsearch nodes."}
xxxxxx@xxxxxx:/opt/lme$

a-d-a-m-b commented 3 years ago

Is it possible to do an update again and then follow the logs (sudo docker service logs lme_kibana --follow --timestamps) until we see the Detected mapping change in \"properties.originId\" log lines again?

Thanks for your patience with this. It feels like we have the beginning and end, but we seem to be missing the trigger of the error.

MichaelGibsonAltrad commented 3 years ago

I'm following the logs and it appears to be stuck on this now, as it just repeats over and over again:

2021-03-30T13:27:51.017029917Z lme_kibana.1.lp0a0szuelqk@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T13:27:51+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: Cluster client cannot be used after it has been closed. error"}

MichaelGibsonAltrad commented 3 years ago

I've restarted the server and this is the full set of tail logs up until the licensing error:

login as: xxxxxx
xxxxxx@xxxxxx's password:
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 4.15.0-140-generic x86_64)


  System information as of Tue Mar 30 14:32:38 UTC 2021

  System load:  2.04                Users logged in:                0
  Usage of /:   89.2% of 195.86GB   IP address for eth0:            10.44.3.238
  Memory usage: 3%                  IP address for docker_gwbridge: 172.18.0.1
  Swap usage:   0%                  IP address for docker0:         172.17.0.1
  Processes:    132

  => / is using 89.2% of 195.86GB


Last login: Tue Mar 30 13:07:10 2021
xxxxxx@xxxxxx:~$ sudo docker stack ps lme
[sudo] password for xxxxxx:
ID                  NAME                  IMAGE                                                  NODE                DESIRED STATE       CURRENT STATE            ERROR                              PORTS
b1jeulypu912        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxx           Running             Running 29 seconds ago
7frt9d8a12xr        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxx           Running             Running 30 seconds ago
6p6thsj6ei47        lme_kibana.1          docker.elastic.co/kibana/kibana:7.11.2                 xxxxxx           Running             Running 29 seconds ago
sxhbgde9l2ua        lme_logstash.1        docker.elastic.co/logstash/logstash:7.11.2             xxxxxx           Shutdown            Failed 43 seconds ago    "No such container: lme_logsta…"
lp0a0szuelqk        lme_kibana.1          docker.elastic.co/kibana/kibana:7.11.2                 xxxxxx           Shutdown            Failed 43 seconds ago    "No such container: lme_kibana…"
yf06y81xjgwl        lme_elasticsearch.1   docker.elastic.co/elasticsearch/elasticsearch:7.11.2   xxxxxx           Shutdown            Failed 43 seconds ago    "No such container: lme_elasti…"
xxxxxx@xxxxxx:~$ sudo docker service logs lme_kibana --follow --timestamps
2021-03-30T14:32:50.428532056Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:50+00:00","tags":["info","plugins-service"],"pid":8,"message":"Plugin \"visTypeXy\" is disabled."}
2021-03-30T14:32:50.570087495Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:50+00:00","tags":["warning","config","deprecation"],"pid":8,"message":"Setting [elasticsearch.username] to \"kibana\" is deprecated. You should use the \"kibana_system\" user instead."}
2021-03-30T14:32:50.571603290Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:50+00:00","tags":["warning","config","deprecation"],"pid":8,"message":"Config key [monitoring.cluster_alerts.email_notifications.email_address] will be required for email notifications to work in 8.0.\""}
2021-03-30T14:32:50.571895689Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:50+00:00","tags":["warning","config","deprecation"],"pid":8,"message":"Setting [monitoring.username] to \"kibana\" is deprecated. You should use the \"kibana_system\" user instead."}
2021-03-30T14:32:51.179838308Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:51+00:00","tags":["info","plugins-system"],"pid":8,"message":"Setting up [101] plugins: [taskManager,licensing,globalSearch,globalSearchProviders,code,usageCollection,xpackLegacy,telemetryCollectionManager,telemetry,telemetryCollectionXpack,kibanaUsageCollection,securityOss,newsfeed,mapsLegacy,kibanaLegacy,translations,share,esUiShared,legacyExport,expressions,charts,embeddable,uiActionsEnhanced,bfetch,data,home,observability,console,consoleExtensions,apmOss,painlessLab,searchprofiler,grokdebugger,management,indexPatternManagement,advancedSettings,fileUpload,savedObjects,visualizations,inputControlVis,visTypeTimeseries,visTypeTimeseriesEnhanced,visTypeVislib,visTypeVega,visTypeMetric,visTypeTagcloud,visTypeTable,visTypeTimelion,features,licenseManagement,dataEnhanced,watcher,canvas,tileMap,visTypeMarkdown,regionMap,mapsOss,lensOss,graph,timelion,dashboard,dashboardEnhanced,visualize,discover,discoverEnhanced,savedObjectsManagement,spaces,security,reporting,savedObjectsTagging,lens,maps,lists,encryptedSavedObjects,dashboardMode,cloud,upgradeAssistant,snapshotRestore,fleet,indexManagement,remoteClusters,crossClusterReplication,rollup,indexLifecycleManagement,enterpriseSearch,ml,beatsManagement,transform,ingestPipelines,eventLog,actions,alerts,triggersActionsUi,stackAlerts,securitySolution,case,infra,monitoring,logstash,apm,uptime]"}
2021-03-30T14:32:51.183722495Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:51+00:00","tags":["info","plugins","taskManager"],"pid":8,"message":"TaskManager is identified by the Kibana UUID: 7c93df11-5e5a-4364-b84a-b862670aecd0"}
2021-03-30T14:32:51.626590754Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:51+00:00","tags":["warning","plugins","security","config"],"pid":8,"message":"Generating a random key for xpack.security.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.security.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
2021-03-30T14:32:51.665487728Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:51+00:00","tags":["warning","plugins","reporting","config"],"pid":8,"message":"Generating a random key for xpack.reporting.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.reporting.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
2021-03-30T14:32:51.666070126Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:51+00:00","tags":["warning","plugins","reporting","config"],"pid":8,"message":"Found 'server.host: \"0\"' in Kibana configuration. This is incompatible with Reporting. To enable Reporting to work, 'xpack.reporting.kibanaServer.hostname: 0.0.0.0' is being automatically to the configuration. You can change the setting to 'server.host: 0.0.0.0' or add 'xpack.reporting.kibanaServer.hostname: 0.0.0.0' in kibana.yml to prevent this message."}
2021-03-30T14:32:51.667162922Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:51+00:00","tags":["warning","plugins","reporting","config"],"pid":8,"message":"Chromium sandbox provides an additional layer of protection, but is not supported for Linux CentOS 8.3.2011\n OS. Automatically setting 'xpack.reporting.capture.browser.chromium.disableSandbox: true'."}
2021-03-30T14:32:51.909273134Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:51+00:00","tags":["info","plugins","monitoring","monitoring"],"pid":8,"message":"config sourced from: production cluster"}
2021-03-30T14:32:52.205759371Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:52+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations..."}
2021-03-30T14:32:52.269282365Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:52+00:00","tags":["error","elasticsearch","monitoring"],"pid":8,"message":"Request error, retrying\nGET https://elasticsearch:9200/_xpack?accept_enterprise=true => connect ECONNREFUSED 10.0.2.2:9200"}
2021-03-30T14:32:52.273492551Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:52+00:00","tags":["warning","elasticsearch","monitoring"],"pid":8,"message":"Unable to revive connection: https://elasticsearch:9200/"}
2021-03-30T14:32:52.273813650Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:52+00:00","tags":["warning","elasticsearch","monitoring"],"pid":8,"message":"No living connections"}
2021-03-30T14:32:52.276410442Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:52+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: No Living connections error"}
2021-03-30T14:32:52.283749718Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:52+00:00","tags":["warning","plugins","monitoring","monitoring"],"pid":8,"message":"X-Pack Monitoring Cluster Alerts will not be available: No Living connections"}
2021-03-30T14:32:52.312741724Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:32:52+00:00","tags":["error","savedobjects-service"],"pid":8,"message":"Unable to retrieve version information from Elasticsearch nodes."}
2021-03-30T14:33:22.486797325Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:22+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Starting saved objects migrations"}
2021-03-30T14:33:22.666115976Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:22+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: search_phase_execution_exception"}
2021-03-30T14:33:22.671589259Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:22+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: search_phase_execution_exception"}
2021-03-30T14:33:25.236357424Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:25+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Detected mapping change in \"properties.originId\""}
2021-03-30T14:33:25.237236322Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:25+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Creating index .kibana_task_manager_2."}
2021-03-30T14:33:25.240538512Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:25+00:00","tags":["info","savedobjects-service"],"pid":8,"message":"Creating index .kibana_2."}
2021-03-30T14:33:25.243822402Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:25+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: cluster_block_exception"}
2021-03-30T14:33:25.248282488Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:25+00:00","tags":["fatal","root"],"pid":8,"message":"ResponseError: cluster_block_exception\n    at onBody (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:333:23)\n    at IncomingMessage.onEnd (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:260:11)\n    at IncomingMessage.emit (events.js:327:22)\n    at endReadableNT (internal/streams/readable.js:1327:12)\n    at processTicksAndRejections (internal/process/task_queues.js:80:21) {\n  meta: {\n    body: { error: [Object], status: 429 },\n    statusCode: 429,\n    headers: {\n      'content-type': 'application/json; charset=UTF-8',\n      'content-length': '427'\n    },\n    meta: {\n      context: null,\n      request: [Object],\n      name: 'elasticsearch-js',\n      connection: [Object],\n      attempts: 0,\n      aborted: false\n    }\n  }\n}"}
2021-03-30T14:33:25.249414085Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:25+00:00","tags":["info","plugins-system"],"pid":8,"message":"Stopping all plugins."}
2021-03-30T14:33:25.253191173Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:25+00:00","tags":["info","plugins","monitoring","monitoring","kibana-monitoring"],"pid":8,"message":"Monitoring stats collection is stopped"}
2021-03-30T14:33:25.261293549Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:25+00:00","tags":["warning","savedobjects-service"],"pid":8,"message":"Unable to connect to Elasticsearch. Error: cluster_block_exception"}
2021-03-30T14:33:52.225815619Z lme_kibana.1.6p6thsj6ei47@xxxxxx    | {"type":"log","@timestamp":"2021-03-30T14:33:52+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to Error: Cluster client cannot be used after it has been closed. error"}
adam-ncc commented 3 years ago

Thanks Michael, it looks as if the upgrade process itself has completed, but Kibana has run into an issue when updating from the previous supported version (7.8.0) to the latest supported version (7.11.2). We're investigating now to see if we can reproduce this on our end and will get back to you ASAP if we need further information or once we can identify a root cause. We appreciate your help in trying to get to the bottom of this so far.

MichaelGibsonAltrad commented 3 years ago

Thanks Adam. Shall I roll this back to before the upgrade, or would it be useful to leave it as-is?

adam-ncc commented 3 years ago

Feel free to roll back for now if you're comfortable doing so and need immediate access to the LME instance; you may have more success retrying the upgrade afterwards. If you'd rather hold off, we will keep investigating. I've unfortunately been unable to reproduce the issue on our end so far, but will keep trying.

If you don't mind, I'd appreciate it if you could check the current status of your Elastic container, as the logs seem to suggest the issue is with requests from Kibana to Elasticsearch. Could you confirm whether you're able to successfully access the Elastic instance by navigating to https://[your-lme-hostname]:9200, and if so, could you check the status of the cluster at https://[your-lme-hostname]:9200/_cluster/health and let us know what it is? Any issues in your Elastic instance might help us pin down the problem.

It might also be helpful to have a sample of the Elastic logs from Docker (sudo docker service logs lme_elasticsearch --tail 100 --timestamps) from after a restart/update, if possible.
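If it's easier from the command line, the same checks can be run with curl from the LME host itself. This is only a sketch: the elastic password placeholder is an assumption, and -k is used on the assumption that the install uses the default self-signed certificates.

# Reachability and version information
curl -k -u elastic:<elastic-password> https://localhost:9200

# Cluster health - the "status" field should ideally be green
curl -k -u elastic:<elastic-password> "https://localhost:9200/_cluster/health?pretty"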

splurggy commented 3 years ago

I had a similar issue and had to totally reinstall everything - I even lost all the Docker containers for some reason. Nothing was done on the Ubuntu box; it just stopped generating logs. Watch out for Java though - if it updates for any reason it will cause issues with Elastic.

adam-ncc commented 3 years ago

Hi @splurggy - sorry to hear it did not work for you either. Would you be able to post some more details of the steps you took and the point where you encountered an issue? If we can identify a common issue or error in the logs between yourself and Michael, it might point us at the underlying problem, which we haven't been able to reproduce on any of our test instances so far.

The Ubuntu box collects and stores the logs, but they should be generated and forwarded from the Windows Event Collector server, so I'm unsure whether you mean you weren't receiving forwarded events into Elastic or whether the WEC server itself was no longer receiving events? Additionally, I'm curious what the Java issue you experienced was, as the ELK stack runs within Docker containers and so should in theory have no dependency on the underlying Java version of the Ubuntu host. Any more information you can provide would be useful!

splurggy commented 3 years ago

Sorry dude, I've completed a new install now and that seems to work really well.

adam-ncc commented 3 years ago

Ah ok, no worries. Pleased to hear it's working for you now, at least. Can I ask whether this is a new install on the 0.4 pre-release branch with the relevant updates, or back onto the existing master branch?

MichaelGibsonAltrad commented 3 years ago

Hi Adam,

We're happy to leave it as-is for the moment. This is new to us, so we were just seeing what was surfaced, and with some of the reports not working it wasn't a complete picture anyway.

I can access the elastic web page and this is the response:

{
  "name" : "es01",
  "cluster_name" : "loggingmadeeasy-es",
  "cluster_uuid" : "1tdZPldmRPiqEt6CLmKKvA",
  "version" : {
    "number" : "7.11.2",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "3e5a16cfec50876d20ea77b075070932c6464c7d",
    "build_date" : "2021-03-06T05:54:38.141101Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

and here is the cluster health

{"cluster_name":"loggingmadeeasy-es","status":"green","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":288,"active_shards":288,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}

Here are the log entries after a restart

lmeroot@xxxxxx:~$ sudo docker service logs lme_elasticsearch --tail 100 --timestamps
2021-04-06T11:21:13.452680173Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.xpack.security.rest.SecurityRestFilter.handleRequest(SecurityRestFilter.java:73) [x-pack-security-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452684973Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:247) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452689674Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:329) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452694274Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:180) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452698974Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.http.AbstractHttpServerTransport.dispatchRequest(AbstractHttpServerTransport.java:325) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452703574Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.http.AbstractHttpServerTransport.handleIncomingRequest(AbstractHttpServerTransport.java:390) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452714175Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.http.AbstractHttpServerTransport.incomingRequest(AbstractHttpServerTransport.java:307) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452719575Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:31) [transport-netty4-client-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452727875Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:17) [transport-netty4-client-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452732676Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452737476Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452742076Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452747376Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452752077Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.channelRead(Netty4HttpPipeliningHandler.java:47) [transport-netty4-client-7.11.2.jar:7.11.2]",
2021-04-06T11:21:13.452756777Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452761377Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452766077Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452770778Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452775378Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452779978Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452784678Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452789278Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452793979Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452798579Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452803379Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452808479Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452816280Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452821180Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452825780Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452830480Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452850581Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452855482Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452860182Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452864782Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452869382Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) [netty-handler-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452873982Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452878683Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452883283Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452888283Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452892983Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452897483Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452902284Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452906984Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518) [netty-handler-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452925585Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267) [netty-handler-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452933785Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314) [netty-handler-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452938785Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452943486Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452948186Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) [netty-codec-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452952986Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452957686Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452962487Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452967287Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452972587Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452977387Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452982187Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452986988Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.452991588Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.453007289Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.453011889Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.453016589Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.453021289Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.453025990Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.49.Final.jar:4.1.49.Final]",
2021-04-06T11:21:13.453030590Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at java.lang.Thread.run(Thread.java:832) [?:?]"] }
2021-04-06T11:21:13.453035390Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | {"type": "server", "timestamp": "2021-04-06T11:21:13,452Z", "level": "INFO", "component": "o.e.x.s.a.AuthenticationService", "cluster.name": "loggingmadeeasy-es", "node.name": "es01", "message": "Authentication of [kibana] was terminated by realm [reserved] - failed to authenticate user [kibana]", "cluster.uuid": "1tdZPldmRPiqEt6CLmKKvA", "node.id": "nH0Py7j4Rt23AYCDJZVsHg"  }
2021-04-06T11:21:15.948891196Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | {"type": "deprecation", "timestamp": "2021-04-06T11:21:15,948Z", "level": "DEPRECATION", "component": "o.e.d.x.s.a.e.ReservedRealm", "cluster.name": "loggingmadeeasy-es", "node.name": "es01", "message": "The user [kibana] is deprecated and will be removed in a future version of Elasticsearch. Please use the [kibana_system] user instead.", "cluster.uuid": "1tdZPldmRPiqEt6CLmKKvA", "node.id": "nH0Py7j4Rt23AYCDJZVsHg"  }
2021-04-06T11:21:16.144914897Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | {"type": "server", "timestamp": "2021-04-06T11:21:16,143Z", "level": "WARN", "component": "r.suppressed", "cluster.name": "loggingmadeeasy-es", "node.name": "es01", "message": "path: /.kibana_task_manager/_count, params: {index=.kibana_task_manager}", "cluster.uuid": "1tdZPldmRPiqEt6CLmKKvA", "node.id": "nH0Py7j4Rt23AYCDJZVsHg" ,
2021-04-06T11:21:16.144963399Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "stacktrace": ["org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed",
2021-04-06T11:21:16.144969699Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:601) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.144974999Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:332) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.144980200Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:636) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.144985400Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:415) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.144990200Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:240) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.144995100Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:308) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.144999800Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145004601Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145015301Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
2021-04-06T11:21:16.145020201Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
2021-04-06T11:21:16.145025002Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at java.lang.Thread.run(Thread.java:832) [?:?]",
2021-04-06T11:21:16.145029602Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "Caused by: org.elasticsearch.action.NoShardAvailableActionException",
2021-04-06T11:21:16.145034902Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:448) ~[elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145039702Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:397) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145054703Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "... 7 more"] }
2021-04-06T11:21:16.145168208Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | {"type": "server", "timestamp": "2021-04-06T11:21:16,143Z", "level": "WARN", "component": "r.suppressed", "cluster.name": "loggingmadeeasy-es", "node.name": "es01", "message": "path: /.kibana/_count, params: {index=.kibana}", "cluster.uuid": "1tdZPldmRPiqEt6CLmKKvA", "node.id": "nH0Py7j4Rt23AYCDJZVsHg" ,
2021-04-06T11:21:16.145177909Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "stacktrace": ["org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed",
2021-04-06T11:21:16.145183009Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:601) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145187809Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:332) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145192610Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:636) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145197510Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:415) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145202310Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:240) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145207010Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:308) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145211810Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145216711Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145221511Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
2021-04-06T11:21:16.145237612Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
2021-04-06T11:21:16.145242412Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at java.lang.Thread.run(Thread.java:832) [?:?]",
2021-04-06T11:21:16.145247112Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "Caused by: org.elasticsearch.action.NoShardAvailableActionException",
2021-04-06T11:21:16.145251712Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:448) ~[elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145257013Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:397) [elasticsearch-7.11.2.jar:7.11.2]",
2021-04-06T11:21:16.145261813Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | "... 7 more"] }
2021-04-06T11:21:43.819731625Z lme_elasticsearch.1.prdeavgqdffd@xxxxxx    | {"type": "server", "timestamp": "2021-04-06T11:21:43,819Z", "level": "WARN", "component": "o.e.c.r.a.DiskThresholdMonitor", "cluster.name": "loggingmadeeasy-es", "node.name": "es01", "message": "high disk watermark [90%] exceeded on [nH0Py7j4Rt23AYCDJZVsHg][es01][/usr/share/elasticsearch/data/nodes/0] free: 13.7gb[7%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete", "cluster.uuid": "1tdZPldmRPiqEt6CLmKKvA", "node.id": "nH0Py7j4Rt23AYCDJZVsHg"  }
lmeroot@xxxxxx:~$
duncan-ncc commented 3 years ago

Hi Michael,

It looks like your server has run out of disk space, based on this log line: {"type": "server", "timestamp": "2021-04-06T11:21:43,819Z", "level": "WARN", "component": "o.e.c.r.a.DiskThresholdMonitor", "cluster.name": "loggingmadeeasy-es", "node.name": "es01", "message": "high disk watermark [90%] exceeded on [nH0Py7j4Rt23AYCDJZVsHg][es01][/usr/share/elasticsearch/data/nodes/0] free: 13.7gb[7%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete", "cluster.uuid": "1tdZPldmRPiqEt6CLmKKvA", "node.id": "nH0Py7j4Rt23AYCDJZVsHg"

It might be worthwhile increasing the disk size and seeing if you are still experiencing issues, or deleting old data if you no longer require it (there are example commands below).

Thanks, Duncan
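For reference, the largest indices can be identified, and old ones removed, from the command line. A sketch only: the password placeholder and the daily index name are illustrative, -k assumes the default self-signed certificates, and deleting an index is permanent.

# List winlogbeat indices sorted by size on disk, largest first
curl -k -u elastic:<elastic-password> "https://localhost:9200/_cat/indices/winlogbeat-*?v&s=store.size:desc&h=index,store.size,docs.count"

# Delete a daily index that is no longer needed (illustrative index name)
curl -k -u elastic:<elastic-password> -X DELETE "https://localhost:9200/winlogbeat-2021.01.01"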

MichaelGibsonAltrad commented 3 years ago

Hi Duncan,

This is what df -h returns and there appears to be plenty of free space, so what needs increasing and by how much?

Filesystem                         Size  Used Avail Use% Mounted on
udev                                16G     0   16G   0% /dev
tmpfs                              3.2G  772K  3.2G   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv  196G  173G   14G  93% /
tmpfs                               16G     0   16G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
tmpfs                               16G     0   16G   0% /sys/fs/cgroup
/dev/sda2                          976M  146M  763M  17% /boot
/dev/sda1                          511M  6.1M  505M   2% /boot/efi
overlay                            196G  173G   14G  93% /var/lib/docker/overlay2/8d6e8c9f0d47ccc36b73e3aa3268496a2013110dcae02079b8d05efde9bb9eb2/merged
overlay                            196G  173G   14G  93% /var/lib/docker/overlay2/0e9e4b82e5528b5cf0eacc3d4ab82f84bcccdcda17bf60072baeb14ffc422e53/merged
overlay                            196G  173G   14G  93% /var/lib/docker/overlay2/94145aa48ae644734e0879c96a792ef0a30b9361db3a90487d7d3e7298d2e8a4/merged
tmpfs                               16G   12K   16G   1% /var/lib/docker/containers/93a0dcede6a0baa2af767a630f5cda18645dc0d54f1a2b255fe50ed8c65da604/mounts/secrets
tmpfs                               16G   20K   16G   1% /var/lib/docker/containers/0e2c378d5287b3ec62d3fea8b059f8b9228f90c1d8dc0c9b4e4ffa49be2afccd/mounts/secrets
tmpfs                               16G   12K   16G   1% /var/lib/docker/containers/5a6eb05ea13305e16e42b6df0c29b2ebdd8a1714dc3061fcbdd583a048a48de3/mounts/secrets
tmpfs                              3.2G     0  3.2G   0% /run/user/1000

Thanks Michael

duncan-ncc commented 3 years ago

Hi Michael,

If you look at the main logical volume mounted at "/" (roughly 200G in size) on its own...

Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/ubuntu--vg-ubuntu--lv  196G  173G   14G  93% /

Your usage is 93%, and by default Elasticsearch will not write data to the node once usage exceeds the 90% high disk watermark (see the note below).

The number of hosts sending data to your instance and the length of time for which you wish to retain the data will determine how much storage is required.

Kind Regards, Duncan
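As a side note, the 90% threshold mentioned above is Elasticsearch's high disk watermark. If temporary headroom is needed while old data is cleared out, the watermark can be raised through the cluster settings API. This is only a sketch (placeholder password, default self-signed certificates assumed), and freeing or adding disk space is the proper fix.

# Temporarily raise the disk watermarks (revert or remove once space is freed)
curl -k -u elastic:<elastic-password> -X PUT "https://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
  }
}'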

MichaelGibsonAltrad commented 3 years ago

Hi Duncan,

I don't know Linux, and resizing a disk looks like a far too complicated process, so I'm going to wipe the whole thing and start again when I have time. I hope this experience has provided you with some valuable info for your upgrade process.

Regards Michael
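(For anyone else in the same position: on a default Ubuntu LVM layout like the one shown in the df output above, the root volume can usually be grown in place once the underlying virtual disk has been enlarged in the hypervisor. The steps below are a sketch only; the partition number is an assumption based on the df output, growpart comes from the cloud-guest-utils package, and taking a snapshot beforehand is sensible.)

# Make the kernel re-read the enlarged disk (sda assumed)
echo 1 | sudo tee /sys/class/block/sda/device/rescan

# Grow the LVM partition, physical volume, logical volume and then the filesystem
sudo growpart /dev/sda 3
sudo pvresize /dev/sda3
sudo lvextend -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv
sudo resize2fs /dev/mapper/ubuntu--vg-ubuntu--lv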

MichaelGibsonAltrad commented 3 years ago

I've rebuilt this from scratch and it was pain-free, so thanks :) A couple of comments:

The command to change into the lme directory before checking out the beta branch is missing in section 3.2; it needs to look like this, otherwise git says it can't find the repository:

cd /opt/lme
sudo git checkout 0.4-pre-release

I had to manually install the initial dashboards using the command below, as documented in Chapter 4, because they weren't available after the install:

cd /opt/lme
sudo ./dashboard_update.sh

adam-ncc commented 3 years ago

That's excellent news, pleased to hear a new install is working for you! If you captured the output from the upload dashboards section of the initial install, we can have a look into why they might not have been available, but I'm glad that the manual fallback option worked well for you.

If you don't mind, we'd appreciate it if you could let us know over the next few weeks how you get on and whether everything is still working well for you, or about any unexpected issues you encounter with the new version. This will help us make sure everything is as reliable as possible as we work towards the main release.

Thanks!

splurggy commented 3 years ago

My new install stopped working after a week.

duncan-ncc commented 3 years ago

@splurggy - Are you able to provide any more information? There isn't much we can really do with the above statement alone.

Thanks,