Graylog2 / graylog-docker

Official Graylog Docker image
https://hub.docker.com/r/graylog/graylog/
Apache License 2.0

MongoDb 5 Docker failed to start after Upgrade #241

Closed HungryHowies closed 1 year ago

HungryHowies commented 1 year ago

All, I decided to upgrade Graylog, but it requires MongoDB 5.0+. Right now I'm using MongoDB 4.4.18. I pulled the new MongoDB 5.0 image and adjusted my docker-compose file to use it.

Error received

WARNING: MongoDB 5.0+ requires a CPU with AVX support, and your current system does not appear to have that!
  see https://jira.mongodb.org/browse/SERVER-54407
  see also https://www.mongodb.com/community/forums/t/mongodb-5-0-cpu-intel-g4650-compatibility/116610/2
  see also https://github.com/docker-library/mongo/issues/485#issuecomment-891991814

Not much I can do about the CPU at the moment.

Docker-Compose

version: '3'
services:
  # MongoDB: https://hub.docker.com/_/mongo/
  mongodb:
    # Container time Zone
    #image: mongo:4.4.18
    image: mongo:5
    network_mode: bridge
    # DB in share for persistence
    volumes:
      - mongo_data:/data/db

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2-amd64
    # image: opensearchproject/opensearch:1.3.2
    network_mode: bridge
    # data folder in share for persistence
    volumes:
      - es_data:/usr/share/elasticsearch/data
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g

  graylog:
    #image: graylog/graylog-enterprise:4.3.3-jre11
    image: graylog/graylog-enterprise:4.3.9-jre11
    #image: graylog/graylog-enterprise:5.0.0
    network_mode: bridge
    dns:
      - 192.168.1.15
      - 192.168.1.16
    # journal and config directories in local NFS share for persistence
    volumes:
      - graylog_journal:/usr/share/graylog/data/journal
      # - graylog_bin:/usr/share/graylog/bin

Steps Executed

root@ansible:/usr/local/bin# docker-compose up -d
bin_elasticsearch_1 is up-to-date
Recreating bin_mongodb_1 ... done
bin_graylog_1 is up-to-date
root@ansible:/usr/local/bin#

No Mongo Container

root@ansible:/usr/local/bin# docker ps
CONTAINER ID   IMAGE                                                            COMMAND                  CREATED          STATUS                      PORTS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 NAMES
eb22a62cc20f   graylog/graylog-enterprise:4.3.9-jre11                           "tini -- /docker-ent…"   52 minutes ago   Up 51 minutes (unhealthy)   0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp, 0.0.0.0:5044->5044/tcp, :::5044->5044/tcp, 0.0.0.0:25->25/udp, :::25->25/udp, 0.0.0.0:5055->5055/tcp, :::5055->5055/tcp, 0.0.0.0:8443->8443/tcp, 0.0.0.0:5555->5555/udp, :::8443->8443/tcp, :::5555->5555/udp, 0.0.0.0:8514->8514/tcp, :::8514->8514/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp, 0.0.0.0:9300->9300/tcp, 0.0.0.0:8514->8514/udp, :::9300->9300/tcp, :::8514->8514/udp, 0.0.0.0:9515->9515/udp, :::9515->9515/udp, 0.0.0.0:9515->9515/tcp, :::9515->9515/tcp, 0.0.0.0:13301-13302->13301-13302/tcp, :::13301-13302->13301-13302/tcp, 0.0.0.0:12201->12201/udp, :::12201->12201/udp, 0.0.0.0:51420->51420/tcp, :::51420->51420/tcp, 0.0.0.0:49184->1281/tcp, :::49184->1281/tcp, 0.0.0.0:49183->1525/tcp, :::49183->1525/tcp   bin_graylog_1
faf618e2ca8c   docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2-amd64   "/tini -- /usr/local…"   52 minutes ago   Up 52 minutes               9200/tcp, 9300/tcp                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    bin_elasticsearch_1
root@ansible:/usr/local/bin#

Logs found

root@ansible:/usr/local/bin# docker-compose logs -f | grep mongo | more
graylog_1        | 2022-12-07 20:19:37,899 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:4, serverValue:4}] to mongo:27017
mongodb_1        |
mongodb_1        | WARNING: MongoDB 5.0+ requires a CPU with AVX support, and your current system does not appear to have that!
mongodb_1        |   see https://jira.mongodb.org/browse/SERVER-54407
mongodb_1        |   see also https://www.mongodb.com/community/forums/t/mongodb-5-0-cpu-intel-g4650-compatibility/116610/2
mongodb_1        |   see also https://github.com/docker-library/mongo/issues/485#issuecomment-891991814
mongodb_1        |
graylog_1        | 2022-12-07 20:19:41,517 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:5, serverValue:5}] to mongo:27017
graylog_1        | 2022-12-07 20:19:45,432 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:6, serverValue:6}] to mongo:27017
graylog_1        | 2022-12-07 20:19:45,438 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:7, serverValue:7}] to mongo:27017
graylog_1        | 2022-12-07 20:19:45,445 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:9, serverValue:9}] to mongo:27017
graylog_1        | 2022-12-07 20:19:45,452 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:8, serverValue:8}] to mongo:27017
graylog_1        | 2022-12-07 20:19:45,523 INFO : org.graylog2.periodical.Periodicals - Starting [org.graylog.plugins.auditlog.mongodb.MongoAuditLogPeriodical] periodical in [0s], polling every [3600s].
graylog_1        | 2022-12-07 20:19:45,532 INFO : org.mongodb.driver.connection - Opened connection [connectionId{localValue:10, serverValue:10}] to mongo:27017
graylog_1        | 2022-12-07 20:19:45,846 INFO : org.graylog2.lookup.LookupTableService - Data Adapter watchlist-mongo/627330d4e1c2a911d774918d [@4609788a] STARTING
graylog_1        | 2022-12-07 20:19:45,850 INFO : org.graylog2.lookup.LookupTableService - Data Adapter watchlist-mongo/627330d4e1c2a911d774918d [@4609788a] RUNNING
bin_mongodb_1 exited with code 132
graylog_1        | 2022-12-07 20:19:47,866 INFO : org.graylog2.lookup.LookupTableService - Starting lookup table watchlist/627330d4e1c2a911d7749191 [@1f7c51b6] using cache watchlist-cache/627330d4e1c2a911d774918f [@7c33192c], data adapter watchlist-mongo/627330d4e1c2a911d774918d [@4609788a]
graylog_1        | 2022-12-07 20:31:21,025 INFO : org.mongodb.driver.connection - Closed connection [connectionId{localValue:5, serverValue:5}] to mongo:27017 because there was a socket exception raised on another connection from this pool.
graylog_1        | java.lang.RuntimeException: com.mongodb.MongoNodeIsRecoveringException: Command failed with error 11600 (InterruptedAtShutdown): 'interrupted at shutdown' on server mongo:27017. The full response is {"ok": 0.0, "errmsg": "interrupted at shutdown", "code": 11600, "codeName": "InterruptedAtShutdown"}
graylog_1        | Caused by: com.mongodb.MongoNodeIsRecoveringException: Command failed with error 11600 (InterruptedAtShutdown): 'interrupted at shutdown' on server mongo:27017. The full response is {"ok": 0.0, "errmsg": "interrupted at shutdown", "code": 11600, "codeName": "InterruptedAtShutdown"}
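
The "exited with code 132" line in the log above is consistent with the AVX warning. By the usual shell convention, an exit status above 128 means the process was killed by signal number (status - 128), and 132 - 128 = 4, which is SIGILL (illegal instruction) on Linux. A minimal sketch for decoding such a status is below; `exit_signal` is a hypothetical helper name, not a Docker command.

```shell
# exit_signal STATUS: print the name of the signal implied by an exit status
# above 128, or "none" otherwise. Hypothetical helper for illustration only.
exit_signal() {
  if [ "$1" -gt 128 ]; then
    # kill -l translates a signal number into its name (without the SIG prefix)
    kill -l "$(( $1 - 128 ))"
  else
    echo "none"
  fi
}

# Usage: exit_signal 132
```

An illegal-instruction signal is what one would expect when mongod executes an AVX instruction on a CPU (or virtual CPU) that does not expose AVX.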

Any advice would be appreciated.

Ubuntu 22.04 virtual machine on Windows Hyper-V

Docker Version

root@ansible:/usr/local/bin# docker version
Client:
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.17.3
 Git commit:        20.10.12-0ubuntu4
 Built:             Mon Mar  7 17:10:06 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.3
  Git commit:       20.10.12-0ubuntu4
  Built:            Mon Mar  7 15:57:50 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.5.9-0ubuntu3
  GitCommit:
 runc:
  Version:          1.1.0-0ubuntu1.1
  GitCommit:
 docker-init:
  Version:          0.19.0
  GitCommit:
root@ansible:/usr/local/bin#

bernd commented 1 year ago

@HungryHowies The MongoDB message indicates that your CPU doesn't support the AVX instructions.

You can check in your hypervisor settings whether it's possible to enable these instructions for the virtual machine.

WARNING: MongoDB 5.0+ requires a CPU with AVX support, and your current system does not appear to have that!
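
One way to confirm from inside the guest whether the hypervisor exposes AVX is to look for the "avx" flag in /proc/cpuinfo. The sketch below assumes a Linux guest on x86; `has_cpu_flag` is a hypothetical helper name, and the optional second argument exists only so the check can be pointed at a saved copy of the file.

```shell
# has_cpu_flag FLAG [FILE]: succeed if FLAG appears as a whole word on a
# "flags" line. FILE defaults to /proc/cpuinfo. Hypothetical helper.
has_cpu_flag() {
  grep -E '^flags[[:space:]]*:' "${2:-/proc/cpuinfo}" | grep -qw -- "$1"
}

# Usage:
#   if has_cpu_flag avx; then echo "AVX exposed"; else echo "AVX missing"; fi
```

The whole-word match matters here: without it, a search for "sse" would also match "sse2", and a search for "avx" would match "avx2".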

HungryHowies commented 1 year ago

@bernd Hey, thank you for the reply, much appreciated. Yeah, I did some digging. Thankfully these are only my lab Hyper-V servers. Glad I found this issue before we upgrade.

image

HungryHowies commented 1 year ago

@bernd My fix was unchecking the tick box (image).

But new errors occurred.

Dec  8 17:26:43 ansible dockerd[1254]: time="2022-12-08T17:26:43.701125233-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec  8 17:26:53 ansible dockerd[1254]: time="2022-12-08T17:26:53.821076778-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/mongodb_1"
Dec  8 17:27:04 ansible dockerd[1254]: time="2022-12-08T17:27:04.004540983-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec  8 17:27:14 ansible dockerd[1254]: time="2022-12-08T17:27:14.026510867-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec  8 17:27:24 ansible dockerd[1254]: time="2022-12-08T17:27:24.092746682-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec  8 17:27:34 ansible dockerd[1254]: time="2022-12-08T17:27:34.109101675-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec  8 17:27:44 ansible dockerd[1254]: time="2022-12-08T17:27:44.126879809-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/mongo"
Dec  8 17:27:53 ansible kernel: [ 2473.152844] traps: mongod[18239] trap invalid opcode ip:56311d63fa7a sp:7ffc21b4de10 error:0 in mongod[5631195ba000+51eb000]
Dec  8 17:27:54 ansible dockerd[1254]: time="2022-12-08T17:27:54.182350966-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
Dec  8 17:28:04 ansible dockerd[1254]: time="2022-12-08T17:28:04.426020497-06:00" level=warning msg="Health check for container 41c441ca3b149707002c2d3b5805af3c77c86c48ca43c2584ea1d607019f1c95 error: Cannot link to a non running container: /a90dc5802cd9_bin_mongodb_1 AS /bin_graylog_1/bin_mongodb_1"
root@ansible:/usr/local/bin#
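
The kernel line in the log above ("traps: mongod[18239] trap invalid opcode") is the kernel-side view of the same failure: mongod executed an instruction the CPU did not recognize. The repeated "Cannot link to a non running container" warnings are then a consequence of the MongoDB container not running. A small sketch for pulling such traps out of a kernel log follows; `invalid_opcode_traps` is a hypothetical helper, and the line format is assumed to match the message quoted above.

```shell
# invalid_opcode_traps: read kernel log lines on stdin and print the name of
# each process that hit an "invalid opcode" trap. Hypothetical helper.
invalid_opcode_traps() {
  sed -n 's/.*traps: \([^[]*\)\[[0-9]*\] trap invalid opcode.*/\1/p'
}

# Usage (the log source is an assumption; either should work on most systems):
#   dmesg | invalid_opcode_traps
#   journalctl -k | invalid_opcode_traps
```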

Working on resolving it, but I'm starting to think that an upgrade might not be the solution; perhaps a fresh install is.

HungryHowies commented 1 year ago

@bernd

I thought I had resolved it, but it's a no-go. I'm aware that the correct CPU architecture is needed. TBH, this is the first time I've upgraded software and the service would not start unless I have the correct CPU. We're still looking into it. BTW, I tried CentOS 7 and Ubuntu 18, 20, and 22 with the latest Docker/docker-compose. Same outcome.

https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX

bernd commented 1 year ago

@HungryHowies Does the CPU on your hypervisor support the AVX instructions?

The screenshot doesn't show the exact CPU model.

image

HungryHowies commented 1 year ago

@bernd

Short answer: no, it doesn't. It is unfortunate that the CPUs in our blade servers prevent us from upgrading or using the newer software. True, they may be a little old; also true, Advanced Vector Extensions (AVX) are additions to the x86 instruction set architecture. Put simply, the additional instructions allow compatible processors to perform more demanding functions when used with compatible software. So I am aware, but for now our options are to compile MongoDB ourselves, OR replace all the incompatible CPUs (which would be very expensive and time consuming), OR stay with the old version, OR move on.

@bernd Here are my test Graylog server specs. Notice we do have "sse" but not "avx".

[root@graylog graylog]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
stepping        : 2
microcode       : 0xffffffff
cpu MHz         : 2400.083
cache size      : 12288 KB
physical id     : 0
siblings        : 6
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl xtopology eagerfpu pni cx16 hypervisor lahf_lm ibrs ibpb spec_ctrl arch_capabilities
bogomips        : 4800.16
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual

I appreciate your reply, and thank you.