Graylog2 / docker-compose

A set of Docker Compose files that allow you to quickly spin up a Graylog instance for testing or demo purposes.
Apache License 2.0

DataNode: Failed to load keystore from Mongo collection for node GRAYLOG CA #58

Closed (chunned closed this issue 5 months ago)

chunned commented 5 months ago

I'm trying to run the Open-Core docker-compose.yml but running into an issue that prevents the DataNode from starting correctly. For the record, I have replicated this exact same configuration (i.e. same .env file) on AWS, which works successfully and functions as intended. I'm running into this issue on an Ubuntu VM hosted locally.

The Compose file is unchanged from this repo.

.env file (this is a test deployment so I'm not concerned about leaking these secrets):

GRAYLOG_PASSWORD_SECRET="8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92"
GRAYLOG_ROOT_PASSWORD_SHA2="5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8"    # raw password = password
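
For anyone checking these values: GRAYLOG_ROOT_PASSWORD_SHA2 is expected to be the hex SHA-256 digest of the raw root password. A minimal sketch of how such a digest can be produced and compared follows; the function name `root_password_sha2` is made up for illustration. The digest shown above is the SHA-256 of the string `password`, and the GRAYLOG_PASSWORD_SECRET value happens to be the SHA-256 of `123456`.

```python
import hashlib


def root_password_sha2(raw_password: str) -> str:
    """Return the lowercase hex SHA-256 digest of a raw password."""
    return hashlib.sha256(raw_password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Print the digest so it can be compared with the value in .env.
    print(root_password_sha2("password"))
```

Neither value contains characters that would be a problem in an .env file, so the hashes themselves are unlikely to be the source of the error.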

DataNode logs:

2024-01-24T21:19:23.763Z ERROR [CustomCAX509TrustManager] Could not add Graylog CA to TrustManagers: Failed to load keystore from Mongo collection for node GRAYLOG CA
org.graylog.security.certutil.ca.exceptions.KeyStoreStorageException: Failed to load keystore from Mongo collection for node GRAYLOG CA
        at org.graylog.security.certutil.keystore.storage.KeystoreMongoStorage.readKeyStore(KeystoreMongoStorage.java:72) ~[graylog2-server-5.2.3.jar:?]
        at org.graylog.security.certutil.keystore.storage.SmartKeystoreStorage.readKeyStore(SmartKeystoreStorage.java:57) ~[graylog2-server-5.2.3.jar:?]
        at org.graylog.security.certutil.CaServiceImpl.loadKeyStore(CaServiceImpl.java:152) ~[graylog2-server-5.2.3.jar:?]
        at org.graylog2.security.CustomCAX509TrustManager.refresh(CustomCAX509TrustManager.java:62) [graylog2-server-5.2.3.jar:?]
        at org.graylog2.security.CustomCAX509TrustManager.<init>(CustomCAX509TrustManager.java:49) [graylog2-server-5.2.3.jar:?]
        at org.graylog2.security.CustomCAX509TrustManager$$FastClassByGuice$$da1d85.GUICE$TRAMPOLINE(<generated>) [?:?]
        at org.graylog2.security.CustomCAX509TrustManager$$FastClassByGuice$$da1d85.apply(<generated>) [?:?]
        at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:82) [guice-6.0.0.jar:?]
        at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:114) [guice-6.0.0.jar:?]
        at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91) [guice-6.0.0.jar:?]
        at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:300) [guice-6.0.0.jar:?]
        at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:60) [guice-6.0.0.jar:?]
        at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) [guice-6.0.0.jar:?]
        at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:169) [guice-6.0.0.jar:?]
        at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45) [guice-6.0.0.jar:?]
        at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:213) [guice-6.0.0.jar:?]
        at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:186) [guice-6.0.0.jar:?]
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:113) [guice-6.0.0.jar:?]
        at com.google.inject.Guice.createInjector(Guice.java:87) [guice-6.0.0.jar:?]
        at org.graylog2.shared.bindings.GuiceInjectorHolder.createInjector(GuiceInjectorHolder.java:34) [graylog2-server-5.2.3.jar:?]
        at org.graylog.datanode.bootstrap.CmdLineTool.setupInjector(CmdLineTool.java:441) [graylog-datanode.jar:?]
        at org.graylog.datanode.bootstrap.CmdLineTool.doRun(CmdLineTool.java:288) [graylog-datanode.jar:?]
        at org.graylog.datanode.bootstrap.CmdLineTool.run(CmdLineTool.java:244) [graylog-datanode.jar:?]
        at org.graylog.datanode.bootstrap.Main.main(Main.java:57) [graylog-datanode.jar:?]
Caused by: java.lang.IllegalArgumentException: Illegal base64 character 3f
        at java.util.Base64$Decoder.decode0(Unknown Source) ~[?:?]
        at java.util.Base64$Decoder.decode(Unknown Source) ~[?:?]
        at java.util.Base64$Decoder.decode(Unknown Source) ~[?:?]
        at org.graylog.security.certutil.keystore.storage.KeystoreMongoStorage.readKeyStore(KeystoreMongoStorage.java:67) ~[graylog2-server-5.2.3.jar:?]
        ... 23 more
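
A note on the root cause line: `3f` is the hex code of the ASCII character `?`, which is not part of the standard Base64 alphabet, so the keystore value read from MongoDB was not clean Base64. How a `?` ended up in it is not established in this thread. The sketch below is only an illustration in Python (the DataNode itself is Java); the helper names and sample strings are made up.

```python
import base64
import binascii
import string

# Standard Base64 alphabet plus the padding character.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def illegal_base64_chars(text: str) -> list[str]:
    """Return the hex codes of characters outside the Base64 alphabet,
    in order of first appearance and without duplicates."""
    seen: list[str] = []
    for ch in text:
        code = format(ord(ch), "02x")
        if ch not in BASE64_ALPHABET and code not in seen:
            seen.append(code)
    return seen


def is_valid_base64(text: str) -> bool:
    """Strictly decode the text; any non-alphabet character is rejected."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except binascii.Error:
        return False


if __name__ == "__main__":
    sample = "AQID?AUG"  # made-up payload containing a stray '?'
    print(illegal_base64_chars(sample), is_valid_base64(sample))
```

Running a check like this against the stored keystore string would show whether `?` is the only offending character or one of several.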

I can provide the entire log if requested but I see nothing else relevant to the error.

I also replicated the same config on another local VM, which worked successfully, so now I'm even more confused.

ETA: To make things even weirder, I am now able to run Graylog on the original VM, but not from the original folder.

asherah@asherah:~$ md5sum graylog/docker-compose.yml graylog/.env    
1bc4ec7aeba13d21fbadace39ca3934b  graylog/docker-compose.yml
80cf7913062291b52a209c07d04487fc  graylog/.env
asherah@asherah:~$ md5sum graylog2/docker-compose.yml graylog2/.env
1bc4ec7aeba13d21fbadace39ca3934b  graylog2/docker-compose.yml
80cf7913062291b52a209c07d04487fc  graylog2/.env

To clarify, docker compose up in ~/graylog produces the error, while the same command in ~/graylog2, which uses the exact same compose and .env files, works normally.
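
One thing that does differ between the two folders, even with byte-identical files, is the Compose project name. By default it is derived from the directory name, and named volumes are prefixed with it, so each folder gets its own set of volumes. The sketch below approximates that naming. The normalization is a simplification, and the home directory path and the volume name `mongodb_data` are assumptions for illustration. This does not explain the error on its own, only why the two folders do not share stored state.

```python
import re
from pathlib import PurePosixPath


def default_project_name(project_dir: str) -> str:
    """Approximate Compose's default project name: the lowercased directory
    basename with characters outside [a-z0-9_-] removed."""
    name = PurePosixPath(project_dir).name.lower()
    name = re.sub(r"[^a-z0-9_-]", "", name)
    return name.lstrip("_-")


def volume_name(project_dir: str, volume: str) -> str:
    """Named volumes are created as <project>_<volume>."""
    return f"{default_project_name(project_dir)}_{volume}"


if __name__ == "__main__":
    # Hypothetical paths and volume name, for illustration only.
    for folder in ("/home/asherah/graylog", "/home/asherah/graylog2"):
        print(folder, "->", volume_name(folder, "mongodb_data"))
```

Comparing the output of docker volume ls for both projects would show whether the failing folder still had older volumes attached.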

At this point, I've "fixed" the issue in the sense that I'm able to successfully run Graylog on the VM, but I am still confused about the original error.

janheise commented 5 months ago

@chunned Hi, thanks for raising this issue. I have not come across this problem yet but will make a note. In similar cases I'd usually recommend starting over with a docker compose down -v, because most of the time there was an issue with the config that got fixed in between steps, but the old data still needs to be purged.

It's working for you now, so I hope it's ok if I close this issue?

chunned commented 5 months ago

@janheise Hi, yes, the issue can be closed.

I did try to start over multiple times with docker compose down -v but the issue persisted. I also assumed there was an issue with the config but after comparing with other, working configs, I wasn't able to find any issues, so no idea what the actual problem was lol!