datalust / helm.datalust.co

Helm charts hosted on helm.datalust.co
Apache License 2.0

Chart assumes default cluster DNS name for GELF container #16

Closed jmezach closed 2 years ago

jmezach commented 3 years ago

I've been trying to set up Seq in our Kubernetes cluster with GELF input coming from Fluentbit, based on this documentation. I deployed Seq using the Helm chart as described there. While setting up Fluentbit I found that our cluster doesn't use the default DNS name of *.cluster.local but a different name instead, so I had to configure Fluentbit slightly differently. This could be because our cluster was rolled out using Rancher.
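For readers unfamiliar with how the cluster domain ends up in a service address, here is a minimal sketch of the standard Kubernetes service DNS pattern, `<service>.<namespace>.svc.<cluster-domain>`. The service name `seq`, the namespace `logging`, and the domain `example.internal` are hypothetical values chosen for illustration; they are not taken from the chart or from my cluster.

```python
def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified DNS name of a Kubernetes Service.

    The cluster domain defaults to "cluster.local", but a cluster can be
    configured with a different one, which is the situation described above.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Hypothetical names, for illustration only.
default_name = service_fqdn("seq", "logging")
custom_name = service_fqdn("seq", "logging", cluster_domain="example.internal")

print(default_name)
print(custom_name)
```

Any address hard-coded with the default domain will fail to resolve on a cluster configured with a different one.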

After fixing that though, I still didn't see any logs. So I had a look at the logs of the seq-gelf container inside my Seq pod and I saw the following error message:

Failed to send an event batch
System.Net.Http.HttpRequestException: Name or service not known
 ---> System.Net.Sockets.SocketException (0xFFFDFFFF): Name or service not known
   at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
   at SeqCli.Ingestion.LogShipper.SendBatchAsync(SeqConnection connection, String apiKey, IReadOnlyCollection`1 batch, Boolean logSendFailures) in /home/appveyor/projects/seqcli/src/SeqCli/Ingestion/LogShipper.cs:line 155
   at SeqCli.Ingestion.LogShipper.ShipEvents(SeqConnection connection, String apiKey, ILogEventReader reader, InvalidDataHandling invalidDataHandling, SendFailureHandling sendFailureHandling, Func`2 filter) in /home/appveyor/projects/seqcli/src/SeqCli/Ingestion/LogShipper.cs:line 55

After inspecting the pod configuration I noticed that the seq-gelf container had an environment variable SEQ_ADDRESS set to an address that ended with *.cluster.local as well, which caused the above error message.

Unfortunately there isn't a way to override this from the Helm chart, so I had to manually patch the deployment to change the value of that environment variable. Interestingly, I don't think it needs to be an FQDN at all: since the seq-gelf container runs within the same pod as Seq, it can just use localhost. I've made that change manually on my cluster and now I'm seeing logs coming into Seq.
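As a rough illustration of the manual patch, the sketch below rewrites SEQ_ADDRESS on the seq-gelf container of a deployment manifest represented as a Python dict. The manifest is a trimmed, hypothetical stand-in rather than the chart's real output, and the original address and port 5341 are assumptions; only the container name and the variable name come from the pod I inspected.

```python
import copy


def override_seq_address(deployment: dict,
                         container_name: str = "seq-gelf",
                         address: str = "http://localhost:5341") -> dict:
    """Return a copy of the deployment with SEQ_ADDRESS set on one container.

    The default address assumes Seq accepts ingestion on port 5341 inside
    the pod; adjust it to whatever port your Seq container listens on.
    """
    patched = copy.deepcopy(deployment)
    for container in patched["spec"]["template"]["spec"]["containers"]:
        if container["name"] != container_name:
            continue
        env = container.setdefault("env", [])
        for var in env:
            if var["name"] == "SEQ_ADDRESS":
                var["value"] = address
                break
        else:
            # The variable was not present, so add it.
            env.append({"name": "SEQ_ADDRESS", "value": address})
    return patched


# Hypothetical, heavily trimmed manifest for illustration only.
deployment = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "seq"},
        {"name": "seq-gelf", "env": [
            {"name": "SEQ_ADDRESS",
             "value": "http://seq.logging.svc.cluster.local:5341"},
        ]},
    ]}}}
}

patched = override_seq_address(deployment)
gelf_env = patched["spec"]["template"]["spec"]["containers"][1]["env"]
print(gelf_env)
```

The function works on a copy, so the original manifest is left untouched and the two can be compared before anything is applied to the cluster.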

nblumhardt commented 3 years ago

Thanks for the suggestion @jmezach - we'll try to get this into a future release 👍

jmezach commented 2 years ago

@nblumhardt Any idea when this will happen? We're trying to operationalise our Kubernetes cluster with Seq deployed using Helm, but I'm still running into this. Would you accept a PR for this?

KodrAus commented 2 years ago

@jmezach Hey! :wave: We'd gladly accept PRs if you're keen to dig into it. So the change we'd need is either to make SEQ_ADDRESS configurable or to always use localhost? Will localhost always be valid since they're in the same pod? I'm not a heavy Kubernetes user myself, so I think your understanding of this is probably better than mine.

jmezach commented 2 years ago

According to this document, containers within the same Pod should always be able to reach each other using localhost. So yes, as long as both the Seq and the Seq Gelf Input containers are within the same pod, we should be able to just use localhost. I'll send in a PR momentarily.