cloudfoundry-community / firehose-to-syslog

Send firehose events from Cloud Foundry to syslog.
MIT License

Splunk logs are missing in flight. #195

Closed · r2d2c3p0 closed this issue 6 years ago

r2d2c3p0 commented 6 years ago

We are facing an issue where we are losing around 30% of logs in Splunk. cf top gave us ~35K events/second, which we rounded up to 40K, and going by the Pivotal formula below we currently have 20 Doppler, 5 Traffic Controller and 4 Nozzle instances deployed.

1 Doppler can process 2,000 events/second; in my case, (40,000 e/s) / (2,000 e/s) = 20
Doppler instances = 4 x Traffic Controller instances
Traffic Controller instances = Nozzle instances
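
To spell out that arithmetic, here is a rough sizing sketch (the 2,000 events/second per Doppler figure and the 4:1 Doppler to Traffic Controller ratio are just the numbers from the guidance above; the helper itself is purely illustrative and not part of f2s):

```go
package main

import (
	"fmt"
	"math"
)

// Rough sizing sketch based on the guidance quoted above.
// The constants are assumptions taken from that guidance, not values from f2s.
const (
	eventsPerDoppler       = 2000.0 // events/second one Doppler can process
	dopplersPerTrafficCtrl = 4.0    // Doppler instances = 4 x Traffic Controller instances
)

func sizing(eventsPerSecond float64) (dopplers, trafficControllers, nozzles int) {
	dopplers = int(math.Ceil(eventsPerSecond / eventsPerDoppler))
	trafficControllers = int(math.Ceil(float64(dopplers) / dopplersPerTrafficCtrl))
	nozzles = trafficControllers // Traffic Controller instances = Nozzle instances
	return
}

func main() {
	d, tc, n := sizing(40000) // ~35K events/s rounded up to 40K
	fmt.Printf("Dopplers: %d, Traffic Controllers: %d, Nozzles: %d\n", d, tc, n)
	// Prints: Dopplers: 20, Traffic Controllers: 5, Nozzles: 5
}
```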

Still, we are seeing logs getting dropped. Appreciate any input.

shinji62 commented 6 years ago

Hi, first of all, which nozzle are you using? Splunk-to-syslog or Firehose-to-syslog?

Splunk-to-syslog used to be similar to f2s, but they diverged 2 years ago.

About your issue: most of the time this comes from the ingestor, meaning Splunk cannot ingest the logs fast enough, so f2s buffers until the buffer is full and messages start getting dropped.
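
In other words, the failure mode is roughly the drop-on-full pattern sketched below (a hypothetical illustration of a bounded buffer in front of a slow ingestor, not the actual f2s code):

```go
package main

import "fmt"

// Hypothetical sketch of a bounded buffer in front of a slow ingestor:
// once the buffer is full, new events are dropped instead of blocking the firehose.
func main() {
	buffer := make(chan string, 3) // tiny capacity just for illustration
	dropped := 0

	for i := 0; i < 10; i++ {
		event := fmt.Sprintf("event-%d", i)
		select {
		case buffer <- event: // queued for the (slow) Splunk ingestor
		default: // buffer full: the event is lost
			dropped++
		}
	}
	fmt.Printf("buffered=%d dropped=%d\n", len(buffer), dropped)
}
```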

Btw, that should be 5 nozzles since you have 5 TCs, but just to be sure you don't have any perf issue, I would increase the number of nozzles to something like 7-8.

I would advise you to check how much your Splunk endpoint can ingest.

Thanks

r2d2c3p0 commented 6 years ago

Thank you for your response. We are using F2S, and that was a typo; we have 5 nozzles. When you say "check how much your Splunk endpoint can ingest", do you mean F2S or the Splunk product? If it is Splunk, then increasing the Nozzle instances would make it worse, as they would be egressing more logs.

Thanks

shinji62 commented 6 years ago

"If it is Splunk, then increasing the Nozzle instances make it worse as they will be egressing more logs." Not sure to understand here ...

If you are losing logs, it means you need Splunk to ingest more logs.

Buffering logs in F2S is limited.

r2d2c3p0 commented 6 years ago

Thanks Gwenn, you can close this thread.