openfaas / of-watchdog

Reverse proxy for STDIO and HTTP microservices

Requesting alternative to setting maximum buffer size #152

Closed. Kanshiroron closed this issue 1 year ago.

Kanshiroron commented 1 year ago

Expected Behaviour

OpenFaaS should either handle a long log line gracefully, by logging it in chunks equal to the buffer size, or ignore it. Either way, the function should remain responsive to subsequent requests.
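
For illustration, here is a minimal sketch of the first option (this is not of-watchdog's actual code, and the python3 child process is a hypothetical stand-in for a function): reading stdout with a bounded buffer and emitting oversized lines chunk by chunk, so the pipe always keeps draining.

```go
// Sketch only: chunked line logging with a bounded buffer.
package main

import (
	"bufio"
	"io"
	"log"
	"os/exec"
)

func main() {
	// Hypothetical child writing a single line far larger than 64 KB.
	cmd := exec.Command("python3", "-c", `print("x" * 200000)`)
	stdout, _ := cmd.StdoutPipe()
	_ = cmd.Start()

	// 64 KB read buffer, mirroring the default log buffer size.
	reader := bufio.NewReaderSize(stdout, 64*1024)
	for {
		chunk, isPrefix, err := reader.ReadLine()
		if len(chunk) > 0 {
			// isPrefix is true while the line is longer than the buffer;
			// each chunk is logged on its own instead of aborting.
			log.Printf("chunk of %d bytes (partial line: %v)", len(chunk), isPrefix)
		}
		if err != nil {
			if err != io.EOF {
				log.Println("read error:", err)
			}
			break
		}
	}
	_ = cmd.Wait()
}
```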

Current Behaviour

When a function logs a line larger than the log buffer size (64 KB by default), OpenFaaS throws an error and subsequent function invocations hang, causing timeouts on every invocation until the pod is restarted. I believe this is caused by the piped stdout stream filling up: the Python process blocks because it can't write any more data, and the Go process has stopped reading.
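
A minimal sketch of the suspected failure mode (again not of-watchdog's actual code, with a hypothetical python3 child): a scanner with a capped buffer gives up on the oversized line, and since nothing keeps draining the pipe, the writer blocks once the OS pipe buffer is full.

```go
// Sketch only: a bounded scanner stops on an oversized line and never resumes.
package main

import (
	"bufio"
	"fmt"
	"os/exec"
)

func main() {
	// Hypothetical child writing a single line far larger than 64 KB.
	cmd := exec.Command("python3", "-c", `print("x" * 200000)`)
	stdout, _ := cmd.StdoutPipe()
	_ = cmd.Start()

	scanner := bufio.NewScanner(stdout)
	scanner.Buffer(make([]byte, 0, 64*1024), 64*1024) // cap tokens at 64 KB

	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	// Scan fails with bufio.ErrTooLong and all later calls return false, so
	// the rest of the child's output backs up in the pipe and the child
	// blocks on write, which matches the hang described above.
	fmt.Println("scanner stopped:", scanner.Err())
}
```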

Possible Solution

Increasing the log buffer size, as enabled recently, doesn't work for us for several reasons:

This issue was previously raised as https://github.com/openfaas/of-watchdog/issues/100, and a PR that looks like it would fix the problem was opened (https://github.com/openfaas/of-watchdog/pull/107), but it wasn't merged.

alexellis commented 1 year ago

/set title: Requesting alternative to setting maximum buffer size

alexellis commented 1 year ago

Hi there @Kanshiroron

What exactly are you logging here, and why would you never know the maximum size?

The folks who ran into this issue previously have had it fixed by changing the buffer size.

I'm not entirely clear why you're writing such huge amounts of data to the logs, or why setting a higher buffer size didn't work for you.

Alex
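
For context, the buffer size mentioned above is raised per function through an environment variable in the function's stack.yml. The variable name and default shown below are assumptions based on the setting referenced in issue #100; check the of-watchdog README for the exact name supported by your version.

```yaml
# Assumed sketch: verify the exact variable name in the of-watchdog README.
functions:
  my-fn:
    lang: python3-http
    handler: ./my-fn
    image: example/my-fn:latest
    environment:
      log_buffer_size: "262144"   # e.g. 256 KB instead of the 64 KB default
```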

Kanshiroron commented 1 year ago

Hello @alexellis, thanks for the quick answer.

The issue is that the function calls out to external APIs whose responses can vary significantly in size, and those responses are logged for debugging purposes.

To us, there are 2 issues:

Just wondering if there's an issue with the fix proposed by @LucasRoesler? Any reason not to move forward with it, since that would completely obviate the issue and remove the need for a predefined buffer size?

alexellis commented 1 year ago

It sounds like an inappropriate use of logging.

If you have large payloads, you should store them in a database or in an S3 bucket, not misuse logging to dump massive amounts of payload data.

That's not within the spirit of the design of OpenFaaS.

alexellis commented 1 year ago

Here's an article on how to use S3 from OpenFaaS:

https://www.openfaas.com/blog/pdf-generation-at-scale-on-kubernetes/

And how to connect to a database:

https://levelup.gitconnected.com/building-a-todo-api-in-golang-with-kubernetes-1ec593f85029

Alternatively, you can bypass the watchdog and just use a Dockerfile and compatible health check:

https://docs.openfaas.com/reference/workloads/
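
As a rough sketch of that last option, a watchdog-less container is essentially just an HTTP server. The port and health path below are assumptions to verify against the workloads documentation linked above.

```go
// Rough sketch of a container that skips the watchdog entirely.
// Port 8080 and the /_/health path are assumptions; check the workloads docs.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Health endpoint for Kubernetes readiness/liveness probes.
	http.HandleFunc("/_/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// The function handler itself.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "handled without the watchdog")
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```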

Kanshiroron commented 1 year ago

@alexellis Besides the fact that we are not really aligned on what counts as appropriate logging usage, especially for debugging, the issue here is not that the log gets truncated but that the whole pod stops functioning! And the only way to make it work again is to restart it. I do not understand how such behavior does not alarm you for a "production grade" application.