openfaas / of-watchdog

Reverse proxy for STDIO and HTTP microservices

Requesting alternative to setting maximum buffer size #152

Closed. Kanshiroron closed this issue 1 year ago.

Kanshiroron commented 1 year ago

Expected Behaviour

OpenFaaS should either handle a long log line gracefully, by logging it in chunks equal to the buffer size, or ignore it. Either way, the function should remain responsive to subsequent requests.
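
For illustration, here is a minimal sketch of the first option (this is not of-watchdog's actual code, and the python3 child process is a hypothetical stand-in for a function): reading stdout with a bounded buffer and emitting oversized lines chunk by chunk, so the pipe always keeps draining.

```go
// Sketch only: chunked line logging with a bounded buffer.
package main

import (
	"bufio"
	"io"
	"log"
	"os/exec"
)

func main() {
	// Hypothetical child writing a single line far larger than 64 KB.
	cmd := exec.Command("python3", "-c", `print("x" * 200000)`)
	stdout, _ := cmd.StdoutPipe()
	_ = cmd.Start()

	// 64 KB read buffer, mirroring the default log buffer size.
	reader := bufio.NewReaderSize(stdout, 64*1024)
	for {
		chunk, isPrefix, err := reader.ReadLine()
		if len(chunk) > 0 {
			// isPrefix is true while the line is longer than the buffer;
			// each chunk is logged on its own instead of aborting.
			log.Printf("chunk of %d bytes (partial line: %v)", len(chunk), isPrefix)
		}
		if err != nil {
			if err != io.EOF {
				log.Println("read error:", err)
			}
			break
		}
	}
	_ = cmd.Wait()
}
```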

Current Behaviour

When a function logs a line larger than the log buffer size (64 KB by default), OpenFaaS throws an error and subsequent function invocations hang, causing timeouts on every invocation until the pod is restarted. I believe this is caused by the piped stdout stream filling up: the Python process blocks because it can't write any more data, and the Go process has stopped reading.
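
A minimal sketch of the suspected failure mode (again not of-watchdog's actual code, with a hypothetical python3 child): a scanner with a capped buffer gives up on the oversized line, and since nothing keeps draining the pipe, the writer blocks once the OS pipe buffer is full.

```go
// Sketch only: a bounded scanner stops on an oversized line and never resumes.
package main

import (
	"bufio"
	"fmt"
	"os/exec"
)

func main() {
	// Hypothetical child writing a single line far larger than 64 KB.
	cmd := exec.Command("python3", "-c", `print("x" * 200000)`)
	stdout, _ := cmd.StdoutPipe()
	_ = cmd.Start()

	scanner := bufio.NewScanner(stdout)
	scanner.Buffer(make([]byte, 0, 64*1024), 64*1024) // cap tokens at 64 KB

	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	// Scan fails with bufio.ErrTooLong and all later calls return false, so
	// the rest of the child's output backs up in the pipe and the child
	// blocks on write, which matches the hang described above.
	fmt.Println("scanner stopped:", scanner.Err())
}
```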

Possible Solution

Increasing the log buffer size, as enabled recently, doesn't work for us for several reasons:

This issue was previously raised as https://github.com/openfaas/of-watchdog/issues/100, and a PR that looks like it would fix the problem was opened (https://github.com/openfaas/of-watchdog/pull/107), but it wasn't merged.

alexellis commented 1 year ago

/set title: Requesting alternative to setting maximum buffer size

alexellis commented 1 year ago

Hi there @Kanshiroron

What exactly are you logging here, and why would you never know the maximum size?

The folks who ran into this issue previously have had it fixed by changing the buffer size.

I'm not entirely clear why you're writing such huge amounts of data to the logs, or why setting a higher buffer size didn't work for you.

Alex
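
For context, the buffer size mentioned above is raised per function through an environment variable in the function's stack.yml. The variable name and default shown below are assumptions based on the setting referenced in issue #100; check the of-watchdog README for the exact name supported by your version.

```yaml
# Assumed sketch: verify the exact variable name in the of-watchdog README.
functions:
  my-fn:
    lang: python3-http
    handler: ./my-fn
    image: example/my-fn:latest
    environment:
      log_buffer_size: "262144"   # e.g. 256 KB instead of the 64 KB default
```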

Kanshiroron commented 1 year ago

Hello @alexellis, thanks for the quick answer.

The issue is that the function calls out to external APIs whose responses can vary significantly in size, and those responses are logged for debugging purposes.

To us, there are 2 issues:

Just wondering if there's an issue with the fix proposed by @LucasRoesler? Any reason not to move forward with it, since that would completely obviate the issue and remove the need for a predefined buffer size?

alexellis commented 1 year ago

It sounds like an inappropriate use of logging.

If you have large payloads, you should store them in a database or in an S3 bucket, not misuse logging to dump massive amounts of payload data.

That's not within the spirit of the design of OpenFaaS.

alexellis commented 1 year ago

Here's an article on how to use S3 from OpenFaaS:

https://www.openfaas.com/blog/pdf-generation-at-scale-on-kubernetes/

And how to connect to a database:

https://levelup.gitconnected.com/building-a-todo-api-in-golang-with-kubernetes-1ec593f85029

Alternatively, you can bypass the watchdog and just use a Dockerfile and compatible health check:

https://docs.openfaas.com/reference/workloads/
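
As a rough sketch of that last option, a watchdog-less container is essentially just an HTTP server. The port and health path below are assumptions to verify against the workloads documentation linked above.

```go
// Rough sketch of a container that skips the watchdog entirely.
// Port 8080 and the /_/health path are assumptions; check the workloads docs.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Health endpoint for Kubernetes readiness/liveness probes.
	http.HandleFunc("/_/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// The function handler itself.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "handled without the watchdog")
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```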

Kanshiroron commented 1 year ago

@alexellis Besides the fact that we are not really aligned on what counts as appropriate logging usage, especially for debugging, the issue here is not that the log gets truncated but that the whole pod stops functioning! And the only way to make it work again is to restart it. I do not understand how such behavior does not alarm you for a "production grade" application.