knative / func

Knative Functions client API and CLI
Apache License 2.0
283 stars 139 forks source link

Do not default health endpoints #1839

Closed matejvasek closed 9 months ago

matejvasek commented 1 year ago

This is opposite of https://github.com/knative/func/issues/1281.

The defaulting means user cannot unset health endpoints in any way.

The endpoints should be set explicitly in templates. If the endpoints are not explicitly set in func.yaml then knative service should not have health endpoint set.

matejvasek commented 1 year ago

/cc @lkingland

matejvasek commented 1 year ago

cc @mvinkler

lance commented 1 year ago

If the endpoints are not explicitly set in func.yaml then knative service should not have health endpoint set.

What's the reasoning here? Are you also saying that the frameworks, e.g. faas-js-runtime disable these endpoints if the value is not found in func.yaml?

matejvasek commented 1 year ago

@lance What if user doesn't have endpoints, for instance when trying to migrate existing project to use func? IMO user should have ability to not have endpoints.

matejvasek commented 1 year ago

Are you also saying that the frameworks, e.g. faas-js-runtime disable these endpoints if the value is not found in func.yaml?

Nope. The application still may have health endpoints, but knative serving would not be checking them if not set in func.yaml.

matejvasek commented 1 year ago

This has been discovered by @mvinkler . He tried to port some existing Quarkus code to use func but it did not work because of missing endpoints.

matejvasek commented 1 year ago

Defaulting: + Three fewer lines in func.yaml. - There is absolutely no way to disable health probes should user want it. No Defaulting: + User can disable health probes should they need it. - There are three more lines in func.yaml.

mvinkler commented 1 year ago

Thanks Matej for filing this issue.

Reasoning is the following: If user decides to use his existing Quarkus app and includes func.yaml in the project (with default values), health checks will be the point of failure for the app deployment. When the ksvc is created on the cluster, functions project "magically" adds the health checks for readiness and liveness and it doesn't care, if there is smallrye extension with proper paths in application.properties added in the project.

Compare with the default behavior of kn service create: it adds the following "minimal" health check:

readinessProbe:
  successThreshold: 1
  tcpSocket:
    port: 0

User has no way of knowing that there is some default value applied on Functions side and he actually cannot in any way use the mentioned minimal health check.

Addition to Matej's summary: Defaulting: − Not consistent with kn service create − User has no way of knowing what happens, where the health checks come from, how to change them. − No health checks description in Functions documentation.

No Defaulting: + The explicitly set health checks are self-explanatory. + No magic creating health checks on application deploy.

mvinkler commented 1 year ago

I hit this using Quarkus, I have no idea if other runtimes also have some extensions similar to Smallrye and the same behavior applies to them.

lance commented 1 year ago

@mvinkler @matejvasek thanks for the explanation - that all makes sense

lkingland commented 1 year ago

Thank you for considering the situation, opening the topic, and providing a thread of pros/cons!

This is both a feature request for the Quarkus Functions Runtime and a request for better func.yaml documentation

[we] tried to port some existing Quarkus code to use func but it did not work because of missing endpoints.

Implementing these endpoints is the responsibility of the Functions middleware (runtime) and an area of active development (see the Scaffolding Architectural Discussion and Go MVC Epic). If the Function author has implemented health endpoints in their Function, those should be used. If they did not implement the endpoints, the middleware will answer those readiness and liveness checks. In addition behavior should be documented, easy to override, and able to be turned off entirely. This is clearly superior to a hard-coded TCP probe in a configuration file on project creation for several reasons, and a point in favor of Functions as a "value add" over standard services.

This issue brings up a few topics worth covering in this week's call: 1) we need better func.yaml documentation and 2) we need to more strictly interpret func.yaml to enable full flexibility and help avoid confusion.

@lance What if user doesn't have endpoints, for instance when trying to migrate existing project to use func?

Then the middleware will implement it, to the best of its ability, entirely transparently to the Function developer.

+ (Defaulting) Three fewer lines in func.yaml.

Perhaps it's worth reiterating the purpose of the in-code defaults:

By hard-coding the values into the func.yaml when a Function is created, we are inhibiting the user's ability to migrate in the future, and our ability to respond quickly to everything from platform improvements to CVEs. Updating func (library or binary) updates defaults, allowing the infrastructure supporting a user's Function to grow, evolve, and patch itself intelligently (see: our burgeoning conversation around how to implement a Function's operator). Updating to the latest func version must treat the user's func.yaml as sacrosanct. There is no way to know if a value written to func.yaml is simply the default written at Function craetion time, or a value intentionally added by the author. Therefore any value there is effectively immutable.

Let's consider the next point as an example:

- (Defaulting) There is absolutely no way to disable health probes should user want it

If we rememeber that func.yaml is, strictly interpreted, overrides to default behavior, then setting the values to empty would disable the endpoints entirely. However this makes little sense in practice, presuming the middleware works correctly.

− Not consistent with kn service create

See above; an intelligently implemented default endpoint is better than a TCP probe, which is likewise better than nothing. This is a value-add of Functions.

− User has no way of knowing what happens, where the health checks come from, how to change them.

This is a documentation issue, which can be partially addressed with a commented block of config in func.yaml showing the syntax and current default values. This is idiomatic of config files and we should probably do the same.

− No health checks description in Functions documentation.

This is likewise a call for docs.

+ The explicitly set health checks [in func.yaml would be] self-explanatory.

This is a documentation issue; not a point either for or against setting defaults in code vs in the config file.

+ No magic creating health checks on application deploy.

This is likewise a documentation issue.

Countless operations happen outside the purview of a service author. What causes justified angst (and levying the pejorative "magic") is when complexity that the author needs to know is hidden. I empathize, and emphatically agree that systems should not include this magic. The root of the problem here is that the Function needed to know. These endpoints should be entirely optional. Ignoring their implementation should not cause the function to fail to deploy. Behavior should be well documented, and it obvious how an author can override them with their implementation.

So I hope we can agree that 1) health endpoints should be optional 2) should have better documentation, including how they can be modified 3) The evaluation of func.yaml should be strict to enable complete control by the Function author, including disabling features such as health checks and 4) only user-provided overrides should be written into func.yaml. This latter point is really quite essential.

Thank you very much for the feedback

mvinkler commented 1 year ago

@lkingland thanks for your comprehensive summary!

This is a simple misunderstanding about the purpose of func.yaml. It should be strictly interpreted as overrides to default middleware behavior. (from one of the previous versions of your comment)

I wasn't aware of this and it makes perfect sense to treat func.yaml this way + make sure there is a proper documentation for it.

So I hope we can agree that 1) health endpoints should be optional 2) should have better documentation, including how they can be modified 3) The evaluation of func.yaml should be strict to enable complete control by the Function author, including disabling features such as health checks and 4) only user-provided overrides should be written into func.yaml. This latter point is really quite essential.

I do agree with all of these points. Yet I am still missing some easy way for the user to know what values are going to be used for their app deployment. Maybe the verbose output for kn func deploy could also contain some summary of func.yaml parsing? I.e. each default value, which has not been overwritten in func.yaml, would be written there. This way the user can easily see which specific health checks were applied for example.

lkingland commented 1 year ago

Thanks @mvinkler. In our most recent meeting we discussed using a template for the func.yaml rather than a "blind" marshalling, which will have all possible settings enumerated (but commented out) with their static defaults shown in the comment. If the user has overridden any of the values, the comment will be omitted in favor of the user-supplied setting.

This method is not without its annoyances (for example, any manual edits to the comments will not be retained), but this should allow one to easily scan through a func.yaml to see what options are available, and uncomment the lines to provide customized overrides.

This will be a good deal of manual work, but its documentation we need to have!

I also agree we should display the effective values somehow. Either via an echo during kn func deploy, a much-expanded kn func status, or both.

Let's leave this issue open until it is superceded by that task/issue directly 👍🏻

acelinkio commented 1 year ago

My thoughts on this issue are that having an opinionated implementation is perfectly fine as long as its documented. Pushing people towards best practices is much better than letting people have absolute freedom. (Rust borrow checker vs C entirely unrestrained)

I am not sure if there is an open standard between the different function providers. OpenFaaS, AWS Lambda, etc.

github-actions[bot] commented 10 months ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.