OpenFunction / functions-framework

functions-framework for multi-runtime, multi-type functions, and multi-language support

Add observability capabilities to functions #9

Closed tpiperatgod closed 2 years ago

tpiperatgod commented 2 years ago

We need to add observability capabilities to functions, which would make it easier to observe and track how functions behave in large-scale scenarios.

Referring to #7, we can use a plugin mechanism in functions-framework to invoke the observability component at specific points in a function's lifecycle.

For example, in functions-framework-go, we can add plugin hooks before and after the function runs, and execute the observability logic inside those hooks.

For reference:

func registerHTTPFunction(path string, fn func(http.ResponseWriter, *http.Request), h *http.ServeMux) error {
    h.HandleFunc(path, func(w http.ResponseWriter, r *http.Request) {
        defer recoverPanicHTTP(w, "Function panic")
        // execute pre-run plugins
        fn(w, r)
        // execute post-run plugins
    })
    return nil
}
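The two comment placeholders above could be filled in roughly as follows. This is only a sketch under assumptions: the `Plugin` interface, `logPlugin`, and `withPlugins` are hypothetical names invented for illustration and are not part of functions-framework-go. Post-run hooks fire in reverse order here so that plugins unwind like nested middleware.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Plugin is a hypothetical interface for this sketch; it is not the
// functions-framework-go API.
type Plugin interface {
	PreRun(r *http.Request)
	PostRun(r *http.Request)
}

// logPlugin records the order in which its hooks fire.
type logPlugin struct {
	name  string
	trace *[]string
}

func (p logPlugin) PreRun(_ *http.Request) {
	*p.trace = append(*p.trace, "pre:"+p.name)
}

func (p logPlugin) PostRun(_ *http.Request) {
	*p.trace = append(*p.trace, "post:"+p.name)
}

// withPlugins wraps fn so that each plugin's PreRun fires before the user
// function and each PostRun fires after it, in reverse registration order.
func withPlugins(fn http.HandlerFunc, plugins ...Plugin) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		for _, p := range plugins {
			p.PreRun(r)
		}
		fn(w, r)
		for i := len(plugins) - 1; i >= 0; i-- {
			plugins[i].PostRun(r)
		}
	}
}

// runDemo sends one request through the wrapped handler and returns the
// recorded hook order.
func runDemo() []string {
	var trace []string
	user := func(w http.ResponseWriter, _ *http.Request) {
		trace = append(trace, "fn")
		fmt.Fprint(w, "ok")
	}
	h := withPlugins(user, logPlugin{"A", &trace}, logPlugin{"B", &trace})
	h(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return trace
}

func main() {
	fmt.Println(runDemo())
}
```

An observability plugin would start a span in `PreRun` and finish it in `PostRun`; the user function itself stays untouched.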

We should also keep the implementation of this scheme as consistent as possible across languages.

We can complete the details of the design of this solution in this document.

benjaminhuo commented 2 years ago

We might need to unify the signatures of all sync and async functions as shown below. This way we can put tracing options such as skywalking, opentelemetry, or none into OpenFunctionContext.

We can then use these tracing parameters in OpenFunctionContext to create a wrapper that adds tracing to the user function.

The tracing options (skywalking, opentelemetry, or none) could perhaps be put into the Function CRD.

What do you think? @tpiperatgod @wanjunlei @FeynmanZhou @arugal :

func func1(ctx *ofctx.OpenFunctionContext, in []byte) ofctx.RetValue

tpiperatgod commented 2 years ago

I think it would make sense to use the OpenFunctionContext to pass tracing options to the functions-framework, which is the job the OpenFunctionContext should take on.

And I agree with putting the options about function tracing in function crd.
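The wrapper idea discussed above might look something like the sketch below. The types `Context`, `TracingConfig`, and `RetValue` are simplified stand-ins assumed for illustration; the real `ofctx` types may differ, and the span start/end calls are placeholders rather than calls into any tracing library.

```go
package main

import "fmt"

// TracingConfig, Context, and RetValue are simplified stand-ins for this
// sketch; the real ofctx types may look different.
type TracingConfig struct {
	Provider string // "skywalking", "opentelemetry", or "" for none
}

type Context struct {
	FuncName string
	Tracing  TracingConfig
	Events   []string // collects span events so the sketch is observable
}

type RetValue struct{ Code int }

// UserFunc mirrors the unified signature proposed above.
type UserFunc func(ctx *Context, in []byte) RetValue

// WithTracing calls fn directly when no provider is configured, and
// otherwise surrounds it with placeholder span start/end events.
func WithTracing(fn UserFunc) UserFunc {
	return func(ctx *Context, in []byte) RetValue {
		if ctx.Tracing.Provider == "" {
			return fn(ctx, in)
		}
		ctx.Events = append(ctx.Events, "start:"+ctx.Tracing.Provider)
		defer func() {
			ctx.Events = append(ctx.Events, "end:"+ctx.Tracing.Provider)
		}()
		return fn(ctx, in)
	}
}

func main() {
	ctx := &Context{FuncName: "func1", Tracing: TracingConfig{Provider: "skywalking"}}
	user := func(_ *Context, in []byte) RetValue { return RetValue{Code: len(in)} }
	ret := WithTracing(user)(ctx, []byte("hi"))
	fmt.Println(ret.Code, ctx.Events)
}
```

Because sync and async functions share one signature, a single wrapper like this can serve both, and the provider choice stays in configuration rather than in user code.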

wu-sheng commented 2 years ago

I want to give a heads-up to the OpenFunction team. I am going to put a core-level proposal to the SkyWalking project, which means we are going to officially move to SkyWalking v9. As for the migration, the OpenFunction project doesn't need to worry about breaking changes, because we will handle that. All agents (go2sky, nodejs, and python) remain the same as before, and the v3 tracing protocol will not change.

The thing I want to mention is that a new concept, layer, is going to be added to the v9 core. I suggest adding layer=faas as a specific tag on the root span of the segment, which would help SkyWalking place the logical service and endpoint on the FAAS page.

More information will be shared this weekend or next week. Once the 8.9.0 release (currently in the release process) is done, the new proposal will be out.
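As a rough illustration of the layer=faas suggestion, the sketch below tags only the root span of a segment. `Span`, `newRootSpan`, and `newChildSpan` are hypothetical names made up for this example; this is not the go2sky API, and a real integration would use the agent's own span and tag calls.

```go
package main

import "fmt"

// Span is a minimal stand-in for a tracer span; it is NOT the go2sky API.
type Span struct {
	Operation string
	Tags      map[string]string
}

// newRootSpan creates the entry span of a segment, marks it with layer=faas,
// and then copies user-supplied tags on top (a user tag named "layer" would
// therefore override the default).
func newRootSpan(operation string, userTags map[string]string) *Span {
	s := &Span{Operation: operation, Tags: map[string]string{"layer": "faas"}}
	for k, v := range userTags {
		s.Tags[k] = v
	}
	return s
}

// newChildSpan creates a non-root span; it carries no layer tag.
func newChildSpan(operation string) *Span {
	return &Span{Operation: operation, Tags: map[string]string{}}
}

func main() {
	root := newRootSpan("function-with-tracing", map[string]string{"tag1": "value1"})
	fmt.Println(root.Tags["layer"], root.Tags["tag1"])
}
```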

wu-sheng commented 2 years ago

Besides the APIs you are discussing, we should also consider:

  1. Shipping logs. Some agents (SkyWalking's, not OpenTelemetry's) have a bundled channel to forward logs directly, rather than collecting them from K8s or files.
  2. Manual instrumentation metrics APIs. These should follow some Prometheus concepts, but stay closer to a metrics API than to a specific implementation.
  3. Tracing. Besides before/after/context as mentioned, we should at least provide manual tagging APIs to add custom information when needed.

benjaminhuo commented 2 years ago

Thanks a lot for these suggestions, @wu-sheng! We'll think about these points.

benjaminhuo commented 2 years ago

Great to have this info! Sure, we'll add the layer=faas tag to the root span.

benjaminhuo commented 2 years ago

I've created an initial proposal for tracing: https://hackmd.io/@UrcJbEg9R_mxQy4aRXO5tA/H1A4vDe9K @wu-sheng @arugal @webup @tpiperatgod @wanjunlei @FeynmanZhou

wu-sheng commented 2 years ago

I think we need to provide tags (as in the proposal) for users, and also consider correlation context. OpenTracing has the same concept, called baggage (OpenTelemetry should have it too).

Also, @arugal, we should consider how to add the timestamp of the previous function's end and propagate it through sw8-x. Then the SkyWalking server could derive the scheduling latency from functionA to functionB.
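The scheduling-latency idea can be sketched as follows: functionA stamps the time it finished into a propagated header, and functionB subtracts that from its own start time. The field layout used here (a mode field, then a millisecond timestamp, separated by `-`) is an assumption for illustration only, as are the helper names; the actual sw8-x definition lives in the SkyWalking propagation protocol.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// encodeExt builds a hypothetical extension header value of the form
// "<mode>-<prevEndMillis>". The layout is an assumption of this sketch.
func encodeExt(mode string, prevEndMillis int64) string {
	return mode + "-" + strconv.FormatInt(prevEndMillis, 10)
}

// schedulingLatencyMillis parses the header and returns how many
// milliseconds passed between the previous function ending and this
// function starting.
func schedulingLatencyMillis(header string, startMillis int64) (int64, error) {
	parts := strings.SplitN(header, "-", 2)
	if len(parts) != 2 || parts[1] == "" {
		return 0, fmt.Errorf("no timestamp in %q", header)
	}
	prevEnd, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return 0, fmt.Errorf("bad timestamp in %q: %w", header, err)
	}
	return startMillis - prevEnd, nil
}

func main() {
	prevEnd := time.Now().UnixMilli()    // functionA finishes
	header := encodeExt("0", prevEnd)    // value propagated to functionB
	latency, err := schedulingLatencyMillis(header, prevEnd+42)
	fmt.Println(header, latency, err)
}
```

The result depends on the clocks of the two function instances being reasonably in sync, which is worth keeping in mind when the functions run on different nodes.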

benjaminhuo commented 2 years ago

There is a customTags section for tags a user wants to add. Should we change it to tags?

      customTags:
      - func: function-with-tracing
      - layer: faas
      - tag1: value1
      - tag2: value2

wu-sheng commented 2 years ago

I think tags work.

benjaminhuo commented 2 years ago

Changed customTags to tags already

benjaminhuo commented 2 years ago

Added baggage like below:

apiVersion: core.openfunction.io/v1alpha2
kind: Function
metadata:
  name: function-with-tracing
spec:
  serving:
    runtime: "OpenFuncAsync"
    tracing:
      # Switch for tracing, default to false
      enabled: true
      # Provider name can be set to "skywalking", "opentelemetry"
      # A valid provider must be set if tracing is enabled.
      provider: 
        name: "skywalking"
      # Custom tags to add to tracing
      tags:
      - func: function-with-tracing
      - layer: faas
      - tag1: value1
      - tag2: value2
      baggage:
      # baggage key is `sw8-correlation` for skywalking and `baggage` for opentelemetry
      # Correlation context for skywalking: https://skywalking.apache.org/docs/main/latest/en/protocols/skywalking-cross-process-correlation-headers-protocol-v1/
      # baggage for opentelemetry: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md
      # W3C Baggage Specification: https://w3c.github.io/baggage/
        key: sw8-correlation # key should be baggage for opentelemetry
        value: "base64(string key):base64(string value),base64(string key2):base64(string value2)"
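The baggage value format shown in the YAML above, `base64(key):base64(value)` pairs joined by commas, can be produced and parsed with a few lines of Go. This is a minimal sketch; the helper names `encodeCorrelation` and `decodeCorrelation` are hypothetical, and standard (padded) base64 is assumed.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// kv is one correlation entry.
type kv struct{ Key, Value string }

// encodeCorrelation renders pairs as "base64(key):base64(value)" items
// joined by commas. The base64 alphabet contains neither ':' nor ',', so
// the separators cannot collide with the encoded data.
func encodeCorrelation(pairs []kv) string {
	items := make([]string, 0, len(pairs))
	for _, p := range pairs {
		k := base64.StdEncoding.EncodeToString([]byte(p.Key))
		v := base64.StdEncoding.EncodeToString([]byte(p.Value))
		items = append(items, k+":"+v)
	}
	return strings.Join(items, ",")
}

// decodeCorrelation reverses encodeCorrelation.
func decodeCorrelation(s string) ([]kv, error) {
	var out []kv
	if s == "" {
		return out, nil
	}
	for _, item := range strings.Split(s, ",") {
		parts := strings.SplitN(item, ":", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("malformed item %q", item)
		}
		k, err := base64.StdEncoding.DecodeString(parts[0])
		if err != nil {
			return nil, err
		}
		v, err := base64.StdEncoding.DecodeString(parts[1])
		if err != nil {
			return nil, err
		}
		out = append(out, kv{string(k), string(v)})
	}
	return out, nil
}

func main() {
	v := encodeCorrelation([]kv{{"tag1", "value1"}, {"tag2", "value2"}})
	fmt.Println(v)
	back, err := decodeCorrelation(v)
	fmt.Println(back, err)
}
```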

wu-sheng commented 2 years ago

Makes sense. Looks good to me.

benjaminhuo commented 2 years ago

Proposal for a plugin mechanism in the functions framework, by @tpiperatgod: https://hackmd.io/O8o01-mjT6uv6L9F25pYsA?view=

wu-sheng commented 2 years ago

@arugal https://github.com/apache/skywalking/pull/8367 SkyWalking v9 core upgrade is almost done.

Do we have any update on the OpenFunction side? As we are going to add tracing (go2sky) to it first, @arugal, you need to follow this v9 update, and we need to make sure OpenFunction's traces can be identified as a faas-layer service and instance.

Also, we need a definition of what the service and instance are in the OpenFunction (or general FAAS) scope. @benjaminhuo @tpiperatgod Any suggestions about this?

benjaminhuo commented 2 years ago

@wu-sheng, thanks very much for the reminder.

To add skywalking tracing we need to refactor functions-framework and add a plugin mechanism; the design is almost finished.

With the previous tracing proposal and this design in place, skywalking tracing is now our top priority.

Once the coding of the plugin mechanism is finished, we'll need @arugal's help to add skywalking tracing code as a plugin.

From my understanding, a function is a service and its replica is an instance. Do we need to add service and instance tags to skywalking tracing?

wu-sheng commented 2 years ago

SkyWalking has service and instance fields directly (no tags needed) to declare that. The reason I am asking is that, usually, a FAAS-level function seems (from my limited FAAS understanding, please CMIIW) closer to the endpoint concept in SkyWalking.

So I just want to recheck whether OpenFunction has a higher-level concept for a group of function replicas (instances) grouped as a unit or something similar. There is no issue with treating a function as a service; it is just that if we do so, SkyWalking's endpoint concept seems not very useful for the OpenFunction case. Or am I missing something in OpenFunction that could be defined as a subset of a function?

benjaminhuo commented 2 years ago

OpenFunction has sync functions that can be accessed through HTTP, so endpoint could be valuable for sync functions. Async functions are triggered by events from middleware such as MQ, so endpoint may not be applicable there. We'll take a look at skywalking's Service/Instance/Endpoint concept to find out how to integrate with it.

wu-sheng commented 2 years ago

Sync and async both work in SkyWalking. A Kafka consumer or an async scheduled task is defined as an endpoint in SkyWalking. My question is more about whether endpoint is still meaningful in OpenFunction, since here the function is the executable unit. Do we have a larger concept that could map to service?

benjaminhuo commented 2 years ago

Got you. We'll add serverless workflow capability, and a workflow is a set of related functions, so maybe a serverless workflow maps to a skywalking service.

wu-sheng commented 2 years ago

Is a workflow always running in one process (OS level)? I ask because service-to-service is better for measuring network performance than endpoint-to-endpoint in today's SkyWalking.

benjaminhuo commented 2 years ago

A workflow itself will run in different processes (functions) actually.

arugal commented 2 years ago

Good to me, I'll start after the framework is complete :)

wu-sheng commented 2 years ago

OK, then we need to think more about how to define a service in OpenFunction. For now, let's treat each function as a service in the PoC version.

benjaminhuo commented 2 years ago

Sure, the refactoring of the functions framework is almost done. Maybe we can start the integration with SkyWalking at the first community meeting of 2022 (Jan 6th). @tpiperatgod @arugal

wu-sheng commented 2 years ago

This seems good. SkyWalking's first release is planned for March. For developers, the new backend core should be available in the first week of January, and the first draft will be ready around Chinese New Year. @Fine0830, do you have a solid timeline for the booster UI?

Fine0830 commented 2 years ago

Uh... I'm not sure about the timeline. March is probably okay for me.

wu-sheng commented 2 years ago

OK, let's see. Anyway, I think OpenFunction will move faster than SkyWalking itself :)

benjaminhuo commented 2 years ago

Now the functions-framework refactoring proposal is ready: https://github.com/OpenFunction/OpenFunction/blob/main/docs/proposals/202112_functions_framework_refactoring.md

Skywalking tracing will be implemented as a plugin of the above functions-framework: https://github.com/OpenFunction/OpenFunction/blob/main/docs/proposals/202112_support_function_tracing.md

tpiperatgod commented 2 years ago

Based on this proposal, how about configuring the plugin section like this?

apiVersion: core.openfunction.io/v1alpha2
kind: Function
metadata:
  name: function-with-tracing
  annotations:
    plugins.pre:
      - pluginA
      - pluginB
      - pluginC
    plugins.post:
      - pluginC
      - pluginA
    plugins.tracing: |
      # Switch for tracing, default to false
      enabled: true
      # Provider name can be set to "skywalking", "opentelemetry"
      # A valid provider must be set if tracing is enabled.
      provider:
        name: "skywalking"
        oapServer: "localhost:xxx"
      # Custom tags to add to tracing
      tags:
      - func: function-with-tracing
      - layer: faas
      - tag1: value1
      - tag2: value2
      baggage:
      # baggage key is `sw8-correlation` for skywalking and `baggage` for opentelemetry
      # Correlation context for skywalking: https://skywalking.apache.org/docs/main/latest/en/protocols/skywalking-cross-process-correlation-headers-protocol-v1/
      # baggage for opentelemetry: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md
      # W3C Baggage Specification: https://w3c.github.io/baggage/
        key: sw8-correlation # key should be baggage for opentelemetry
        value: "base64(string key):base64(string value),base64(string key2):base64(string value2)"

benjaminhuo commented 2 years ago

An annotation is just a map[string]string, so make sure you know how to fit the data you want into this data structure.

wu-sheng commented 2 years ago

As an annotation is just a map[string]string, it seems the value (such as that of key=plugins.tracing) has to be parsed by the tracer implementation. I am not sure what OpenFunction recommends. Do you prefer key/value pairs or a proto-object-oriented approach like Envoy's?

benjaminhuo commented 2 years ago

@wu-sheng Yes, the value of the key has to be parsed by OpenFunction itself before passing it to skywalking. No need for skywalking to parse it in my opinion.

tpiperatgod commented 2 years ago

My mistake. It has been adjusted to the following format:

apiVersion: core.openfunction.io/v1alpha2
kind: Function
metadata:
  name: function-with-tracing
  annotations:
    plugins: |
      # Default order option. During the preHooks phase the plugins will be executed in the following order:
      #   pluginA -> pluginB -> pluginC
      # In the postHooks phase the plugins will be executed in the following order:
      #   pluginC -> pluginB -> pluginA
      order:
      - pluginA
      - pluginB
      - pluginC
      # The "pre" and "post" options will override the order in the "order" option,
      # and you can specify the order of execution of the plugins in the prehooks and posthooks phases separately
      pre:
      - pluginA
      - pluginC
      - pluginB
      post:
      - pluginB
      - pluginA
    plugins.tracing: |
      # Switch for tracing, default to false
      enabled: true
      # Provider name can be set to "skywalking", "opentelemetry"
      # A valid provider must be set if tracing is enabled.
      provider:
        name: "skywalking"
        oapServer: "localhost:xxx"
      # Custom tags to add to tracing
      tags:
      - func: function-with-tracing
      - layer: faas
      - tag1: value1
      - tag2: value2
      baggage:
      # baggage key is `sw8-correlation` for skywalking and `baggage` for opentelemetry
      # Correlation context for skywalking: https://skywalking.apache.org/docs/main/latest/en/protocols/skywalking-cross-process-correlation-headers-protocol-v1/
      # baggage for opentelemetry: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md
      # W3C Baggage Specification: https://w3c.github.io/baggage/
        key: sw8-correlation # key should be baggage for opentelemetry
        value: "base64(string key):base64(string value),base64(string key2):base64(string value2)"
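The ordering rules described in the `plugins` annotation above (pre-hooks follow `order`, post-hooks follow it in reverse, and explicit `pre`/`post` lists override either side) can be sketched like this. `PluginOrder` and `Resolve` are hypothetical names for illustration, not the framework's actual types.

```go
package main

import "fmt"

// PluginOrder mirrors the order/pre/post keys of the "plugins" annotation.
type PluginOrder struct {
	Order []string
	Pre   []string
	Post  []string
}

// reversed returns a reversed copy of in.
func reversed(in []string) []string {
	out := make([]string, len(in))
	for i, v := range in {
		out[len(in)-1-i] = v
	}
	return out
}

// Resolve returns the effective pre-hook and post-hook sequences: Order and
// its reverse by default, replaced by Pre and Post when those are set.
func (p PluginOrder) Resolve() (pre, post []string) {
	pre, post = p.Order, reversed(p.Order)
	if len(p.Pre) > 0 {
		pre = p.Pre
	}
	if len(p.Post) > 0 {
		post = p.Post
	}
	return pre, post
}

func main() {
	def := PluginOrder{Order: []string{"pluginA", "pluginB", "pluginC"}}
	fmt.Println(def.Resolve())

	custom := def
	custom.Pre = []string{"pluginA", "pluginC", "pluginB"}
	custom.Post = []string{"pluginB", "pluginA"}
	fmt.Println(custom.Resolve())
}
```

Note that with an explicit `post` list a plugin can be left out of one phase entirely, as pluginC is in the example configuration.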

wu-sheng commented 2 years ago

So, there will be an object in the OpenFunction codebase that defines the configuration and carries the parsed values. The SkyWalking tracer then accepts it and applies it to the go2sky kernel.
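A minimal sketch of that flow: the framework reads the `plugins.tracing` annotation string, parses it into a typed object, and hands that object to the tracer. To stay within the standard library, JSON stands in here for the YAML payload used in the thread, and tags are modeled as a plain map rather than a list; `TracingConfig` and `parseTracing` are hypothetical names, not OpenFunction's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TracingConfig is a hypothetical typed view of the "plugins.tracing"
// annotation value.
type TracingConfig struct {
	Enabled  bool `json:"enabled"`
	Provider struct {
		Name      string `json:"name"`
		OapServer string `json:"oapServer"`
	} `json:"provider"`
	Tags map[string]string `json:"tags"`
}

// parseTracing extracts and validates the tracing config from annotations.
// A missing annotation means tracing is disabled, matching the documented
// default of enabled: false.
func parseTracing(annotations map[string]string) (*TracingConfig, error) {
	raw, ok := annotations["plugins.tracing"]
	if !ok {
		return &TracingConfig{}, nil
	}
	var cfg TracingConfig
	// JSON is used instead of YAML only to keep this sketch stdlib-only.
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return nil, fmt.Errorf("parse plugins.tracing: %w", err)
	}
	if cfg.Enabled && cfg.Provider.Name == "" {
		return nil, fmt.Errorf("tracing is enabled but no provider is set")
	}
	return &cfg, nil
}

func main() {
	annotations := map[string]string{
		"plugins.tracing": `{"enabled": true, "provider": {"name": "skywalking"}, "tags": {"layer": "faas"}}`,
	}
	cfg, err := parseTracing(annotations)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Enabled, cfg.Provider.Name, cfg.Tags["layer"])
}
```

Keeping the parsing on the OpenFunction side means a tracer plugin only ever sees the typed object and never the raw annotation string.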

benjaminhuo commented 2 years ago

Yes, that's correct!

benjaminhuo commented 2 years ago

@wu-sheng @arugal OpenFunction v0.6.0-rc.0 has been released, and SkyWalking now has a perfect integration with both OpenFunction async and sync functions! Thanks, @arugal, for the huge effort on this integration!

Skywalking tracing can be enabled either as a global option or as a per-function option as described in https://github.com/OpenFunction/OpenFunction/blob/main/docs/proposals/202112_support_function_tracing.md

wu-sheng commented 2 years ago

Fantastic! We are going to prepare the v9 release in the next 2 weeks, and I will ask @arugal to set up the FAAS dashboard for you. This dashboard will be enabled by default (on the top-level menu); I think you will love it.

I will post an update once we have that.

benjaminhuo commented 2 years ago

Looking forward to SkyWalking v9!