openfaas-incubator / vcenter-connector

Extend vCenter with OpenFaaS
MIT License

Switch to latest connector-SDK release #31

Closed embano1 closed 4 years ago

embano1 commented 4 years ago

Description

The connector now leverages the latest connector-sdk features, such as multiple event topic subscriptions (delimited with ",") and printing invocation responses via the controller.Subscribe interface (implemented by events.NewEventReceiver()).

The controller uses a 5 second RebuildInterval (sync function subscriptions) and 3 second UpstreamTimeout for invoking functions to not block too long.
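To illustrate what the two settings bound, here is a minimal standard-library sketch. The `connectorConfig` struct, `invoke`, and `slowFunction` are hypothetical stand-ins written for this example; only the field names `RebuildInterval` and `UpstreamTimeout` come from the description above, and the connector-sdk itself may enforce the timeout differently.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// connectorConfig is a hypothetical stand-in for the controller settings
// named in the description; it is not the connector-sdk type.
type connectorConfig struct {
	RebuildInterval time.Duration // how often function subscriptions are re-synced
	UpstreamTimeout time.Duration // upper bound for a single function invocation
}

// invoke runs call with a deadline of cfg.UpstreamTimeout and returns
// whatever error call reports (including a deadline error).
func invoke(cfg connectorConfig, call func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), cfg.UpstreamTimeout)
	defer cancel()
	return call(ctx)
}

// slowFunction simulates a function that needs d to respond and gives up
// as soon as the context is cancelled or its deadline passes.
func slowFunction(d time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func main() {
	cfg := connectorConfig{RebuildInterval: 5 * time.Second, UpstreamTimeout: 3 * time.Second}
	fmt.Println("rebuild interval:", cfg.RebuildInterval, "upstream timeout:", cfg.UpstreamTimeout)

	// Shortened durations so the demo finishes quickly.
	short := connectorConfig{UpstreamTimeout: 50 * time.Millisecond}
	fmt.Println("slow call:", invoke(short, slowFunction(time.Second)))
	fmt.Println("fast call:", invoke(short, slowFunction(time.Millisecond)))
}
```

A function that runs longer than the timeout makes the call fail on the connector side, which is why the chosen value should be at least as long as the slowest function is expected to take.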

Function comments are line-wrapped. Updates to Gopkg.toml to use the latest releases for imported packages:

Fixes #33

Motivation and Context

How Has This Been Tested?

[1]

version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  of-echo:
    lang: python
    handler: ./echo
    image: embano1/of-echo:latest
    environment:
      write_debug: true
      read_debug: true
    annotations:
      topic: "vm.powered.on,vm.powered.off,vm.stopping"
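
The topic annotation above carries three comma-delimited topics. As a rough sketch of how such an annotation could be turned into a topic-to-function lookup, assuming a hypothetical helper `buildTopicMap` (this is not the connector-sdk implementation, only an illustration of the delimiter handling):

```go
package main

import (
	"fmt"
	"strings"
)

// buildTopicMap maps each topic to the functions subscribed to it.
// annotations maps a function name to its raw "topic" annotation value.
// Hypothetical helper for illustration only.
func buildTopicMap(annotations map[string]string, delimiter string) map[string][]string {
	topics := make(map[string][]string)
	for fn, raw := range annotations {
		for _, t := range strings.Split(raw, delimiter) {
			t = strings.TrimSpace(t)
			if t == "" {
				continue // ignore empty entries such as a trailing delimiter
			}
			topics[t] = append(topics[t], fn)
		}
	}
	return topics
}

func main() {
	m := buildTopicMap(map[string]string{
		"of-echo": "vm.powered.on,vm.powered.off,vm.stopping",
	}, ",")
	for _, topic := range []string{"vm.powered.on", "vm.powered.off", "vm.stopping"} {
		fmt.Println(topic, "->", m[topic])
	}
}
```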

Output of vcenter-connector:

2019/12/02 17:55:22 Syncing topic map
2019/12/02 17:54:15 Message on topic: vm.powered.on
2019/12/02 17:54:15 Invoke function: of-echo.openfaas-fn
[...]
2019/12/02 17:54:15 successfully invoked function of-echo.openfaas-fn for topic vm.powered.on
[...]
2019/12/02 17:55:13 Message on topic: vm.powered.off
2019/12/02 17:55:13 Invoke function: of-echo.openfaas-fn
2019/12/02 17:55:13 successfully invoked function of-echo.openfaas-fn for topic vm.powered.off
[...]
2019/12/02 17:55:13 Message on topic: vm.stopping
2019/12/02 17:55:13 Invoke function: of-echo.openfaas-fn
[...]
2019/12/02 17:55:13 successfully invoked function of-echo.openfaas-fn for topic vm.stopping
2019/12/02 17:55:22 Syncing topic map

Output of of-echo:

2019-12-02T16:54:15Z 2019/12/02 16:54:15 Forking fprocess.
2019-12-02T16:54:15Z 2019/12/02 16:54:15 Query
2019-12-02T16:54:15Z 2019/12/02 16:54:15 Path  /
2019-12-02T16:54:15Z 2019/12/02 16:54:15 Duration: 0.064636 seconds
2019-12-02T16:54:15Z {"topic":"vm.powered.on","category":"info","source":"10.0.0.1","userName":"VSPHERE.LOCAL\\Administrator","createdTime":"2019-12-02T16:54:14.066386Z","objectName":"vm-01","managedObjectReference":{"Type":"VirtualMachine","Value":"vm-56"}}
2019-12-02T16:54:15Z ok
2019-12-02T16:55:13Z 2019/12/02 16:55:13 Forking fprocess.
2019-12-02T16:55:13Z 2019/12/02 16:55:13 Query
2019-12-02T16:55:13Z 2019/12/02 16:55:13 Path  /
2019-12-02T16:55:14Z {"topic":"vm.powered.off","category":"info","source":"10.0.0.1","userName":"VSPHERE.LOCAL\\Administrator","createdTime":"2019-12-02T16:55:12.958539Z","objectName":"vm-01","managedObjectReference":{"Type":"VirtualMachine","Value":"vm-56"}}
2019-12-02T16:55:14Z ok
2019-12-02T16:55:14Z 2019/12/02 16:55:14 Duration: 0.062800 seconds
2019-12-02T16:55:14Z 2019/12/02 16:55:14 Forking fprocess.
2019-12-02T16:55:14Z 2019/12/02 16:55:14 Query
2019-12-02T16:55:14Z 2019/12/02 16:55:14 Path  /
2019-12-02T16:55:14Z 2019/12/02 16:55:14 Duration: 0.057169 seconds
2019-12-02T16:55:14Z {"topic":"vm.stopping","category":"info","source":"10.0.0.1","userName":"VSPHERE.LOCAL\\Administrator","createdTime":"2019-12-02T16:55:12.6641Z","objectName":"vm-01","managedObjectReference":{"Type":"VirtualMachine","Value":"vm-56"}}
2019-12-02T16:55:14Z ok

Types of changes

Checklist:

Signed-off-by: Michael Gasch <mgasch@vmware.com>

alexellis commented 4 years ago

I apologise for my comment about using the PR template. It seems that we forgot to add it to the repo.

Please can you reformat this PR description? The most important part is this question:

How has this been tested?

https://github.com/openfaas/faas/blob/master/.github/PULL_REQUEST_TEMPLATE.md

Thanks for the PR

alexellis commented 4 years ago

and 3 second UpstreamTimeout for invoking functions to not block too long.

This seems a bit too short. The default for functions in the watchdog is 10s, I would suggest maybe having that to match the default?

alexellis commented 4 years ago

What do you think to using the async invocation mode in the connector-sdk, I'm pretty sure that was added at some point?

alexellis commented 4 years ago

How did you decide on 5 second resync period, what if this was made configurable instead?

Add a helm chart maybe? I covered creating these in my latest tutorial/lab - https://github.com/alexellis/helm3-expressjs-tutorial

alexellis commented 4 years ago

https://github.com/openfaas-incubator/openfaas-vcenter-connector/issues/32

alexellis commented 4 years ago

Async -> https://github.com/openfaas-incubator/connector-sdk/blob/752212ce352a4db4432bd24fa1101eed5bfba5a1/types/controller.go#L182

embano1 commented 4 years ago

Thx @alexellis for your quick review! Updated the PR with the template and will use that one going forward!

This seems a bit too short. The default for functions in the watchdog is 10s, I would suggest maybe having that to match the default?

No prob, will reset and update the docs accordingly.

How did you decide on 5 second resync period, what if this was made configurable instead?

1s (default) seemed rather short (was afraid of overwhelming the API if everyone follows this). But I'm totally fine reverting to 1s if this is deemed appropriate. Either way, I'll update the docs accordingly.

What do you think to using the async invocation mode in the connector-sdk

Sounds reasonable. Any drawbacks of this approach vs sync (besides the "immediate" feedback?)

embano1 commented 4 years ago

Created https://github.com/openfaas-incubator/connector-sdk/issues/45 to discuss intervals/timeouts in the SDK.

alexellis commented 4 years ago

1s (default) seemed rather short (was afraid of overwhelming the API if everyone follows this). But I'm totally fine reverting to 1s if this is deemed appropriate. Either way, I'll update the docs accordingly.

Wow was it really 1 second? That is not a good choice in retrospect. So I'm totally fine with bumping it higher, thank you for the suggestion.

Sounds reasonable. Any drawbacks of this approach vs sync (besides the "immediate" feedback?)

In your scenario where you may be running for 10s per invocation the async approach would fire and forget, so keep up with the system emitting events (vcenter), but without it, it could potentially fail if many events arrived at once.

alexellis commented 4 years ago

I'm seeing different values to the one you mentioned?

[Screenshot 2019-12-02 at 16 21 21]

alexellis commented 4 years ago

Seems like we had 15s upstream to match the default watchdog setting + some grace, then topic map rebuilds every 10s?

embano1 commented 4 years ago

Seems like we had 15s upstream to match the default watchdog setting + some grace, then topic map rebuilds every 10s?

Thx, I was checking the connector-sdk for examples and the only one provided was for the tester. Let me revert.

In your scenario where you may be running for 10s per invocation the async approach would fire and forget, so keep up with the system emitting events (vcenter), but without it, it could potentially fail if many events arrived at once.

That was also a concern I had, especially when using interpreted runtimes (PowerShell, PowerCLI). The drawback is that I don't see function responses (just err on invocation or 202 Accepted), which for now seems to be fine.
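
For readers comparing the two modes, the difference on the wire is mainly which gateway route is called. The sketch below assumes the OpenFaaS gateway routes `/function/<name>` and `/async-function/<name>`; `invokeURL` is a hypothetical helper for this example and not part of the connector-sdk.

```go
package main

import (
	"fmt"
	"strings"
)

// invokeURL builds the gateway URL for a function invocation.
// Assumption: the gateway exposes /function/<name> for synchronous calls
// and /async-function/<name> for queued (fire-and-forget) calls.
func invokeURL(gateway, function string, async bool) string {
	route := "function"
	if async {
		route = "async-function"
	}
	return strings.TrimRight(gateway, "/") + "/" + route + "/" + function
}

func main() {
	gateway := "http://127.0.0.1:8080"
	fmt.Println(invokeURL(gateway, "of-echo", false))
	fmt.Println(invokeURL(gateway, "of-echo", true))
}
```

With the async route the gateway only acknowledges that the request was queued, so the connector returns quickly regardless of how long the function runs.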

embano1 commented 4 years ago

Running final test with the recent changes and will update this comment accordingly. Do you want me to squash the commits before merging?

[Update] Tests passed and I updated the first comment accordingly.

alexellis commented 4 years ago

Given that the commits add unwanted behaviour then revert it, I think it'd be better to squash that out of the history completely. :+1:

embano1 commented 4 years ago

I just want to leave a final comment here on async invocation. I think async is the way to go for this connector to not limit throughput. However, I came across this issue https://github.com/openfaas/faas/issues/1298 where I think the author is correct and this behavior could cause issues especially when troubleshooting.

The only information the connector receives upon function invocation is basically the err status of the underlying HTTP call (http.Client.Do()). If the connection to the gateway is intact but the triggered function is not available (anymore [1]), and this change has not yet been synced to the topic map, the response would still be "202", signaling acceptance (which, per REST standard, is not useful at all during troubleshooting). What's the recommended way to ease troubleshooting with connectors using async invocation?

[1] E.g. (to be) deleted during rolling upgrade or not accepting new connections
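
To make the troubleshooting gap concrete, this sketch lists what a connector could log from the information described above: the transport error and the HTTP status code. `describeInvocation` and its messages are hypothetical and written only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// describeInvocation summarizes what can be learned from one invocation
// attempt, given only the transport error and the HTTP status code.
// Hypothetical helper for illustration only.
func describeInvocation(err error, status int, async bool) string {
	switch {
	case err != nil:
		return "transport error: " + err.Error()
	case async && status == http.StatusAccepted:
		// The gateway queued the request; whether the function exists
		// or succeeded is not visible from this response.
		return "queued by gateway; function outcome unknown"
	case !async && status >= 200 && status < 300:
		return "function responded successfully"
	default:
		return fmt.Sprintf("unexpected status %d", status)
	}
}

func main() {
	fmt.Println(describeInvocation(errors.New("connection refused"), 0, true))
	fmt.Println(describeInvocation(nil, http.StatusAccepted, true))
	fmt.Println(describeInvocation(nil, http.StatusOK, false))
	fmt.Println(describeInvocation(nil, http.StatusNotFound, false))
}
```

In the async case a 202 carries no information about the function itself, which is the limitation raised in this comment.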

embano1 commented 4 years ago

Thx @alexellis

Approved, but please make the async invocation an optional 12-factor configuration via environment variable.

Good point. I can file a follow-up commit making async optional via an -async flag. Right now the connector mainly uses flags for configuration, so I think it'd be a good candidate for a flag.

Given that sync makes it easier to troubleshoot and debug function invocations, especially for new users, I'll default -async to false and update the docs accordingly.
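
A minimal sketch of such a switch, using Go's standard `flag` package with `-async` defaulting to false. To also cover the environment-variable request from the review, the default can be overridden by an environment variable; the name `ASYNC_INVOCATION` and the helper `parseAsync` are assumptions made for this example and do not reflect the connector's actual configuration.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strconv"
)

// parseAsync decides whether async invocation is enabled.
// Precedence: the -async flag, then the (hypothetical) ASYNC_INVOCATION
// environment variable, then the default of false.
func parseAsync(args []string, getenv func(string) string) (bool, error) {
	def := false
	if v := getenv("ASYNC_INVOCATION"); v != "" {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return false, fmt.Errorf("invalid ASYNC_INVOCATION value %q: %w", v, err)
		}
		def = b
	}

	fs := flag.NewFlagSet("vcenter-connector", flag.ContinueOnError)
	async := fs.Bool("async", def, "invoke functions asynchronously")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *async, nil
}

func main() {
	async, err := parseAsync(os.Args[1:], os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("async invocation enabled:", async)
}
```

Passing `getenv` as a parameter keeps the function easy to exercise without touching the process environment.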

Let me know if this works and I'll create a PR.