apache / openwhisk

Apache OpenWhisk is an open source serverless cloud platform
https://openwhisk.apache.org/
Apache License 2.0
6.5k stars 1.16k forks source link

Discussion: whisk object vs. context parameter in javascript actions #616

Closed sjfink closed 7 years ago

sjfink commented 8 years ago

@domdom82, @csantanapr and @tuler have observed that our current javascript programming model's 'whisk' object probably makes local debugging of javascript actions more difficult than it need be.

To review, the 'whisk' object currently serves two purposes:

  1. It holds environment bindings such as "auth key" and "action name"
  2. It encapsulates some functions such as whisk.done(), whisk.async(), and whisk.invoke()

For local testing, I think it's probably accurate that it's easier to pass an object to a function than arrange for a global 'whisk' object to be in scope.

Let's discuss possible changes to the programming model here.

Some observations: 1) There are some possible solutions that use the openwhisk npm module.
2) whisk.done(), async(), etc. are due to be deprecated with Node 6 shortly.

tuler commented 8 years ago

I think that comparing to other serverless architectures is important because it improves the understanding and adoption of openwhisk.

Regarding the callback, I agree that #56 is very welcome, especially for modern node runtimes.

Comparing to AWS Lambda

AWS tweaked the programming model in the node 4.3 runtime, as opposed to the 0.10 runtime. Currently the function prototype is:

function (event, context, callback) http://docs.aws.amazon.com/lambda/latest/dg/nodejs-prog-model-handler.html

The Context object holds metadata of the function and about the execution environment, including auth.

The user has to call callback(error, result), so it's async-only.

Comparing to Google Cloud Function

Google requires the user to package it's function as a npm module. The function prototype is: function (context, data) https://cloud.google.com/functions/writing

The context provides methods for function termination, context.done, much like openwhisk. Data is the event.

Comparing to Microsoft Azure Function

Azure's function prototype is event shorter: function(context) https://azure.microsoft.com/en-us/documentation/articles/functions-reference-node/

You end execution with context.done. You get event data also from the context object.

csantanapr commented 8 years ago

This is concern that I have, but I don't know if its part of this issue or it can be handle in a different issue.

I would like to have the ability to use export and not forced to have a global main function.

we can design what should be exported I'm open to brainstorm on alternatives like convention export.main = function () {..} or export.handler = function () {..} export = function() { ..}

the benefit is that it's easier to use things like webpack, and other bundlers, and also for using require when it comes to unit testing.

tuler commented 8 years ago

Global main is definitely not ideal. Export is the way to go. It also enables the use of several functions in the same file.

I think webpack'ing is a nice option, but I think it should not be required. Some options to consider:

csantanapr commented 8 years ago

@tuler webpack, was just an example, I agree it should NOT be required.

And the other alternatives like using a zip is good to have.

I didn't get your 3rd example

or even zip including package.json and 'npm install'ed by the openwhisk container.

You mean a small zip with only the action.js and package.json? and let whisk do the npm install using the package.json?

tuler commented 8 years ago

Yes. Like you do when you commit your node app to git, excluding node_modules using .gitignore.

But I'm not sold on this idea. Obviously you would have to npm install beforehand, because it takes some time.

csantanapr commented 8 years ago

Yeah the penalty in time to invoke an action to get the container warm you would need to wait for npm install, and also hope npm registry or any url your fetching is up :-)

But for simple debugging and dev time, it might be useful. I have being thinking that there might be a prod vs. dev invocations or creations, that allows dev to hack away even open a port to the debugger so they can breakpoint and everything from their IDE/editor.

mbehrendt commented 8 years ago

i think there is benefit in not doing npm install on the server side, since it forces the developer to send over exactly what he wants to be run. i've seen too often issues due to code working on the client side, but didn't work on the server-side since different versions of dependencies were pulled

tuler commented 8 years ago

I think I agree @mbehrendt

Regarding the invoke penalty, I suggested this because I was playing with an architecture that I build a docker image for each function (from a base runtime image) and register it in a private registry. So each function has its docker image. The name of the image is the name of the function.

mbehrendt commented 8 years ago

yeah, that is a good point. i think in some other issue, @csantanapr had suggested that as well? adding @perryibm , since there might be perf considerations around this

csantanapr commented 8 years ago

I can't find the issue now, but in general if I was going into production with an app that every ms counts and I need the highest level of control using a container will be the way to go.

and app with multiple dependencies then you can burn this into the container image, no need to fetch the zip of dependencies on cold start. more control on what version of the runtime language if you are required to use specific version nodejs:4.2.1, or a special binary of nodejs with static link compiled dependencies.

More effort for a novice developer to get this level of control and setup, but this is where some tooling, samples, templates, sdk can come handy to abstract those specific to metadata and developer just worry about the code that will eventually goes into the container.

perryibm commented 8 years ago

Unfortunately, using many docker images wouldn't be free either because the image has to get pulled from some registry. Since we have many invokers and things are quite dynamic, any particular activation might end up having to pull an image as part of its cold start. It might be useful to intercept in the middle and freeze an image after the dependencies but before the user code and re-use these images which are more likely to be share-able.

csantanapr commented 8 years ago

One way to overcome, some of this is to use a single image, and build monolitic app with a facade, where it's actually the same docker image, creating multiple actions with same artifact where bound parameters decide which code to run. or creating a single action that handles all events.

Yeah I agree cold start of unknown image will need to get downloaded, but once download I believe we cache the image.

tuler commented 8 years ago

I don't see pulling the image as a big problem if a private registry is part of the architecture. But you have a better understanding of openwhisk architecture to judge it, I'm just starting to play with the code.

Sorry to divert from the original discussion, that's why I suggested to create a gitter channel :-) #615

sjfink commented 8 years ago

I'd like to continue this discussion (focused on the original topic of this issue) here:

https://github.com/openwhisk/openwhisk/wiki/Design-Discussion-on-javascript-context

rabbah commented 7 years ago

For node, actions can export a main already and do not need to declare a global main. We've also changed the action.invoke and trigger.fire to operate with promises.

For node, I propose we install into the base the npm openwhisk package and deprecate the whisk object. Context parameters will be made available as environment variables (in addition to AUTH_KEY, andEDGE_HOST, I propose adding NAMESPACE and ACTION_NAME).

The passing of context parameters via environment object is how we're currently injecting these values into other runtimes; one value per property name.

Proposed plan:

  1. [x] install openwhisk npm package pinned of course
  2. [x] add deprecation notice to whisk object in node actions (emit log message on use)
  3. [x] add current "context" to Node and Java actions (makes all runtimes the same)
  4. [x] prefix openwhisk "context" environment variables with __OW_ in all runtimes
  5. [x] add namespace, deadline and other useful metadata to action environment
  6. [x] add deprecation notice to use existing environment variables?
  7. [x] migrate examples to use openwhisk package (see https://github.com/openwhisk/openwhisk/issues/616#issuecomment-262132467)
  8. [ ] migrate catalog actions as well (https://github.com/openwhisk/openwhisk-catalog/issues/157)

Eventually, remove whisk object and environment variables from node actions. Thumbs up or down to accept proposal, comments to refine.

csantanapr commented 7 years ago

I give 👍 to the proposal

Just a refinement, I think this is more for the implementation details of the openwhisk npm module on how to pick up the env variables.

This environment variables could be use by the openwhisk npm module when not running in an action in real runtime, to configure the module.

I would like to see an environment variable that provides more info on the whisk edge point to include protocol, and port, this way I can use it for mocking and testing. meaning I want to be able to run a server on 'http://localhost:3000' that mocks the whisk api for testing, and my actions to run them locally which whisk.invoke() and it will send the request to 'http://localhost:3000' instead of 'https://openwhisk.ng.bluemix.net'. If we add 'http' to nginx edge for local dev it can be use to point to this also.

maybe making __OW_EDGE_HOST contain protocol and port info, or have that info in different variables.

rabbah commented 7 years ago

EDGE_HOST already includes the port. We made a decision long back to only support https at the edge. Hence we striped protocol out. I suggested in the openwhisk-client-js repo a heuristic that satisfies the request you're making. I've also used it in the test attached to the pull request for this issue.

mbehrendt commented 7 years ago

one quick question -- how would this proposal support situations where the variable value changes while the action invocation is ongoing?

markusthoemmes commented 7 years ago

@mbehrendt Which values do you have in mind?

jthomas commented 7 years ago

Discussing this with @rabbah, I'm going to implement the changes to support https://github.com/openwhisk/openwhisk-client-js/issues/1 and https://github.com/openwhisk/openwhisk-client-js/issues/14.

I also had an idea to allow the module constructor to automatically pick up configuration from environment parameters without manual configuration.

var ow = openwhisk()
// ... without explicit params, look for __OW__* variables in the environment.

The user can still provide this configuration explicitly to override the defaults or work outside the OpenWhisk platform.

It will simplify the user experience and reduce the possibility developers make mistakes passing in the configuration, leading to random bugs.

csantanapr commented 7 years ago

@jthomas yes, this was I had in mind 👍 for changes to the client js library constructor, use env ad default or override explicit

csantanapr commented 7 years ago

@rabbah this might be orthogonal to this issue but wanted to comment. I think having the option to export the action as the module simplifies things a bit IMHO

module.exports = function (params) {return params};

I left the details of my proposal in my comment in issue #47 https://github.com/openwhisk/openwhisk/issues/47#issuecomment-261955764 As it doesn't exclude to still have exports.main or exports.handler

rabbah commented 7 years ago

Defer to 47 - I don't see what It has to do with this issue.

jthomas commented 7 years ago

@csantanapr great, I've added all the changes now with 2.3.0

rabbah commented 7 years ago

Pull request #1539 completes the relevant changes for this issue per my proposal above. Remaining are catalog action migrations (https://github.com/openwhisk/openwhisk-catalog/issues/157) and rewriting the following test actions to use npm openwhisk:

** added todo to remove this asset and corresponding test, no longer valid with adoption of npm openwhisk (conferred with @bjustin-ibm).

rabbah commented 7 years ago

Fixed by https://github.com/openwhisk/openwhisk/commit/85070a20f560e77ea064ebfc57f2c1eb4608f420.