Closed Mossaka closed 1 year ago
All of these make sense to me except for gRPC, which feels out of place and niche.
While gRPC gets used in the Kubernetes world, I can't think of many other contexts where it gets broad use. I can't think of any tier-1 cloud providers that require gRPC for communication (and even Kubernetes uses it only in very low level contexts that most developers never see) or which offer services that require gRPC to be able to use them. Additionally, there are many competing technologies that do what gRPC does, and there are good reasons for adopting those. Additionally, gRPC is particularly fragile and prone to breaking changes and incompatible implementations.
The rest of the proposed bindings refer to tremendously well-adopted "table stakes" features, and provide general purpose implementations.
Is there a powerful reason to include gRPC in what is otherwise a generic set of "table stakes" features? Otherwise, it feels like requiring it may prove to be a hindrance for adoption, especially in ecosystems that have no use for it or actively avoid it.
Otherwise, I'm pretty excited about this proposal.
Also, I love the name.
BTW, where does this name come from? Perhaps I am out of touch with the world of cloud and this is something most folks will immediately understand? (I tried googling "cloud" with "bursty" but it didn't seem helpful).
My assumption was that it came from "bursty workloads" -- e.g. the kind of unit of work that handles computing in short-lived bursts rather than constantly. Examples: handling an HTTP request, responding on a message queue, listening for an event, and so on.
This would include patterns like:
Opposed to bursty workloads would be anything that is expected to run continuously, or to handle very large operations that consume hours rather than seconds.
Again, though, this is my reading of a name that @Mossaka came up with. So he may drop in here and tell me I'm entirely wrong.
it could be bursty workload, too, easily
Is there a powerful reason to include gRPC in what is otherwise a generic set of "table stakes" features? Otherwise, it feels like requiring it may prove to be a hindrance for adoption, especially in ecosystems that have no use for it or actively avoid it.
Just to clarify, wasi-grpc is a proposal that hasn't materialized yet, and this is the reason why I didn't put wasi-grpc in the bursty world proposal. AFAIK there are two primary models for service-to-service/client communication: REST (HTTP) and RPC (source: Designing Data-Intensive Applications, chapter 4). Over the years, RPC has evolved, and gRPC has emerged as an RPC framework that uses the Protocol Buffers encoding scheme. It could be argued that what we really want to abstract is the RPC model instead of gRPC. But what's fascinating about gRPC is that it supports streaming communication.
Back to your question. Yes, I agree that gRPC has compatibility, complexity, and performance issues. Some older systems may not support the latest gRPC frameworks, and some languages are not supported by the gRPC codegen. But I still see value in the wasi-grpc proposal itself, although it may have a harder time integrating itself into the bigger world than the other wasi-cloud proposals. Does this sound reasonable to you?
Again, though, this is my reading of a name that @Mossaka came up with. So he may drop in here and tell me I'm entirely wrong.
The name is contributed by @squillace . It really is saying that this world is designed for serverless / edge function model.
But is "bursty" a common word that is used in the serverless / edge world (no pun intended)? I've never heard of it before, but maybe that's not surprising because it's not an area I work in often.
If it's not a common word, then would it make sense to use a word that is more commonly known? Why not just cloud?
Historically there have been conversations about supporting an `include <world>` WIT syntax to enable the union-ing of worlds, supplemented by a `with` syntax to reconcile name conflicts. Sounds like this proposal is a good motivation to formalize some of those past ideas!
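To make the idea of union-ing worlds and reconciling name conflicts concrete, here is a minimal Python sketch. It is only a model, not WIT tooling: worlds are plain dictionaries mapping import names to interface names, and `union_worlds`, `renames`, and all the names in the example data are hypothetical, chosen for illustration.

```python
def union_worlds(base, included, renames=None):
    """Merge the imports of `included` into `base`.

    `renames` plays the role of the `with` clause discussed above: it maps a
    conflicting name in `included` to the name it should take in the result.
    This is an illustrative model, not the behavior of any real WIT tool.
    """
    renames = renames or {}
    merged = dict(base)
    for name, interface in included.items():
        target = renames.get(name, name)
        # The same name bound to a different interface is a conflict.
        if target in merged and merged[target] != interface:
            raise ValueError(f"name conflict on {target!r}; a rename is needed")
        merged[target] = interface
    return merged


# Two hypothetical worlds that both use the import name "store".
world_a = {"store": "keyvalue-interface"}
world_b = {"store": "blob-interface"}

try:
    union_worlds(world_a, world_b)
    conflict_detected = False
except ValueError:
    conflict_detected = True

# Supplying a rename resolves the clash, as a `with` clause would.
resolved = union_worlds(world_a, world_b, renames={"store": "blob-store"})
```

In this model, a plain union fails when two worlds bind the same name to different interfaces, and the rename map is the only extra input needed to make the union well defined.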
no, @sbc100, `bursty world` is not a term of art, but rather a way to specify "fast-firing functions". The pattern would be "serverless" generally, but unlike that term it does not imply any process/module hangs around much at all. Hence, "bursty" rather than functions or serverless which don't really scope based on time of execution.
the `cloud` world is one I wanted to avoid, because the major cloud providers have 120+ services available, and that is definitely not "scope" :-)
That said, as naming is considered hard, just trying to drive in on a catchy but metaphorically appropriate name for the scope of work..... we could say "wasi-fast-functions" for example, but somehow it just isn't as suggestive.....
I see, thanks for the explanation. I agree that "cloud" is not the right name for the thing you are describing.
It sounds like "bursty" might not be the right name either though, since (to me at least) it doesn't imply stateless-ness or throwaway-ness or lightweight-ness (which seem to be what you are trying to get at?)
Can you explain more why you don't like "serverless"? To me it does imply those things, and I'm not sure what you mean by "does not imply any process/module hangs around much at all". I thought serverless kind of does imply that. Is the idea with this proposal that the process/module would, by definition, not stick around for more than one request?
Of course we don't need to do all the bike-shedding here and now.. and as you say, naming is hard.
I like the terms "bursty" and "serverless" since they're both evocative of these ephemeral, quick-starting, auto-scaling-to-zero instances that I think a lot of us are imagining. That being said, there are a lot of different worlds that these qualities will apply to, including the existing wasi-http proposal. Moreover, with the `wasi:http/proxy` world (and I think the same logic applies here), there is no reason that a host must make instances short-lived or auto-scaled; the host can deliver as many or few events to a given instance as it wants and keep them alive/warm for as long as it wants (or likely make this configurable per deployment). Thus, while I think "serverless" and "bursty" are qualities we want these worlds to have, I don't know if they are the defining properties of a single world.
In general, I'd suggest that we try to think of world names that describe the collection of functionality that is being exposed. That's tricky with the large set of functionality that Joe listed above. Just to throw in my 2c: one term that is admittedly amorphous, but maybe in the right way (given that we expect this set to grow over time), could be "`service`". If `service` feels overly general, an adjective that maybe makes sense is "`persistent`" (as in `wasi-persistent-service`), since a theme across messaging, kv, blob and sql is that these are all forms of persistent storage. ¯\(ツ)/¯
no problem bikeshedding on a Friday. That's what Fridays are for. the main conceptual boundary I'm trying to draw is between serverless generally, which can include very long running processes and thus remain alive for minutes or more, and fast firing functions, which almost certainly do not live much longer than a minute.
The former encompass "durable" functions that can be suspended and reanimated and the latter are essentially the generalized case of "CDN functions" that have if not milliseconds then only seconds to live, possibly to a minute. There are domain differences between the two approaches, though the design pattern seems the same. In bursty workloads, one of the main points is recycling the resources essentially or literally per request, which means resource efficiency does not require threading or connection management and so on.
serverless, however, might include precisely these things, depending on how they are being used. OR.... so I have been thinking. But as I said, bikeshedding naming on a friday is a good thing, so all thoughts here are flexible.
related to @lukewagner's comment that just sailed in, yes, I think persistence is a thing that might place some shapes in a different "domain" and hence world. OH! and to give even more context: with respect to bursty worlds, the list of capabilities here reflects what our customers are asking for in this precise world they describe. They say this is the 80% sweet spot -- more isn't needed.
This doesn't mean we cannot add or remove any, but just to give context for the source of the choices.....
if you're going to include RPC, why not use https://capnproto.org/rpc.html afaict it's more powerful than gRPC...though tbh the fact that there are multiple choices for which RPC protocol to use indicates to me more that none of them should be included by default
Thanks for sharing this proposal.
GRPC
I am also strongly in favor of removing GRPC from the scope.
GRPC looks out of place from my POV for multiple reasons. GRPC is a popular but specific and opinionated implementation of an RPC mechanism. I would expect such a WASI world to focus on providing abstractions that allow various implementations to be plugged in (like the other proposals listed here).
And isn't the abstraction for RPC systems the WASI Component Model itself? Sure, each RPC system could require some specific parts, but again I am not sure we have to put those specifics in such a general-purpose proposal. Since `wasi-grpc` has not yet materialized, I am not sure we can make an educated choice about including it.
Name and scope
While I think a majority of us stakeholders are planning to implement Serverless platforms with the proposals listed above, I am not sure this world is Serverless specific. The proposals listed look suitable for implementing long-running workloads as well, and with the generalization of scale-to-zero capabilities, I am not sure we will be able to draw a clear line between those 2 use cases, and we don't necessarily have to.
Also maybe better to use a name that can be understood without too much context. I find `wasi-command` or `wasi-cli` self-descriptive and easy to understand, and I have concerns that a lot of people could be confused by `wasi-bursty` even if I find the name pretty cool.
While I do not yet have a concrete proposal for a better name, `wasi-cloud` looks like the most straightforward name I could think of given the pretty broad scope proposed. Could somebody share the rationale behind not using it despite the description mentioned by @Mossaka being "we want to create a World for all the wasi-cloud proposals"? Are there concerns that it looks too traditional Cloud vendor-ish or that Cloud could become an outdated term at some point? Edit: I initially missed @squillace's feedback on concerns about cloud being too broad a scope, point taken.
I tend to agree with what @lukewagner said on how a name should be chosen, but the current pretty broad scope makes it hard to choose a meaningful name. A subset of those proposals could maybe be called `wasi-persistence-world` or `wasi-data-world`, but the current scope proposed is much wider than that.
If we, as a group, are sure that the current scope (hopefully minus GRPC) makes sense and that a consensus between different vendors providing multiple implementations is possible, `wasi-service` is maybe a bit generic. `wasi-remote-service` could maybe be an option but not sure yet.
Something like `wasi-cloud-core` could also potentially be a reasonable option in order to take into account @squillace's concern about the too broad scope implied by `wasi-cloud`.
This name is IMO self-descriptive, and clearly communicates the intent of providing a minimal set of core cloud services that providers should implement, likely providing more on top of that.
Here `core` is used as a profile, and leaves open the possibility of providing different ones later that could even be supersets. The key point is that different stakeholders like Microsoft, Fastly, Fermyon, Cosmonic, VMware and others agree on a minimal core that will make sense in most cases.
The scope would be the one originally proposed minus grpc.
In practice, most WASI Cloud platforms will be Serverless and "bursty" given the characteristics of Wasm, but the `wasi-cloud-core` name would be relevant for any kind of runtime/billing model made possible by the proposed API surface.
I like the idea of `wasi-cloud-core` and can confirm that grpc will not be in-scope.
I have updated the proposed world name to `wasi-cloud-core` and removed gRPC from the scope.
Using the proposed syntax from Proposal: Union of Worlds, the new `wasi-cloud-core` World would look more like the following:
world wasi-cloud-core {
    include wasi-keyvalue
    include wasi-blob-store
    include wasi-sql
    include wasi-messaging
    include wasi-runtime-config
    include wasi-distributed-lock-service
    import default-upstream-HTTP: wasi-http.outgoing-handler
    export HTTP: wasi-http.incoming-handler
}
booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
but I defer to the wisdom of crowds
Looks great, easy to understand and likely to reach wide adoption from my POV.
The name `cloud`: "cloud" indeed carries a lot of baggage, and the "core set of features" to expect from a "cloud provider" will vary depending on who you're asking and what kind of workloads you are trying to move to this cloud. A name that wouldn't carry that much scope is preferred. Looking at the set of interfaces suggested, it looks like this world is centered around http serving? What about `wasi-serving`?
The proposal seems to be missing some logging interface. I'd expect to see something like `include wasi-logging`. Is this implicit?
How can we avoid scope creep? If the goal is to define something that is "core", its set of interfaces should probably be kept to a minimum and heavily justified. Should SQL really be considered "core"? Could the core be kept super minimal, and bigger worlds created on top of it? Maybe the core only has the following:
default world wasi-serving-core {
    include wasi-logging
    import default-upstream-HTTP: wasi-http.outgoing-handler
    export HTTP: wasi-http.incoming-handler
}
Hi @steren, darned good questions all. I'll take one swipe, let's see what others say:
`cloud` does carry baggage, you're right. The intent was to have the smallest set of "service shapes" available for the widest array of serverless functions at first (the "core" part). I had wanted to call it "bursty world" but was voted down as the shape doesn't convey speed of the surface. So "serving" would be fine, but is there a tighter way to express the intent here? OR.... is the shape the shape, and the intent is something that appears in implementations?
`wasi-logging` is part of the overall wasi space and is DEFINITELY assumed -- but isn't really part of this surface. The way we've discussed it is that any implementation would compose `wasi-cloud-core` and `wasi-logging` (however that ends up being surfaced) and thence you would always have it (for most cases anyone would think of).
using `wasi-cloud-core` to using quite a few things individually. That said, as a first response, these are all questions we're noodling our way through using this space as a working space.
I agree with what @squillace said above. I want to elaborate more on the second and third points.
I'd expect to see something like include wasi-logging. Is this implicit?
I think it is implicit. The way `include` works is that it adds the imports / exports from that world to this world, hence the name "union of worlds". `wasi-logging` is so fundamental that many of the smaller worlds already assume it as a dependency, e.g. `wasi-http` has a dep on wasi-logging.
Do you prefer we explicitly add wasi-logging to the `wasi-cloud-core` world? This is fine because the `include` de-duplication resolver will figure out that wasi-logging has been transitively added by other worlds that are included in this world.
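The include-and-de-duplicate behavior described above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the actual resolver: the `WORLDS` table, its contents, and the `flatten_imports` helper are hypothetical, and the dependency edges are simplified from the discussion for illustration.

```python
# Hypothetical world table: each world lists its own imports and the worlds
# it includes. The entries are illustrative, not the real proposals' contents.
WORLDS = {
    "wasi-logging": {"imports": {"logging"}, "includes": []},
    "wasi-http": {"imports": {"outgoing-handler"}, "includes": ["wasi-logging"]},
    "wasi-keyvalue": {"imports": {"keyvalue"}, "includes": ["wasi-logging"]},
    "wasi-cloud-core": {
        "imports": set(),
        # wasi-logging is listed explicitly AND arrives transitively.
        "includes": ["wasi-keyvalue", "wasi-http", "wasi-logging"],
    },
}


def flatten_imports(world_name, worlds=WORLDS, seen=None):
    """Return every import a world ends up with once includes are expanded.

    A set is used for the result, so an interface reached through several
    include paths appears only once; `seen` avoids revisiting a world.
    """
    seen = set() if seen is None else seen
    if world_name in seen:
        return set()
    seen.add(world_name)
    world = worlds[world_name]
    result = set(world["imports"])
    for included in world["includes"]:
        result |= flatten_imports(included, worlds, seen)
    return result


core_imports = sorted(flatten_imports("wasi-cloud-core"))
```

The same traversal doubles as a way to list every interface a world inherits, since it expands each include recursively and reports each interface once.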
How can we avoid scope creep?
I like the idea that any interfaces that are included in `wasi-cloud-core` need to be heavily justified, but I want to point out there is a world that fits exactly what you described: the http proxy world. In my view, `wasi-cloud-core` is broader than a proxy world. It gives a set of "common" capabilities to developers to build distributed applications, and these include the ability to interact with keyvalue stores, upload and download files from a blob store, exchange messages through pub/sub, and retrieve runtime configurations from vaults etc.
wasi-http has a dep on wasi-logging.
Great, that was my question. I now understand that `import default-upstream-HTTP: wasi-http.outgoing-handler` imports `wasi-logging`. I don't think it has to be explicitly listed as an import. Maybe having a tool that would list all inherited interfaces of a world would be useful.
I want to point out there is a world that fits exactly what you described - the http proxy world is
Thanks, this indeed matches what I would consider "core" to request serving.
`wasi-cloud-core` is broader than a proxy world. It gives a set of "common" capabilities to developers to build distributed applications
That makes sense. Still, I suggest keeping a "core" to the bare minimum. A principle could be something along these lines: "80% of distributed apps are expecting this interface". I would thus suggest moving `wasi-runtime-config` and `wasi-distributed-lock-service` out of `core`, maybe to a broader `wasi-cloud-extended` world (unless you have strong evidence that a large portion of developers will expect these; based on my experience, most apps are fine without these capabilities).
Perhaps something like:
- `wasi-runtime`: `wasi-logging`
- `wasi-runtime-http`: `wasi-runtime`, `wasi-http`
- `wasi-runtime-storage`: `wasi-runtime`, `wasi-keyvalue`, `wasi-blob-store`, `wasi-sql`
- `wasi-runtime-messaging`: `wasi-runtime`, `wasi-messaging`
- `wasi-cloud-core`: `wasi-runtime`
I imagine that I would want to run the same services on my laptop as I run in the cloud, so the idea of a "runtime" encapsulates both of those worlds which could be defined as the same thing for now but may diverge.
(also "runtime" is similar terminology to AWS Lambda, where I would love to have a WASI runtime some day).
A principle could be something along these lines: "80% of distributed apps are expecting this interface".
I really like this idea that we want to design interfaces to have 80% of the features for working distributed apps. In fact, the Pareto principle was one of the guiding principles when we started the SpiderLightning project, which is a prototype host implementation of many wasi-cloud-core proposals including keyvalue, messaging etc.
I would thus suggest moving `wasi-runtime-config` and `wasi-distributed-lock-service` out of `core`
I first want to explain the use cases for these two capabilities and then see if we agree that these two fit the 80% feature set of distributed apps:
I'd like to hear stakeholders' views on whether or not we want to have these two capabilities baked into the `wasi-cloud-core` World.
I imagine that I would want to run the same services on my laptop as I run in the cloud, so the idea of a "runtime" encapsulates both of those worlds which could be defined as the same thing for now but may diverge.
Agreed, and in SpiderLightning, we have used the OS filesystem to implement the wasi-keyvalue and wasi-messaging capabilities, so the same application that uses these two capabilities can run locally with the filesystem, or in a production environment with cloud providers.
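As a rough illustration of that portability idea, here is a minimal Python sketch in which application code depends only on a key-value interface while the backend is swapped underneath. It is not SpiderLightning's code or the wasi-keyvalue API: `KeyValue`, `FilesystemKeyValue`, `InMemoryKeyValue` (standing in for a cloud provider), and `record_visit` are all hypothetical names.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional, Protocol


class KeyValue(Protocol):
    """Hypothetical minimal key-value interface the application codes against."""

    def get(self, key: str) -> Optional[bytes]: ...
    def set(self, key: str, value: bytes) -> None: ...


class FilesystemKeyValue:
    """Local backend: one file per key under a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Hex-encode the key so any string maps to a safe file name.
        return self.root / key.encode("utf-8").hex()

    def get(self, key: str) -> Optional[bytes]:
        path = self._path(key)
        return path.read_bytes() if path.exists() else None

    def set(self, key: str, value: bytes) -> None:
        self._path(key).write_bytes(value)


class InMemoryKeyValue:
    """Stand-in for a remote provider; a real one would call a cloud service."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value


def record_visit(store: KeyValue, user: str) -> int:
    """Application logic: increments a counter without knowing the backend."""
    raw = store.get(user)
    count = (int(raw.decode("ascii")) if raw is not None else 0) + 1
    store.set(user, str(count).encode("ascii"))
    return count


with tempfile.TemporaryDirectory() as tmp:
    # A fresh store object per call shows the state lives on disk.
    local_counts = [
        record_visit(FilesystemKeyValue(Path(tmp)), "alice") for _ in range(2)
    ]

memory_store = InMemoryKeyValue()
memory_counts = [record_visit(memory_store, "alice") for _ in range(2)]
```

`record_visit` is unchanged across both runs; only the object handed to it differs, which is the property the comment above describes for local versus production deployments.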
I am a bit afraid to use "wasi-runtime", as "runtime" commonly refers to Wasm runtimes like Wasmtime and WAMR etc.
This proposal has been accepted to move to a stage-1 WASI proposal. A repo is created under WebAssembly org and here is a link. I will be working on formalizing the spec using the WIT IDL and adding more description to it.
I will be closing this issue and encourage everyone to discuss wasi-cloud-core in that repo.
Thank you all for your support and suggestions!
Following in the footsteps of #509 and a few discussions in WASI subgroup meetings, we want to create a World for all the wasi-cloud proposals and name it `wasi-cloud-core`. This includes
Currently, progress on the above proposals varies in terms of completeness. The proposal specifications for `wasi-keyvalue`, `wasi-messaging`, `wasi-http`, `wasi-sql`, and `wasi-blob-store` are more fully formed. Some of these proposals have not yet been updated to the WASI preview 2 syntax. By the end of Spring, we plan to add basic proposal specifications for the remaining wasi-cloud proposals and make sure all of the proposal specifications are aligned with the WASI preview 2 syntax. In addition, we want to validate all the WIT files using automated CIs and `wasm-tools`, and document any breaks and changes to the specification in a change log.
To elaborate on aligning the syntax of WIT with that of WASI preview 2, we want to use pseudo-stream/future/resource types, and to continue to align with future versions of WASI, as described in #515.
Each of these proposals has its own proposed WIT interfaces and worlds, but we raise this issue to propose a `wasi-cloud-core` World that has a similar structure to the following:
This is just a sketch of the proposed world, and much remains unknown about what it means to import a `wasi-keyvalue` world inside of another world, as imports and exports of a World only allow WIT interfaces. We will also explore the uses of WIT templates in each proposal to help us move runtime implementation to static time implementations.
A note on `wasi-cloud-core` overall scope
It is not meant to cover 100% of the features that distributed applications expect, but it focuses on the 80% of the problem space with the assumption that most apps will fall into this scope. The API designs for these wasi-cloud proposals aggregate the common features across multiple providers. An example would be that wasi-keyvalue provides `readwrite` APIs, which are the lowest common denominator in all systems and are designed for a rapid development experience.