easegress-io / easegress

A Cloud Native traffic orchestration system
https://megaease.com/easegress/
Apache License 2.0

WASM: The new plugin mechanism for Easegress #1

Closed zhao-kun closed 3 years ago

zhao-kun commented 3 years ago

Background

Our flagship product Easegress (EaseGateway) has many splendid features. The most fascinating is its pipeline/plugin mechanism, which empowers customers to achieve their specific goals by customizing Easegress. But the current pipeline/plugin mechanism still has too many barriers to use. If a user really wants to extend Easegress, they need to overcome the following issues:

  1. Master the Go language
  2. Master and understand the low-level pipeline/plugin mechanism
  3. Commit changes to the Easegress repository and rebuild the Easegress server
  4. Deploy Easegress, which requires rebooting the Easegress server

I think the last two of these barriers are the biggest obstacles for users who want to extend Easegress. So I think we need another pipeline/plugin mechanism for EG customization.

Goal

Compared with other gateway products, we can find that they all choose to embed a weak language to enhance extensibility, but there are several cons in these implementations.

If we want to provide a more flexible customization mechanism, we must solve the above disadvantages.

Proposal

After several days of study, I found that we can leverage WebAssembly to solve the above problems (I was proven wrong...), because WebAssembly has the following features:

Go has a rich ecosystem; I found an open-source Go WebAssembly runtime library [1].

PS: I don't want to deprecate the current pipeline/plugin mechanism, but I think we need multiple customization abstractions: different ways to handle different scenarios. This approach has been adopted by Envoy for its filter extensibility [2].

[1] https://github.com/wasmerio/wasmer-go
[2] https://www.envoyproxy.io/docs/envoy/latest/start/sandboxes/wasm-cc

xxx7xxxx commented 3 years ago

Thanks for the great proposal.

Definitely. I considered embedding Lua scripts, as OpenResty does. But I don't think performance is its biggest problem. Its real problems in EG are:

  1. It lacks expressiveness, so it's hard to write complex business logic
  2. It doesn't have a strong community in cloud native (its domain is also in games)

A little history here: we designed Python/shell interpreter plugins before, then eliminated them because of their performance.

About the proposal: if EG wants to adopt WebAssembly, I think we need to do more research. The first question is whether it can handle the complex interface [1]. If the answer is yes, we could try to write a simple plugin in WA to verify the feasibility.

[1] https://github.com/megaease/easegateway/blob/master/pkg/context/httpcontext.go#L27

zhao-kun commented 3 years ago

Let me clarify my proposal again: I don't want to deprecate the original extension mechanism. I just want to add a new way to handle different situations. In my mind, I think:

We don't need to start research from the hardest way. Simple is good.

xxx7xxxx commented 3 years ago

Well, why not. But I will try to make the WebAssembly plugins have the same interface as the existing plugins as much as possible, which gives external plugins more capabilities.

haoel commented 3 years ago

I think this is a good try.

And we need to make sure the wasm function interface is the same as that of the built-in plugins.

benja-wu commented 3 years ago

If our X-problem is to satisfy the user's complicated business logic, which also has only a simple interaction with EG internals (perhaps just accepting traffic from EG locally), why not use FaaS to solve our problem? In this issue I proposed moving the original FaaSService into an EG controller, so that users can encapsulate their business logic into an image and EG will fire it up when necessary.

BTW, WASM is definitely a very attractive and cool way to achieve our goal here. I can accept it if we decide to use WebAssembly and WASI in the end.

zhao-kun commented 3 years ago

No! FaaS serves a different domain. I don't think we can leverage FaaS to handle all of EG's extensibility, for the following reasons:

benja-wu commented 3 years ago

No! FaaS serves a different domain. I don't think we can leverage FaaS to handle all of EG's extensibility, for the following reasons:

  • FaaS is too heavyweight to use; introducing FaaS will bring complicated operations
  • FaaS has poor performance compared with WASM; I don't think it can handle high-throughput situations, especially in the common Easegateway usage (Traffic Gateway)

Well, I think FaaS and WASM will be combined into one common solution for effective serverless computing in the future.[1] Actually, there are several projects aiming at achieving this goal already.[2][3]

Reference

[1] Challenges and Opportunities for Efficient Serverless Computing at the Edge https://www2.seas.gwu.edu/~gparmer/publications/srds19awsm.pdf
[2] faasm https://github.com/faasm/faasm
[3] WebAssembly-based FaaS https://www.secondstate.io/faas/

haoel commented 3 years ago

I don't think FaaS & WASM would be one solution. They focus on different scenarios.

Scalability is a major difference, please do not mix them together.

benja-wu commented 3 years ago

I don't think FaaS & WASM would be one solution. They focus on different scenarios.

  • FaaS - it belongs to the Serverless domain, it's scalable.
  • WASM - it belongs to the language domain, it's not scalable.

Scalability is a major difference, please do not mix them together.

Sorry, I mixed them up. But what I am trying to say is that in our X-problem scenario, FaaS+WASM can be another possible solution, not just the WASM plugin solution, since we already support part of the FaaS functionality in EG. That's actually what I want to discuss.
I have also updated my previous comment.

zhao-kun commented 3 years ago

Reopened

localvar commented 3 years ago

Currently, all plugins (filters) in EG implement the same interface, Filter, and the prototype of its core method is:

func Handle(ctx context.HttpContext) (result string)
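
As an illustration of the shape of that interface, here is a minimal sketch. The `HTTPContext` struct, the `headerGuard` filter, and `runPipeline` are hypothetical stand-ins invented for this example, and treating an empty result as "continue" is an assumption; the real EG types are considerably richer.

```go
package main

import "fmt"

// HTTPContext is a hypothetical, heavily simplified stand-in for EG's
// context type; the real one carries far more state.
type HTTPContext struct {
	Method string
	Path   string
	Header map[string]string
}

// Filter mirrors the shape described above: a single core method that
// receives the context and returns a result string. In this sketch an
// empty result means "continue to the next filter" (an assumption).
type Filter interface {
	Handle(ctx *HTTPContext) (result string)
}

// headerGuard is an illustrative filter that rejects requests that
// lack a required header.
type headerGuard struct{ required string }

func (g headerGuard) Handle(ctx *HTTPContext) string {
	if _, ok := ctx.Header[g.required]; !ok {
		return "missingHeader"
	}
	return ""
}

// runPipeline calls the filters in order and stops at the first
// non-empty result.
func runPipeline(ctx *HTTPContext, filters ...Filter) string {
	for _, f := range filters {
		if r := f.Handle(ctx); r != "" {
			return r
		}
	}
	return ""
}

func main() {
	ctx := &HTTPContext{Method: "GET", Path: "/", Header: map[string]string{}}
	fmt.Println(runPipeline(ctx, headerGuard{required: "X-Token"}))
}
```

A WASM-backed plugin that kept this interface would be just another `Filter` from the pipeline's point of view.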

So, for the new WASM plugin:

As ctx is the only parameter of Handle and contains all the context of the call, EG needs to pass it to WASM, and WASM needs to pass it back to EG APIs. Considering that the ctx object is very complex and may change from time to time, it is better to hide its details from the WASM code.
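
One common way to hide such details, sketched here purely for illustration and not necessarily one of the options considered in this thread, is to give the guest an opaque integer handle and let the host resolve it. All names below (`requestCtx`, `handleTable`, `guestHandle`) are hypothetical, and the guest is simulated by a plain Go function; with a real WASM runtime the strings would additionally have to cross linear memory.

```go
package main

import (
	"fmt"
	"sync"
)

// requestCtx is a hypothetical stand-in for the host-side context object.
type requestCtx struct {
	header map[string]string
}

// handleTable maps opaque int32 handles to host-side contexts, so the
// guest code only ever sees an integer.
type handleTable struct {
	mu   sync.Mutex
	next int32
	ctxs map[int32]*requestCtx
}

func newHandleTable() *handleTable {
	return &handleTable{next: 1, ctxs: make(map[int32]*requestCtx)}
}

// register stores the context and returns a fresh handle for it.
func (t *handleTable) register(c *requestCtx) int32 {
	t.mu.Lock()
	defer t.mu.Unlock()
	h := t.next
	t.next++
	t.ctxs[h] = c
	return h
}

// release forgets the handle once the call has finished.
func (t *handleTable) release(h int32) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.ctxs, h)
}

// getHeader is the kind of function a host could export to the guest.
// It reports false for an unknown handle or a missing header.
func (t *handleTable) getHeader(h int32, name string) (string, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	c, ok := t.ctxs[h]
	if !ok {
		return "", false
	}
	v, ok := c.header[name]
	return v, ok
}

// guestHandle stands in for compiled guest code: it receives only the
// handle and must go through host functions to read the context.
func guestHandle(host *handleTable, h int32) string {
	if v, ok := host.getHeader(h, "X-Tenant"); ok && v != "" {
		return ""
	}
	return "noTenant"
}

func main() {
	host := newHandleTable()
	h := host.register(&requestCtx{header: map[string]string{"X-Tenant": "acme"}})
	defer host.release(h)
	fmt.Printf("result=%q\n", guestHandle(host, h))
}
```

With this layout the host can change the context's internals freely, as long as the exported accessor functions keep their signatures.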

There are several ways to pass ctx to WASM, each with pros and cons.

I lean toward the 3rd solution and will switch to the 1st one when wasmer's support for reference types is ready. Note that the switch introduces incompatible changes to WASM code. Or can we just delay this feature and wait for reference types from wasmer? It should not take long, as the code changes in wasmer have already been merged.

Any comments?

xxx7xxxx commented 3 years ago

3 is my choice too. But for 1, is there an issue tracking the feature?

benja-wu commented 3 years ago

Agreed. I have already proposed the Go, Envoy-style examples to @localvar, which are proxy-wasm and proxy-wasm-go-host.

So I think maybe we can convert EG's HTTPContext to Proxy-WASM's Context and then reuse the projects above. It's an implementation of bomin's 3rd solution. Or maybe we could wait for WASM reference types support to be merged.

IMHO WASM support is a nice-to-have feature of EG. We can take our time to fully discuss and test it.

zhao-kun commented 3 years ago

What exactly do we need WASM for? Can anyone answer this question? I think the answer is the key to our choice.


PS: WHY is WASM support a NICE-TO-HAVE feature? On the contrary, I think it is a MUST-HAVE feature.

xxx7xxxx commented 3 years ago

Answer: Dynamically load business code in any language.

It's a killer feature compared with any existing traffic gateway/platform. So it's a must-have. We're on the same page.

localvar commented 3 years ago

Ok, then I will begin working on the 3rd solution.

zhao-kun commented 3 years ago

Wait, hold on!

If we think it's a killer feature, and we are introducing the new technology to serve our product, then I don't think the 3rd solution is suitable. We must not integrate WASM at the expense of performance or a friendly user experience.

benja-wu commented 3 years ago

Answer: Loading users' small/unimportant business logic code (e.g., converting a picture to another format, notifying Slack/GitHub, and so on) at runtime without heavy operations (e.g., allocating a new VM/container, maintaining VMs/containers). Let's break it down. Why does a traffic gateway or API gateway need to support running users' small business logic on the side? Is there another solution? Can we bring users' complex business logic into EG with the help of WASM? My answer is NO.

zhao-kun commented 3 years ago

Small business logic IS NOT EQUAL TO unimportant business logic.

The ability to rapidly develop small business logic IS IMPORTANT FOR A TRAFFIC-SPECIFIC SECONDARY DEVELOPMENT PLATFORM.

donge commented 3 years ago

An interesting discussion. I looked into this for Envoy and Traefik before. You can get some ideas from https://github.com/envoyproxy/envoy/issues/15152 as a comparison; there is no perfect solution. WASM is good on isolation but bad on performance, at least at the current time. Lua is the easy old way, but cannot support complex logic or modules. A Golang lib is something of a middle-ground solution, and is a good choice for an all-Golang solution.

Anyway, if you can make imports zero-cost (no import, no cost), these three solutions can be used together, like Traefik :)

j4ckzh0u commented 3 years ago

Why not consider QuickJS, which is used by pipy (an open-source web gateway)?

benja-wu commented 3 years ago

Thanks for your suggestion. First, let me clarify some background requirements here.

  1. We want to be language independent so that users won't have to be familiar with Golang.
  2. We want to be infrastructure transparent so that users won't need to go into Easegress's implementation mechanism. (BTW, if users want to develop a customized filter/controller, then it's necessary. And that's another scenario :-) )

So as far as I am concerned, WASM might be a more suitable choice for us; it is more isolated and efficient in this scenario. Speaking of WASM's performance, let's compare it with a Docker/VM type of solution here. I think we can then conclude that WASM has better performance. That's fairer to WASM. :-)

bigangryrobot commented 3 years ago

Answer: Loading users' small/unimportant business logic code (e.g., converting a picture to another format, notifying Slack/GitHub, and so on) at runtime without heavy operations (e.g., allocating a new VM/container, maintaining VMs/containers). Let's break it down. Why does a traffic gateway or API gateway need to support running users' small business logic on the side? Is there another solution? Can we bring users' complex business logic into EG with the help of WASM? My answer is NO.

  • EG focuses on scheduling traffic efficiency and becomes a reliable and traffic-specified platform for second development.
  • The user's complex business logic will introduce more unpredictable maintenance costs into EG.
  • WASM solution is not suitable for complex user's business logic implementation right now.

I really strongly agree with the statements here, and I would say that most use cases could be fulfilled with the RemoteFilter type and by pushing users to provide simple microservices that handle in-flow business logic processing.

bigangryrobot commented 3 years ago

Another idea, perhaps, is to follow the pattern that Telegraf uses for its plugins: https://github.com/influxdata/telegraf/tree/master/plugins

benja-wu commented 3 years ago

@bigangryrobot Thx, bro. Feel good to be strongly agreed. :metal:

Reference

[1] https://wasi.dev

bigangryrobot commented 3 years ago

Perhaps another direction is to go with a filter module that parses Lua: https://github.com/Shopify/go-lua. This would enable folks to upload Lua blobs and then consume them with a Lua filter step in the pipeline.

Allowing for a language of choice is great, but I'm not sure it's a worthy goal, as the complexity is just rough. Lua might be a middle ground.

I'd still say that most business logic should be separate from Easegress, but I'm still getting to know the goals of the product. The key piece for me is that when we run external, unverified code directly, we decrease stability. As an example, Jenkins, terrible mess of a product that it is, can work fairly well on its own, but once you start bringing in someone else's plugins, the pains of Java GC and code smell increase dramatically until, with enough plugins, it is barely usable.

The core of Easegress should be so solid and verifiable that its responses are expected and guaranteed. The underlying services should abide by strict interface standards and execute outside of the existing core process.