envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

support oci_uri as a remote data source for wasm #33212

Open arkodg opened 8 months ago

arkodg commented 8 months ago


Description:

Envoy currently supports fetching a WASM module remotely only via HTTP.

Most WASM modules now live in OCI registries, so it would be great if an oci field were supported, allowing users to fetch WASM modules directly instead of relying on a shim to deal with the CRUD state machine. Such a shim is either an auxiliary control plane that creates a temporary file server, or a sidecar agent that prefetches the module from an OCI registry and persists it to a shared filesystem.


yanavlasov commented 8 months ago

Adding @lizan and @mpwarres as area owners

kyessenov commented 8 months ago

  1. There are two formats, e.g. https://cloud.google.com/service-extensions/docs/prepare-plugin-code#container-image, https://github.com/istio-ecosystem/wasm-extensions/blob/master/doc/how-to-build-oci-images.md#overview.
  2. How are secrets going to be distributed? Do you expect CDS+SDS for the remote?
  3. Please fix HTTP and HTTPS URL fetch as well. https://github.com/envoyproxy/envoy/issues/29824

arkodg commented 7 months ago

  1. Format - hoping an expert who's already worked with a WASM OCI Artifact related working group in the past can help with clarity here
  2. requested @zhaohuabing to look into https://github.com/envoyproxy/envoy/issues/29824 when he has some free cycles

@kyessenov @yanavlasov I can try to find a contributor from the community to help with this feature once it's been accepted.

jewertow commented 7 months ago

> requested @zhaohuabing to look into https://github.com/envoyproxy/envoy/issues/29824 when he has some free cycles

@zhaohuabing did you already start or are you going to start working on this issue soon? If not, I could take care of it.

zhaohuabing commented 7 months ago

@jewertow I haven't started on this. Please feel free to move on if you have time.

kyessenov commented 7 months ago

If you're going to implement it, let's settle on the two design issues:

jewertow commented 7 months ago

@kyessenov are your suggestions relevant for the HTTPS URI or for OCI? I was going to start with the HTTPS URI, and I haven't investigated the scope of OCI yet.

kyessenov commented 7 months ago

Yes, they are. OCI is not that different from regular HTTPS. It's just a tar with structured content and some authentication schemes.
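
To make the "tar with structured content" remark concrete, here is a minimal Python sketch of pulling a Wasm binary out of an image layer once its bytes have been downloaded. It assumes the layer is a tar archive (optionally gzip-compressed) and that the module is stored under the file name `plugin.wasm`; that name, along with the helpers `extract_wasm` and `build_layer`, is illustrative only and may not match every image format mentioned in this thread.

```python
import io
import tarfile
from typing import Optional


def extract_wasm(layer: bytes, name: str = "plugin.wasm") -> Optional[bytes]:
    """Return the bytes of `name` from a tar layer, or None if it is absent.

    Assumption: the module lives in a regular file called `name` somewhere
    in the archive. "r:*" lets tarfile detect gzip compression on its own.
    """
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r:*") as tar:
        for member in tar:
            # Compare on the base name so "./plugin.wasm" also matches.
            if member.isfile() and member.name.split("/")[-1] == name:
                extracted = tar.extractfile(member)
                return extracted.read() if extracted else None
    return None


def build_layer(payload: bytes, name: str = "plugin.wasm") -> bytes:
    """Build a tar.gz layer in memory, only to exercise extract_wasm."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Round trip: pack a stand-in Wasm header into a layer and read it back.
layer = build_layer(b"\x00asm\x01\x00\x00\x00")
module = extract_wasm(layer)
```

Authentication and the registry requests themselves are out of scope for this sketch; it only covers the unpacking step after the bytes have arrived.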

zhaohuabing commented 7 months ago

Should HTTPS be similar to the current HTTP fetching and caching mechanism?

kyessenov commented 7 months ago

@zhaohuabing The current HTTP fetching in Wasm is somewhat broken, we warn people not to use it. If you use a cluster specifier, then HTTPS vs HTTP becomes a configuration in the upstream cluster, so yes, it should be similar.

zhaohuabing commented 7 months ago

> @zhaohuabing The current HTTP fetching in Wasm is somewhat broken, we warn people not to use it. If you use a cluster specifier, then HTTPS vs HTTP becomes a configuration in the upstream cluster, so yes, it should be similar.

@kyessenov Could you elaborate on what's broken in HTTP fetching? We plan to use HTTP Wasm code source in EG and want to avoid any known issues. Thanks!

kyessenov commented 7 months ago

> @kyessenov Could you elaborate on what's broken in HTTP fetching? We plan to use HTTP Wasm code source in EG and want to avoid any known issues. Thanks!

It crashes :) I don't have enough time to troubleshoot and fix it, but it needs to be fixed before using it in production.

zhaohuabing commented 7 months ago

After testing, I confirmed that the HTTPS scheme works for the Wasm HTTP code source. It seems that TLS is handled entirely by the Cluster configuration and does not require any handling within the Wasm filter.

https://github.com/zhaohuabing/playground/blob/2e2d0b49c36eda2a5282cbdeba482455b1b1109a/envoy/wasm/envoy.yaml#L28-L49

jewertow commented 3 weeks ago

@kyessenov I wanted to start working on support for the OCI scheme. Since this requires sending a few consecutive requests (GET manifests -> GET blobs -> GET layers), I wonder if I should follow the asynchronous pattern used in RemoteDataFetcher. That is, should I implement fetchers extending Http::AsyncClient::Callbacks for manifests, blobs and layers, and call them consecutively in onSuccess? WDYT?
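
The chained-request idea can be sketched in Python rather than Envoy C++. This is not the RemoteDataFetcher API: `OciWasmFetcher`, `make_fake_registry`, and the callback-style `fetch(path, on_success)` signature are hypothetical stand-ins. The `/v2/<name>/manifests/<reference>` and `/v2/<name>/blobs/<digest>` paths follow the OCI distribution API. The sketch assumes the Wasm payload is the first layer of the manifest, skips authentication and the separate config blob, and uses an in-memory fake registry so it runs without a network.

```python
import hashlib
import json
from typing import Callable

# Hypothetical callback-style client: fetch(path, on_success) eventually
# calls on_success(body). It loosely mirrors an onSuccess callback.
Fetch = Callable[[str, Callable[[bytes], None]], None]


class OciWasmFetcher:
    """Chains requests: each step is started from the previous on_success."""

    def __init__(self, fetch: Fetch, repository: str, reference: str,
                 on_done: Callable[[bytes], None]) -> None:
        self._fetch = fetch
        self._repository = repository
        self._reference = reference
        self._on_done = on_done
        self._digest = ""

    def start(self) -> None:
        # Step 1: GET the manifest for a tag or digest.
        path = f"/v2/{self._repository}/manifests/{self._reference}"
        self._fetch(path, self._on_manifest)

    def _on_manifest(self, body: bytes) -> None:
        manifest = json.loads(body)
        # Assumption: the Wasm payload is the first layer in the manifest.
        self._digest = manifest["layers"][0]["digest"]
        # Step 2: GET the layer blob by digest.
        path = f"/v2/{self._repository}/blobs/{self._digest}"
        self._fetch(path, self._on_layer)

    def _on_layer(self, body: bytes) -> None:
        # Verify the content against the digest announced in the manifest.
        algorithm, _, expected = self._digest.partition(":")
        if hashlib.new(algorithm, body).hexdigest() != expected:
            raise ValueError("layer digest mismatch")
        self._on_done(body)


def make_fake_registry(repository: str, tag: str, layer: bytes) -> Fetch:
    """In-memory stand-in for a registry, used only to run the sketch."""
    digest = "sha256:" + hashlib.sha256(layer).hexdigest()
    manifest = json.dumps(
        {"schemaVersion": 2, "layers": [{"digest": digest, "size": len(layer)}]}
    ).encode()
    routes = {
        f"/v2/{repository}/manifests/{tag}": manifest,
        f"/v2/{repository}/blobs/{digest}": layer,
    }

    def fetch(path: str, on_success: Callable[[bytes], None]) -> None:
        on_success(routes[path])

    return fetch


result = []
fake_fetch = make_fake_registry("example/plugin", "v1", b"fake layer bytes")
OciWasmFetcher(fake_fetch, "example/plugin", "v1", result.append).start()
```

A single object holding the intermediate state (here the digest) is one option; separate fetcher classes per request, as proposed above, would split the same steps across several callback implementations.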