Closed salaboy closed 1 year ago
Thanks for opening this. I think my first thought was something simple as getting a single wasm layer, similar to trivy. Later, and perhaps more important for the outbound binding, it could use more sophisticated layering, but I think we can start with just getting the binary and only that.
e.g. https://github.com/aquasecurity/trivy-module-wordpress https://github.com/aquasecurity/trivy/blob/main/pkg/module/command.go#L15
maybe to match how other CNCF projects like istio we can use the same field name and description https://istio.io/latest/docs/reference/config/proxy_extensions/wasm-plugin/
Field | Type | Description | Required |
---|---|---|---|
url | string | URL of a Wasm module or OCI container. If no scheme is present, defaults to oci://, referencing an OCI image. Other valid schemes are file:// for referencing .wasm module files present locally within the proxy container, and http[s]:// for .wasm module files hosted remotely. | No |
maybe to match how other CNCF projects like istio we can use the same field name and description https://istio.io/latest/docs/reference/config/proxy_extensions/wasm-plugin/
Field Type Description Required url string URL of a Wasm module or OCI container. If no scheme is present, defaults to oci://, referencing an OCI image. Other valid schemes are file:// for referencing .wasm module files present locally within the proxy container, and http[s]:// for .wasm module files hosted remotely. No
Agree.
TL;DR; We shouldn't follow istio as it isn't relevant to real world usage. We should not anchor on how it does proxy-wasm, rather assess what is relevant to real world usage. We only need to allow https for the short term, as that's the only main way wasm is distributed apart from local files.
While this issue makes sense, it is based on a faulty (if easy to fall into) assessment that the design of istio came from user demand (it didn't).
In practice, OCI is rarely used in istio. The main spec isn't a standard rather just one company's document written and not really followed. In the extremely rare cases where OCI is used publicly for proxy-wasm, it is using normal container image with a single file layer of the wasm binary. In the vast majority of cases, wasm is distributed by local files (which we do now), and next normal http fetch (which we don't yet).
I think we only need to support container image consumption IFF an end user asks for it, not a vendor.
Since Wasm hub is not as common used as docker hub, we can use HTTP to get single WASM file for now.
If OCI becomes widely used, we can support it, which is not late.
agree. any OCI repo (wasm hub and otherwise) is rarely used and could be supported in a URI scheme later. Let's focus on http fetch as it also includes no deps, so is basically only config code.
My 2p: using OCI isn't anything beneficial nowadays. In istio I think the idea was to reuse the pulling mechanism of Kubernetes but it is not like that (e.g. try to build your image locally and pull as a wasm plugin). Even companies doing service mesh (i know a couple for sure) are willing to pull the wasm binary itself and avoiding getting into a burden with OCI pulling matters.
@jcchavezs thanks for the feedback, and I meant to ask as it came up in kubecon.. someone said there's a limit of size for k8s configmap files? I suspect the wasm is already out of hand when that occurs, but it matters practically to know. Do you have an idea of what that is (or maybe @salaboy if you do?)
Configmap max size is 1mb. What people do to overcome that is this: https://github.com/corazawaf/coraza-proxy-wasm/pull/154 right now but if we provided a wasmfetcher in an init_container I believe we could pull all the binaries and inject into pods. I would be up for such a thing.
On Tue, 25 Apr 2023, 09:31 Crypt Keeper, @.***> wrote:
@jcchavezs https://github.com/jcchavezs thanks for the feedback, and I meant to ask as it came up in kubecon.. someone said there's a limit of size for k8s configmap files? I suspect the wasm is already out of hand when that occurs, but it matters practically to know. Do you have an idea of what that is (or maybe @salaboy https://github.com/salaboy if you do?)
— Reply to this email directly, view it on GitHub https://github.com/dapr/components-contrib/issues/2700#issuecomment-1521290009, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAXOYAQRSDGU5J6NMR2O5NTXC54U7ANCNFSM6AAAAAAWFJCZGY . You are receiving this because you were mentioned.Message ID: @.***>
right now, a wasm larger than a meg is probably a bug not a feature, except for really huge things like python. In any case, http or other options should be used for exactly this reason, it sounds..
I prefer lazy load rather than pull when init.
Another issue is that HTTP is hard to detect WASM changes, maybe we can add detect time interval to do it.
I prefer lazy load rather than pull when init.
Another issue is that HTTP is hard to detect WASM changes, maybe we can add detect time interval to do it.
well what I mean to say is that we can learn from Istio's implementation of http based fetch because even if not efficient I can tell it is in use. agree we won't get the SHA based stuff unless the other side supports ETag, but on the other hand building a solution that is more specific, but no-one uses, isn't helpful either.
let's do it easy then iteration.
I am renaming this to http fetch then :)
so here's the design proposal for HTTP (http:// or https:// urls)
The latter case is almost the same integration code as what OCI would need, just you don't need to use OCI to get the extra files. Note: istio also supports tar.gz, just it doesn't map any of the other files read-only to the guest.
Naive question: if we mount the ancillary files, what for will we be mounting? Will it be a virtual FS? Will wasm binary have access to itself?
@jcchavezs none of the questions were naive. In fact it took me days to figure out status quo on these topics, so glad you asked the questions I asked myself!
Naive question: if we mount the ancillary files, what for will we be mounting?
The only sensible mount is root imho as there's no configuration if you have only the tarball. This also cleanly maps to OCI which are mounting anything in layers are the root.
For example, a tar could include exactly the same as the python docker image like this, and it would be easy to say why.
$ ./car -tf ghcr.io/vmware-labs/python-wasm:3.11.3
python.wasm
usr/local/lib/python3.11/lib-dynload/.empty
usr/local/lib/python3.11/os.py
usr/local/lib/python311.zip
Will it be a virtual FS?
IMHO the first implementation should simply be in-memory virtual FS. It could be as simple as a fstest.MapFS
in implementation considering the component is alpha. Later, we can have separate PRs for more sophisticated stuff.
Will wasm binary have access to itself?
I can't in good faith implement that, so when I make the PR I will definitely remove it. I think crun are going to fix the OCI misunderstanding the same way. IOTW there's no reason for wasm to read (or write) itself, we are in control of the code, so we shouldn't introduce this bug.
@yaron2 sorry that issue didn't fix this. It changed the config to "url" but didn't add http fetcher yet
@yaron2 sorry that issue didn't fix this. It changed the config to "url" but didn't add http fetcher yet
It got closed automatically when the PR was merged. Reopened now
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.
Any reason to not to follows @codefromthecrypt's proposal? I think with a explicit consensus, implementation can follow.
@jcchavezs yep this is just awaiting implementation. I was waiting until after dapr 1.11 release to begin that.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.
still TODO
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as pinned, good first issue, help wanted or triaged/resolved. Thank you for your contributions.
Describe the feature
Today the WASM Middleware component can fetch warm files from a path: https://github.com/dapr/components-contrib/blob/master/middleware/http/wasm/httpwasm.go#L34
https://docs.dapr.io/reference/components-reference/supported-middleware/middleware-wasm/
But it will be great if we can use an OCI registry to fetch wasm files, so we can leverage package distribution, enabling developers to share their wasm files as container images.
I was looking at other projects, and something like the
oras-go
project can be used: https://github.com/oras-project/oras-go -> https://oras.land/client_libraries/0_go/Creating an image with a WASM file should be as simple as creating a Dockefile like this:
Ideas? Suggestions?
Release Note
RELEASE NOTE: