Closed: jsturtevant closed this 7 months ago
fyi @cpuguy83
I think we need to add some GC labels so containerd does not collect this content. See: https://github.com/containerd/containerd/blob/8459273f806e068e1a6bacfaf1355bbbad738d5e/docs/garbage-collection.md#garbage-collection-labels
From my experimenting, the GC looks to be a top-down model, so I've put the label that retains content at the image level: https://github.com/containerd/runwasi/pull/405/files#diff-74fa6f2b9f387ae0683336f3c04b7e15658dbb2c2820f74bd143894009593d1bR365-R368. If the image gets removed, this will be removed as well.
```rust
log::debug!("updating content with precompile digest to avoid garbage collection");
let mut image_content = self.get_info(digest.clone())?;
image_content.labels.insert(
    "containerd.io/gc.ref.content.precompile".to_string(),
    precompile_digest.clone(),
);
```
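For intuition, here is a minimal std-only sketch (not containerd's actual implementation) of the top-down model: a mark phase that follows `containerd.io/gc.ref.content.*` labels from root content, so the precompiled blob survives only while something references it.

```rust
use std::collections::{HashMap, HashSet};

/// Content store modeled as digest -> labels (label key -> value).
type Store = HashMap<String, HashMap<String, String>>;

/// Walk gc.ref labels from the roots; anything not reached is collectable.
fn reachable(store: &Store, roots: &[&str]) -> HashSet<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut stack: Vec<String> = roots.iter().map(|r| r.to_string()).collect();
    while let Some(digest) = stack.pop() {
        if !seen.insert(digest.clone()) {
            continue;
        }
        if let Some(labels) = store.get(&digest) {
            for (key, target) in labels {
                // Any label of the form containerd.io/gc.ref.content.<name>
                // keeps the referenced digest alive.
                if key.starts_with("containerd.io/gc.ref.content.") {
                    stack.push(target.clone());
                }
            }
        }
    }
    seen
}

fn main() {
    let mut store: Store = HashMap::new();
    let mut manifest_labels = HashMap::new();
    manifest_labels.insert(
        "containerd.io/gc.ref.content.precompile".to_string(),
        "sha256:precompiled".to_string(),
    );
    store.insert("sha256:manifest".to_string(), manifest_labels);
    store.insert("sha256:precompiled".to_string(), HashMap::new());

    let alive = reachable(&store, &["sha256:manifest"]);
    // The precompiled blob survives only because the manifest references it.
    assert!(alive.contains("sha256:precompiled"));
    println!("alive: {}", alive.len());
}
```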
How do we invalidate (and de-reference in containerd) pre-compiled modules? As I understand it a pre-compiled module will not work across different versions of a runtime.
Also we likely need to annotate what runtime the content is for.
I did this here, but as you point out we might need to do this per version as well. That way, if a runtime is upgraded it would re-compute.
```rust
let label = format!("runwasi.io/precompiled/{}", T::name());
```
For dereferencing, do you think it is enough to wait for the image to be removed, which would drop the couple of versions (if there are any)? I don't think it will be too much space to have a couple of versions sitting around vs. the complexity of dereferencing them on an ongoing basis. For example, during an upgrade of the shim I think there might be a lot of edge cases.
Also, when writing the content we'll need to attach a lease to it until the gc labels are applied, otherwise containerd will collect it.
Ah, I was wondering about that! There did seem to be a small moment where it could be cleaned up before I put the label on the image. I will take a look at putting a lease on it.
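The lease-then-label dance can be sketched with a toy store (the types and methods below are mocks, not the containerd client API): content written under a lease survives a GC run that happens before the gc label is applied.

```rust
use std::collections::{HashMap, HashSet};

/// Toy content store: digest -> labels, plus a set of lease-protected digests.
#[derive(Default)]
struct MockStore {
    content: HashMap<String, HashMap<String, String>>,
    leased: HashSet<String>,
}

impl MockStore {
    fn write(&mut self, digest: &str) {
        self.content.insert(digest.to_string(), HashMap::new());
    }
    fn lease(&mut self, digest: &str) {
        self.leased.insert(digest.to_string());
    }
    fn release(&mut self, digest: &str) {
        self.leased.remove(digest);
    }
    /// GC removes content that is neither leased nor referenced by a gc label
    /// elsewhere (the caller passes the referenced set, simplified here).
    fn gc(&mut self, referenced: &HashSet<String>) {
        let leased = self.leased.clone();
        self.content
            .retain(|digest, _| leased.contains(digest) || referenced.contains(digest));
    }
}

fn main() {
    let mut store = MockStore::default();
    let mut referenced: HashSet<String> = HashSet::new();

    // 1. Write the precompiled blob under a lease so a GC run can't race us.
    store.write("sha256:precompiled");
    store.lease("sha256:precompiled");
    store.gc(&referenced); // would collect it without the lease
    assert!(store.content.contains_key("sha256:precompiled"));

    // 2. Point the image manifest at the blob via a gc.ref label, then drop the lease.
    referenced.insert("sha256:precompiled".to_string());
    store.release("sha256:precompiled");
    store.gc(&referenced);
    assert!(store.content.contains_key("sha256:precompiled"));
    println!("blob survived");
}
```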
@jsturtevant Started reviewing this PR. How do I run this to see if the modules are stored?
The shim logs will have statements, and you will also be able to verify in containerd using `ctr`:
Find the compiled info:
```shell
$ sudo ctr i ls | grep "runwasi.io"
ghcr.io/containerd/runwasi/wasi-demo-oci:latest application/vnd.oci.image.manifest.v1+json
sha256:60fccd77070dfeb682a1ebc742e9d677fc452b30a6b99188b081c968992394ce 2.4 MiB wasi/wasm
runwasi.io/precompiled=sha256:b36753ab5a46f26f6bedb81b8a7b489cede8fc7386f1398706782e225fd0a98e
```
See it in the content store:
```shell
$ sudo ctr content ls | grep "b36753ab5a46f26f6bedb81b8a7b489cede8fc7386f139870"
sha256:60fccd77070dfeb682a1ebc742e9d677fc452b30a6b99188b081c968992394ce 561B 2 months containerd.io/gc.ref.content.0=sha256:a3c18cd551d54d3cfbf67acc9e8f7ef5761e76827fe7c1ae163fca0193be88b3,containerd.io/gc.ref.content.config=sha256:85b7f2b562fe8665ec9d9e6d47ab0b24e2315627f5f558d298475c4038d71e8b,containerd.io/gc.ref.content.precompile=sha256:b36753ab5a46f26f6bedb81b8a7b489cede8fc7386f1398706782e225fd0a98e
sha256:b36753ab5a46f26f6bedb81b8a7b489cede8fc7386f1398706782e225fd0a98e 626.4kB 3 days runwasi.io/precompiled=sha256:60fccd77070dfeb682a1ebc742e9d677fc452b30a6b99188b081c968992394ce
```
Save it to a local file:

```shell
$ ctr content get sha256:b36753ab5a46f26f6bedb81b8a7b489cede8fc7386f1398706782e225fd0a98e > precompiled.wasm
```
> Also, when writing the content we'll need to attach a lease to it until gc labels are applied otherwise containerd will collect it.
>
> Ah, I was wondering about that! There did seem to be a small moment where it could be cleaned up before I put the label on the image. I will take a look at putting a lease on it.
Added a lease during new content creation
Another question I had was if this is something we want to turn on for all images (as the code is now) or if we want to put an annotation on the image that would enable this feature.
@CaptainVincent does WasmEdge Rust SDK have an API to AOT compile a Wasm module to bytes that's similar to https://github.com/bytecodealliance/wasmtime/blob/7690c500228ae005fc70ee3b6297bb4b403686d5/crates/wasmtime/src/engine.rs#L222-L223?
> Another question I had was if this is something we want to turn on for all images (as the code is now) or if we want to put an annotation on the image that would enable this feature.
I've thought on this a bit more and I think it should be an opt-in feature, so I will add support for an annotation.
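A sketch of what the opt-in check could look like, assuming a hypothetical `runwasi.io/precompile` annotation (the annotation name here is made up for illustration):

```rust
use std::collections::HashMap;

/// Hypothetical opt-in check: precompile only when the image carries the
/// (made-up for this sketch) annotation runwasi.io/precompile=enabled.
fn precompile_enabled(annotations: &HashMap<String, String>) -> bool {
    annotations
        .get("runwasi.io/precompile")
        .map(|v| v == "enabled")
        .unwrap_or(false)
}

fn main() {
    let mut annotations = HashMap::new();
    assert!(!precompile_enabled(&annotations)); // default: off
    annotations.insert("runwasi.io/precompile".to_string(), "enabled".to_string());
    assert!(precompile_enabled(&annotations));
    println!("opt-in works");
}
```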
I don't think an annotation on the image would work well. A runtime annotation could work, but... I'm curious why you think it should be opt-in? I can see the desire to opt in for now because it's kind of experimental, but then the whole stack is.
We must compile the wasm to run it at all, so we already need to store it. Keeping it cached so the same wasm isn't compiled more than once is just an optimization.
Yes. Those lines of the code provide two methods, bytes-to-file and file-to-file. If needed, I can also add bytes-to-bytes inside wasmedge-rust-sdk or the integration layer.
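One way the integration layer could provide bytes-to-bytes on top of the existing file-to-file entry point is a temp-file adapter; `compile_file` below is a stand-in closure, not the real SDK function.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Adapt a file-to-file AOT compiler (like the one exposed today) into a
/// bytes-to-bytes API at the integration layer, via temporary files.
/// `compile_file` is a hypothetical stand-in for the SDK's entry point.
fn compile_bytes(
    wasm: &[u8],
    compile_file: impl Fn(&Path, &Path) -> io::Result<()>,
) -> io::Result<Vec<u8>> {
    let dir = std::env::temp_dir();
    let input = dir.join("module.wasm");
    let output = dir.join("module.aot");
    fs::write(&input, wasm)?;
    compile_file(&input, &output)?;
    let aot = fs::read(&output)?;
    // Best-effort cleanup of the scratch files.
    let _ = fs::remove_file(&input);
    let _ = fs::remove_file(&output);
    Ok(aot)
}

fn main() -> io::Result<()> {
    // Use a trivial copy as the "compiler" just to exercise the adapter.
    let aot = compile_bytes(b"\0asm", |i, o| fs::copy(i, o).map(|_| ()))?;
    assert_eq!(aot, b"\0asm");
    println!("ok");
    Ok(())
}
```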
> @CaptainVincent does WasmEdge Rust SDK have an API to AOT compile a Wasm module to bytes that's similar to https://github.com/bytecodealliance/wasmtime/blob/7690c500228ae005fc70ee3b6297bb4b403686d5/crates/wasmtime/src/engine.rs#L222-L223?
> I can see the desire to opt in for now because it's kind of experimental, but then the whole stack is.
>
> We must compile the wasm to run it at all, so we already need to store it. Keeping it cached so the same wasm isn't compiled more than once is just an optimization.
I was leaning towards it being experimental, but I can keep it as is.
Honestly, it took me some time to grasp the interaction between containerd and the sandbox for saving/retrieving precompiled content using leases, and I am still not 100% sure of the full picture of the interactions. I would appreciate it if you could clarify the logic here (maybe document it somewhere?)
good idea! I've added some documentation, you can see the rendered page by viewing the markdown: i.e: https://github.com/containerd/runwasi/blob/0135abace198e39a1f4f6aae6ae74605619e42b2/docs/oci-descision-flow.md
Tested this out.
> - This doesn't work with Docker. Not your fault since Docker is not filling in `container.Image`, which is how you are traversing the content. I have a patch prepared (for docker) to remedy this, but it would only work when docker is using containerd-backed storage.
Thanks!
> `container.Image` is not the best data to use... we should add a new field to the containerd protos (in containerd/containerd) for an immutable reference to the image manifest being used... for now we'd need to work with what we've got
Created an issue (#460) to look into this and possibly use something from the container's snapshot as you suggested offline. I think we can improve the overall design in the way containerd passes Wasm info, but that is waiting on the discussions in the CNCF wg-wasm in regards to an official OCI Artifact for WASI.
> Removing the pre-compiled content seems to make the container refuse to ever start until we wipe out the image and start over.
This is a bug! I will fix this and look at adding tests to cover this scenario more.
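The fix presumably needs a fallback path rather than a hard failure when the blob is gone; here is a minimal sketch, with `fetch_precompiled` and `compile` as hypothetical stand-ins for the real lookup and compile steps.

```rust
/// Sketch of the fallback discussed above: if the precompiled blob was
/// garbage-collected (or never written), recompile instead of failing.
/// `fetch_precompiled` and `compile` are hypothetical stand-ins.
fn load_module(
    fetch_precompiled: impl Fn() -> Option<Vec<u8>>,
    compile: impl Fn() -> Vec<u8>,
) -> Vec<u8> {
    match fetch_precompiled() {
        Some(bytes) => bytes, // cache hit: use the stored AOT artifact
        None => compile(),    // cache miss: fall back to compiling
    }
}

fn main() {
    // Cache hit path.
    assert_eq!(load_module(|| Some(vec![1]), || vec![2]), vec![1]);
    // Blob was removed: must not error, recompile instead.
    assert_eq!(load_module(|| None, || vec![2]), vec![2]);
    println!("fallback ok");
}
```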
> On the Docker usage, I've noticed some peculiar things with runwasi. The first container start is quick (considering it is compiling the content). Then subsequent starts of the same container are actually slower. Removing the container in docker and recreating gives a faster startup (even faster since it has the pre-compiled content) and then the 2nd start again slows down (by an order of magnitude here).
This is interesting, since I am seeing an improvement in speed with the changes when using ctr (https://github.com/containerd/runwasi/pull/405#discussion_r1465444917). Does the slow behavior for the second start replicate with the non-OCI image?
You might want to include `Engine::precompile_compatibility_hash` in the cache key; I added this method for precisely this purpose 🙂.
Precompiled binaries are sensitive not just to runtime version but also CPU and engine config features. It doesn't seem entirely implausible that - for example - the content store could be migrated to a new host with different CPU features and then you'd have a very gnarly performance regression to debug...
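Here is a sketch of folding all of those inputs into the cache label, using std's hasher as a stand-in for `Engine::precompile_compatibility_hash` (the real method already covers engine config and CPU features):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Sketch of a cache label that invalidates when anything affecting
/// precompiled output changes: runtime name, runtime version, and an
/// engine/CPU compatibility hash (stand-in for
/// Engine::precompile_compatibility_hash).
fn cache_label(runtime: &str, version: &str, compat: u64) -> String {
    let mut h = DefaultHasher::new();
    runtime.hash(&mut h);
    version.hash(&mut h);
    compat.hash(&mut h);
    format!("runwasi.io/precompiled/{}/{:x}", runtime, h.finish())
}

fn main() {
    let a = cache_label("wasmtime", "17.0.0", 0xabc);
    let b = cache_label("wasmtime", "17.0.1", 0xabc); // version bump
    let c = cache_label("wasmtime", "17.0.0", 0xdef); // different CPU features
    assert_ne!(a, b);
    assert_ne!(a, c);
    println!("labels differ as expected");
}
```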
Quick update: when I added the tests, it seems to have exposed a possible bug which I am debugging now. Essentially the OCI tests run fine, but when run together with the other tests, tests hang and fail if run after an OCI test. We create a new engine each time, so I am not really clear on what is causing the issue yet.
> Quick update: when I added the tests, it seems to have exposed a possible bug which I am debugging now. Essentially the OCI tests run fine, but when run together with the other tests, tests hang and fail if run after an OCI test. We create a new engine each time, so I am not really clear on what is causing the issue yet.
I tracked this down to the call to parallelize the compile step here: https://github.com/bytecodealliance/wasmtime/blob/b6a8abc6ce31cf6db7ad5edb9b2567620190ce35/crates/wasmtime/src/engine.rs#L142. The engine should be usable across threads, and each test should be creating a new engine, so I don't know why this happens.
> Quick update: when I added the tests, it seems to have exposed a possible bug which I am debugging now. Essentially the OCI tests run fine, but when run together with the other tests, tests hang and fail if run after an OCI test. We create a new engine each time, so I am not really clear on what is causing the issue yet.
>
> I tracked this down to the call to parallelize the compile step here. The engine should be usable across threads and each test should be creating a new engine, so I don't know why this happens. https://github.com/bytecodealliance/wasmtime/blob/b6a8abc6ce31cf6db7ad5edb9b2567620190ce35/crates/wasmtime/src/engine.rs#L142
Did we turn on the "parallel-compilation" feature of wasmtime?
> Did we turn on the "parallel-compilation" feature of wasmtime?
Yes, we turn that feature on: https://github.com/containerd/runwasi/blob/ea8c04f5d8073119d6b38f360311a8ec52dc9911/crates/containerd-shim-wasmtime/Cargo.toml#L23. It looks to be available on my machine as well as on CI.
My next step is to make a small/simpler reproducible example
I was able to create a simple repo: https://github.com/jsturtevant/rayon-clone-issue/blob/master/src/main.rs. After adding the unit tests I am effectively running into https://github.com/containerd/runwasi/issues/357 which is also called out in https://github.com/rayon-rs/rayon/issues/708#issuecomment-555699090
> You might want to include `Engine::precompile_compatibility_hash` in the cache key; I added this method for precisely this purpose 🙂.
>
> Precompiled binaries are sensitive not just to runtime version but also CPU and engine config features. It doesn't seem entirely implausible that - for example - the content store could be migrated to a new host with different CPU features and then you'd have a very gnarly performance regression to debug...
I've updated wasmtime to use this hash for the label
> I was able to create a simple repo: https://github.com/jsturtevant/rayon-clone-issue/blob/master/src/main.rs. After adding the unit tests I am effectively running into #357 which is also called out in rayon-rs/rayon#708 (comment)
I've added the ability to disable parallel compilation for the tests. We won't run into this issue as-is in the shim, since each shim is its own instance, but we need to address it longer term with #357.
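The sequential-vs-parallel switch can be sketched with plain threads (this mirrors the shape of the change, not wasmtime's actual `parallel-compilation` machinery):

```rust
use std::thread;

/// Sketch: compile a batch of functions either in parallel or sequentially.
/// In the shim, parallelism is the default; tests can force sequential mode
/// to avoid the fork-vs-threadpool issue described above.
fn compile_all(funcs: Vec<u32>, parallel: bool) -> Vec<u32> {
    let work = |f: u32| f * 2; // stand-in for compiling one function
    if parallel {
        let handles: Vec<_> = funcs
            .into_iter()
            .map(|f| thread::spawn(move || work(f)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    } else {
        funcs.into_iter().map(work).collect()
    }
}

fn main() {
    assert_eq!(compile_all(vec![1, 2, 3], true), vec![2, 4, 6]);
    assert_eq!(compile_all(vec![1, 2, 3], false), vec![2, 4, 6]);
    println!("both modes agree");
}
```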
test failure was #423
@cpuguy83 I've addressed the feedback, could you take another look? Thanks!
🥳 Awesome! I am going to do a new release
this would replace #44