Open mreferre opened 2 years ago
Thanks for the issue and discussion points, @mreferre. Indeed, as you've seen (https://github.com/fermyon/installer/issues/45 and others), we're continuing to explore approaches that lower the barrier of entry for running the Fermyon platform. Wanted to chime in with a little background and then hopefully the discussion can start to grow around the thoughts/suggestions you've mentioned.
I don't know if this is a choice you have made for initial practical reasons or for more strategic reasons in terms of where you want to bring the tech.
A little bit of both. Nomad, being our scheduler of choice for the platform, is intended to run directly on the host. In adding the various other moving bits on top of this foundation (i.e. Bindle, Traefik, Hippo), all in the form of Nomad jobs, we stuck with the same approach (either installing them onto the host directly or having Nomad fetch binaries as needed). There is definitely a strategic reason, at least in one main area, for this approach as well: to harness the core benefits that we hope to deliver with the platform (speed, efficiency, scale-to-zero), we intend for the Spin apps themselves to run on the host (i.e. on a registered Nomad worker) directly.
All of that being said, we're definitely open to exploring ways in which containerization might help users wishing to run the platform have a better go at it. Although the effort to run Nomad itself in a container doesn't appear immediately trivial, the other platform infra components (Bindle, Traefik, Hippo) and perhaps even Spin itself (i.e. the spin process running a given app) could certainly be eligible, as long as the platform (and the value it provides) isn't adversely affected by the introduced complexity and/or overhead of containerization.
Thanks @vdice for the background. I love this discussion.
I should have made it clearer that my appetite for using containers in this very specific use case is very much geared towards a better packaging mechanism and not so much towards "containers for microservices". In other words, I see containers being used here more like "glorified VMs" (there, I said it!). It just feels to me that attacking your operational challenges discussed above with OCI images and Dockerfiles is easier than doing so with AMIs and user_data (I am trivializing it). So yes, in a way it's about introducing another compute abstraction (containers on top of VMs), but I sense that the outcome is net positive, and that the risk that "the platform (and the value it provides) is adversely affected by the introduced complexity and/or overhead of containerization" is minimal and/or dwarfed by the operational benefits containerization can introduce.
What I was envisioning was containerizing Bindle, Traefik, Hippo, Spin and every other process you would need to run, so that these containers can be scheduled just like you schedule the binaries on the EC2 instances today.
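Just to make that concrete, for a local setup I am picturing something along these lines (the Bindle/Hippo/Spin image names below are hypothetical since, as far as I know, no official images exist; the ports are purely illustrative and each container would obviously still need proper configuration):
docker run -d --name traefik -p 80:80 traefik                      # official Traefik image; still needs config
docker run -d --name bindle -p 8080:8080 example/bindle-server     # hypothetical image, illustrative port
docker run -d --name hippo -p 5309:5309 example/hippo               # hypothetical image, illustrative port
docker run -d --name spin-app -p 3000:3000 example/spin-app         # hypothetical image wrapping "spin up"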
In the past few weeks I have started to play with Spin (prior to the platform announcement) and after playing with a VM deployment I found it so much easier to package everything into a container and use it that way. This is the Dockerfile I came up with (far from being optimized):
FROM ubuntu:latest
# this ARG is needed so that installing cmake does not prompt for a geographic area (tzdata)
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y curl jq git cmake zip
SHELL ["/bin/bash", "-c"]
RUN LATEST=$(curl -s https://api.github.com/repos/fermyon/spin/releases/latest) \
&& X86URL=$(echo $LATEST | jq -r '.assets[].browser_download_url' | grep linux-amd64.tar.gz) \
&& X86ARTIFACT=$(echo $LATEST | jq -r '.assets[].name' | grep linux-amd64.tar.gz) \
&& curl -L -O $X86URL \
&& tar -zxvf $X86ARTIFACT \
&& mv spin /usr/local/bin/spin
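# install the Rust toolchain and the wasm32-wasi target so the image can also build Spin apps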
RUN curl https://sh.rustup.rs -sSf | bash -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
RUN rustup target add wasm32-wasi
EXPOSE 3000
# this assumes that files/artifacts are mounted on the /wasm folder
CMD spin up --listen 0.0.0.0:3000 --file /wasm/spin.toml
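For reference, building and running this image locally could look something like this (the image tag and the local folder are just placeholders):
docker build -t spin-poc .
docker run --rm -p 3000:3000 -v "$(pwd)/wasm:/wasm" spin-poc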
This is a super small PoC I put together off the back of the above container, where I launch a couple of AWS Fargate tasks connected to a shared EFS volume mounted on /wasm/.
The CodeBuild project just builds the Wasm artifacts and places them in a shared EFS volume that the two Spin containers pick up. By the way, this is why I asked for hot reload support in Spin here (I'd like to update the application without having to restart Spin - although if I use Bindle/Hippo I may not need that, because of the way applications get updated) and for Arm support here (Fargate ships with support for Linux/x86, Linux/Graviton and Windows/x86).
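For context, the build step itself is nothing fancy; conceptually it boils down to something like this (the project name and paths are placeholders for whatever the app actually is):
# build the Spin app to a Wasm artifact and copy it, together with its spin.toml, to the shared EFS mount
cargo build --target wasm32-wasi --release
cp target/wasm32-wasi/release/myapp.wasm /wasm/
cp spin.toml /wasm/   # its component "source" field must point at the copied myapp.wasm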
When I saw the Fermyon platform announcement, I noticed operational issues similar to the ones I had observed with Spin. I intended to play around with the individual pieces and containerize them just like I did with Spin, but I haven't had time to do so just yet (it also feels a bit more challenging to stitch together 4-5 pieces rather than just one, and I did not want to end up in an "experiment cul-de-sac" if you were not open to going down this route).
I would really be happy to collaborate and build a PoC together on the scenario above (if you feel like it).
@mreferre very cool, thank you for sharing!
What I was envisioning was containerizing Bindle, Traefik, Hippo, Spin
I think images for these would definitely prove useful to developers more comfortable with the container workflow (and/or as an alternative to setting up their host machine directly). As you've mentioned, the applicable Nomad jobs can be updated to use the docker driver. If this provides a convenient (and/or more approachable) way for users to test out and try the platform, that will be excellent! It might also be a good starting point for exploring how the platform runs/performs using this technology.
We're definitely always curious to learn about the creative ways engineers come up with to use and run Spin, such as the intriguing AWS setup you've illustrated. If you'd be up for it, I bet it would be a welcome demo in the Spin Developer's Meeting!
Cool. Thanks @vdice. Let me see if/how I can put something together. I take it there will be someone able to answer questions if I hit a roadblock along the way :)
@vdice I was finally able to build this science project (with a slightly expanded scope to include Lambda). I should actually add a small section on how to run the same container locally.
I am going to disguise a proposal in a question. It looks like you are choosing a VM (or a laptop OS) as the compute unit for running the Fermyon platform. I don't know if this is a choice you have made for initial practical reasons or for more strategic reasons in terms of where you want to bring the tech. However, the more I play with Wasm (and Spin, and now the Fermyon platform to an extent), the more I find it way more convenient to package the compute unit as a container image (and execute it as a container). First and foremost, there appear to be dependencies with things like spin that are easier to solve with a container than with a VM (e.g. I can't run spin on AL2 due to the glibc_2.27 requirement). There also seem to be practical requirements (https://github.com/fermyon/installer/issues/45) that would be easily resolved with a container sandbox. A containerized setup would also provide "for free" a modern horizontal scaling mechanism (https://github.com/fermyon/installer/issues/63). Finally, a containerized deployment would give the user more options for deploying the Fermyon platform (docker compose locally, ECS/Fargate on AWS, Nomad for a more infrastructure-agnostic deployment mechanism - Nomad supports containers in addition to Linux processes, so you don't necessarily need to ditch Nomad from your stack).
Right now you seem to have a series of bash scripts you run to install locally (or as user data for setting up on EC2). What I am thinking about would be to turn all that logic into a dockerfile to build images that you could then (way more) easily instantiate locally or in the cloud with minimal IaC differences. In the past few weeks I was able to build a small PoC of a containerized version of Spin running on ECS/Fargate that worked very well, and there would be no reason why one couldn't run the whole Fermyon platform on the same stack if there is that appetite. Everything I have said would also apply to K8s (but I am not sure if K8s is the platform you see Fermyon platform users wanting to deploy on).
Is this move from a VM compute unit to a container compute unit something that you are considering? Or does it clash with how you see Wasm evolving over time, where the VM (the entire OS) should be the shell where you run Wasm code? There is quite a bit of philosophy in this question/proposal, I know :)