JohnTigue opened 1 year ago
Spinning up JupyterLab on EC2 should be easy enough: How to Connect with Jupyter Server Running on AWS EC2. The port forwarding via SSH is clever and cute. But we'll just punch a hole through the firewall ("Security Group").
Got that running at http://54.203.116.198:8888/lab.
Was able to get to a terminal and curl models. Can also control `docker compose up`.
We can script SD via Jupyter notebooks. For example, we can come up with a test that runs, say, 20 prompts through an SD pipeline and shows the results in an image grid. A good way for us to see what changes to a model do on a consistent test data set. "Reproducible research" is what the cool kids in science class call it.
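The grid-test idea can be sketched with a small Pillow helper. The solid-color placeholders below stand in for pipeline output; in a real run, `images` would be filled from an SD pipeline (e.g. one prompt per image), which is not set up here.

```python
from PIL import Image

def image_grid(images, cols):
    """Tile a list of equally sized PIL images into a cols-wide grid."""
    rows = (len(images) + cols - 1) // cols
    w, h = images[0].size
    grid = Image.new("RGB", (cols * w, rows * h))
    for i, img in enumerate(images):
        grid.paste(img, ((i % cols) * w, (i // cols) * h))
    return grid

# Placeholder "generations" -- in practice these come from the SD pipeline.
images = [Image.new("RGB", (64, 64), (i * 12, 0, 0)) for i in range(20)]
grid = image_grid(images, cols=5)
print(grid.size)  # (320, 256)
```

Run with a fixed seed and prompt list, the same grid re-rendered after a model change is exactly the "consistent test data set" comparison described above.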
I'll have to check if there are new tools. It's been almost two years since I was messing with running notebooks. This was king for a while: nbRunner.
I already have JupyterLab running but I want to get it cohabitating with InvokeAI inside ECS. This jupyter-ecs-service looks like nice work. I haven't used CDK yet but this might just be the use case that gets me to pull that trigger: jupyter-ecs-service.
This article, Deploy and run a Jupyter Lab server using Docker on AWS, is interesting in that JupyterLab is run on its own server and a private network is used to talk to other instances in the cluster. I had been trying to put everything on one instance (so they shared the same FS which contains the models and gens) but that is also why it was hard to isolate Auto1111 as the troublemaker. Perhaps I should keep separating all the players. This would require upping the Storage (#66) machinery… but it feels right.
Actually, separating different services into different containers is not only architecturally cleaner (and in so doing prepares for the potential of splitting into two clusters connected by a message queue), it would also make it much easier to develop locally, if there were a mock for the GPU machine. Actually, SD can run in CPU-only mode, which is slow but would be sufficient for quick dev/test.
AWS is the only deploy target we are currently concerned with. This allows less work because we avoid having to build out "generic, heavy lifting" code. This plan is not changing at this time.
But IF BrainTrust were to become an open source project, folks would want to deploy it on, say, Kubernetes. In the article, Deploy and run a Jupyter Lab server using Docker on AWS, things like nginx for load balancing are introduced. We are currently using AWS's ELB for that.
For the Workstation (kind of a single-user instance, but in the cloud) it is sufficient to just launch a single JupyterLab instance. But if this does actually get to a v2.x.x architecture of a GPU render cluster accessed via a message queue, or if the team simply needs more individual user isolation, JupyterHub would be more industrial. That would be the cloud-provider-independent implementation. For some of that benefit but less work, SageMaker might be a good call.
(There is mounting evidence that the v2.x.x architecture of a GPU render cluster accessed via a message queue is really the way to go. Still trying to avoid doing that work up front. As long as it is just an internal tool… v1.x.x (Workstation) is, grumble-grumble, acceptable.)
Shoot, another use case of the render cluster would be to service a JupyterHub/SageMaker Studio deploy. And, again, if we took leadership of an open source API (essentially standardizing the existing codebases) then it could be made internally multi-user. Man, the more I think about it, the more BrainTrust could really help Edam implement his strategy, and he's supposedly kicking down money to bootstrap an open source ecosystem around the StabilityAI world.
But that's not possible with the v1 Workstation architecture :(
Drat. The simple solution for jacking JupyterLab into Docker turns out to be single user only. So, the hack version is to have JupyterLab and Invoke in the same container? Phooey.
Yet another reason for the v2.x.x architecture involving a message queue. Oy.
Whelp, that is still a workable v1.x.x, just $$$ wasteful.
Renaming this issue from "JupyterLab" to "Jupyter" because the "Lab" is really only for one person; the "Hub" is exactly what they intended for a use case like ours. Unlike the hacky "can't we all just get along" sharing of the SD web-ui, this is worse: we'd all be sharing the same front end and the same backend Python engine, which doesn't work. C'est la vie.
Once again this is begging for the v2 architecture. The JupyterHub containers could run on cheap VMs, and send off requests into the message queue for the render cluster to process.
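The notebooks-on-cheap-VMs, renders-on-GPUs split can be sketched with stdlib primitives; here `queue.Queue` stands in for the real message queue (e.g. SQS) and the worker function stands in for the render cluster, both of which are assumptions, not existing code.

```python
import queue
import threading

render_queue = queue.Queue()
results = {}

def worker():
    # Stand-in for the GPU render cluster: pull jobs, "render", record results.
    while True:
        job = render_queue.get()
        if job is None:
            break
        results[job["id"]] = f"rendered:{job['prompt']}"
        render_queue.task_done()

t = threading.Thread(target=worker)
t.start()

# Notebook/JupyterHub side: submit jobs without holding a GPU.
for i, prompt in enumerate(["a castle", "a fox"]):
    render_queue.put({"id": i, "prompt": prompt})

render_queue.join()       # wait for all submitted jobs to finish
render_queue.put(None)    # sentinel: shut the worker down
t.join()
print(results)  # {0: 'rendered:a castle', 1: 'rendered:a fox'}
```

The notebook containers only ever touch the queue, which is what lets them run on cheap non-GPU instances.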
But for now, since I'm the only one using Jupyter, Lab is good enough for experimenting with moving Colab notebooks to AWS. If we get to where multiple notebooks encapsulate workflows we develop, then we really need to step up to JupyterHub. But Lab, not Hub, is good enough for the super short term.
An example of the above would be a model training process. We'll see about that soon enough. Looking forward to catching up with Anders. I've contributed nothing to his work except point at the gDrive where the 3D models are. I have no idea how he's doing training. First experiments don't need Jupyter for automation, yet.
This is actually a cool idea for how to implement one part of the v2.x.x architecture. This would be a DIY multi-user JupyterHub environment with the least admin hassle on AWS: Serverless Jupyter Hub with AWS Fargate and CDK. Of course, perhaps one should just use SageMaker at that point. Neither solution is relevant to v1.x.x, though, which is really just SD desktops running in the cloud…
But if we were to open source BrainTrust…
Third time I've seen this sort of thing. An old-school drawing tool, side by side with SD output, and synced somehow (in this case by Jupyter): Who needs Photoshop anyway?? MS Paint + SD AUTOMATIC1111 API + JUPYTER.
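The notebook half of that MS Paint + Auto1111 trick is just HTTP calls to the web-ui's API. A minimal sketch, assuming a local instance launched with `--api`; the URL and the exact payload fields beyond `prompt`/`steps`/`width`/`height` are assumptions about that deploy, not something verified here.

```python
import base64
import json
from urllib import request

A1111_URL = "http://127.0.0.1:7860"  # hypothetical local Auto1111 with --api

def txt2img_payload(prompt, steps=20, width=512, height=512):
    """Build the JSON body for Auto1111's /sdapi/v1/txt2img endpoint."""
    return {"prompt": prompt, "steps": steps, "width": width, "height": height}

def txt2img(prompt, **kw):
    """POST the prompt; return raw PNG bytes of the first returned image."""
    body = json.dumps(txt2img_payload(prompt, **kw)).encode()
    req = request.Request(f"{A1111_URL}/sdapi/v1/txt2img", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        images = json.load(resp)["images"]  # base64-encoded PNGs
    return base64.b64decode(images[0])

payload = txt2img_payload("MS Paint doodle of a cat, as an oil painting")
print(payload["steps"])  # 20
```

The "synced" part is then just a notebook loop: watch the Paint export file, re-submit it (via img2img rather than txt2img) whenever it changes.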
Seems there are lots of Jupyter notebooks for making videos. Lots of folks even use Colab for free to pull it off. For example, published last month is A Free And Easy Way To Make AI Videos With Stable Diffusion.
For reboots, would be nice to have the container start on reboot, a la:

```shell
sudo docker run -d --restart unless-stopped -p 8888:8888 jupyter/scipy-notebook
```

(The `--restart unless-stopped` policy is what actually makes Docker bring the container back after a reboot.) Drop the `sudo`, natch.
It's just a half-assed solution, but enabling a password will be easier to use than the rando token set-up currently running. (The token changes every reboot.)
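A sketch of the config-file version of that, so the password survives reboots. The trait names follow the Jupyter Server config schema (older/classic installs use `c.NotebookApp.password` instead), and the hash shown is a placeholder, not a real one.

```python
# jupyter_server_config.py -- `c` is provided by Jupyter's config loader.
# Generate the hash once with, depending on version:
#   python -c "from jupyter_server.auth import passwd; print(passwd())"
c.ServerApp.password = "argon2:$argon2id$..."  # placeholder hash
c.ServerApp.token = ""  # disable the random per-boot token
```

With this baked into the image (see the `COPY ... /etc/jupyter/` provision below in the base image), the login stays stable across container restarts.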
Oh, why, hello there: jupyter/base-notebook
Don't mind if I do.
That does:

```dockerfile
# Install Jupyter Notebook, Lab, and Hub
```

And has a provision for a config file:

```dockerfile
COPY jupyter_notebook_config.py /etc/jupyter/
```
Ah, it's "Base image for Jupyter Notebook stacks from https://github.com/jupyter/docker-stacks"
> Core Stacks: The Jupyter team maintains a set of Docker image definitions in the jupyter/docker-stacks GitHub repository. The following sections describe these images, including their contents, relationships, and versioning strategy.
Nice:
Options for a self-signed HTTPS certificate
I've been using jupyter/scipy-notebook from my neuroscience work, but it sounds like that's overkill; we just need jupyter/minimal-notebook.
> To run the notebook server with a self-signed certificate, pass the --secure option to the up.sh script. You must also provide a password, which will be used to secure the notebook server. You can specify the password by setting the PASSWORD environment variable, or by passing it to the up.sh script.
```shell
PASSWORD=a_secret notebook/up.sh --secure
# or
notebook/up.sh --secure --password a_secret
```
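Doing the same without up.sh comes down to pointing the server config at a cert. A sketch, assuming Jupyter Server trait names; the file paths are hypothetical and the cert would be generated once, e.g. with `openssl req -x509 -nodes -newkey rsa:2048 -keyout mykey.key -out mycert.pem`.

```python
# jupyter_server_config.py -- TLS sketch; `c` comes from Jupyter's config loader.
c.ServerApp.certfile = "/etc/jupyter/mycert.pem"  # hypothetical path
c.ServerApp.keyfile = "/etc/jupyter/mykey.key"    # hypothetical path
```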
So, these Docker Stacks I've decided to use seem to want the config file to be a sibling of the Dockerfile:
```dockerfile
# Currently need to have both jupyter_notebook_config and jupyter_server_config to support classic and lab
COPY jupyter_server_config.py docker_healthcheck.py /etc/jupyter/
```
OK, there are two roles that Jupyter is playing:

1. FS navigation (and a terminal) on the instance itself, and
2. a notebook front end driving SD compute.

For #1, Jupyter needs to be on the instance if it is going to provide FS navigation.

For #2, if a "foreign" Jupyter can use the GPU, that would be sophisticated in terms of architecture. But what does that mean? Jupyter is using some web-ui embedded in it to provide UI? This sounds more like a v2.x.x thing where the notebook server would call on some brain_trust API to crunch data.
Also, I cannot believe I only thought of this now, but the terminal inside Jupyter is the more appropriate way to get folks like Anders access to the command line. Derp.
We've been collecting SD-centric Jupyter Notebooks (#50). We should have machinery to run them. JupyterLab and SageMaker (#27) are the obvious choices. We already have evidence that the Invoke CLI can work with Jupyter: InvokeAI repo, Stable_Diffusion_AI_Notebook.ipynb
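The simplest machinery for running collected notebooks headlessly is `jupyter nbconvert --execute`. A sketch; only the command construction runs here, since actually executing assumes jupyter (and the notebook's dependencies) are installed on the target instance.

```python
import subprocess

def nbconvert_cmd(notebook_path):
    """Command line to execute a notebook in place, headlessly."""
    return ["jupyter", "nbconvert", "--to", "notebook",
            "--execute", "--inplace", notebook_path]

cmd = nbconvert_cmd("Stable_Diffusion_AI_Notebook.ipynb")
print(" ".join(cmd))
# jupyter nbconvert --to notebook --execute --inplace Stable_Diffusion_AI_Notebook.ipynb

# To actually run it (requires jupyter + the notebook's deps):
# subprocess.run(cmd, check=True)
```

Looping that over the #50 collection would give a crude batch runner until something like SageMaker takes over the job.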