zinc-collective / convene

An Operating System for the Solidarity Economy
https://convene.zinc.coop

🛠️ `Neighborhood`: Use own devcontainer and persist user data #1256

Closed daltonrpruitt closed 1 year ago

daltonrpruitt commented 1 year ago

Two main goals:

  1. Increase the speed of Codespaces startup, since the container currently seems to be rebuilt every time. The prebuilt image includes pre-installed gems.
  2. Persist user-added data (i.e., data added to the databases) between Codespace stops and starts.
daltonrpruitt commented 1 year ago

Some discussion needs to happen for this, so I figured I would open this for now. I'll try to collate my questions sometime in the next few days. Please ping me if I forget, or feel free to add comments to this PR.

zspencer commented 1 year ago

Yea, there's definitely some complexity here. For example:

  1. Where do we publish these docker images to?
  2. How do we decide when to publish a new image? (I could get behind monthly, or whenever there is a diff in the Gemfile.lock or yarn.lock.)
  3. How do we clean up old images that don't matter anymore?

But I'm curious about what questions you've got!

daltonrpruitt commented 1 year ago

Thanks for the reminder!

  1. I'm thinking DockerHub may be a safe bet, mainly because they offer unlimited public repos for free on the Personal/open-source community plan. There are other options, like Canister and others listed in various articles, and making the "best" choice would take more deliberation and weighing of options, but DockerHub seems reasonable given that GitHub's explanation of publishing images via GitHub Actions explicitly shows publishing to DockerHub and GitHub Packages.
  2. I was working on a new GitHub Actions workflow that would take care of building and publishing the image (using instructions from the GitHub explanation above). It is not yet in the state I think it should be, but I believe the image should only be rebuilt when any of the following files change: Gemfile, Gemfile.lock, and the Dockerfile itself. I don't currently run yarn install in the Dockerfile, but we can get that in there and add yarn.lock to the relevant files. (A rough workflow sketch follows this list.)
  3. I currently have absolutely no idea. I haven't looked into DockerHub (or any other image registry) to see what options they provide. I would assume there is a way to delete images older than X days/months, or to keep only the most recent version of an image with a given tag (if we wanted a latest tag that moves forward like a git branch reference as commits are added). I think this is less of an immediate issue, but it definitely needs to be discussed once we finalize where the image is hosted.
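As a concrete reference point for item 2, here is a minimal sketch of such a workflow, assuming GitHub Packages (ghcr.io) as the registry; the file paths, branch name, and image name are illustrative guesses, not taken from this PR.

```yaml
# Hedged sketch: rebuild and publish the devcontainer image only when the
# dependency manifests or the Dockerfile change. Paths and names are assumptions.
name: Publish devcontainer image

on:
  push:
    branches: [main]
    paths:
      - Gemfile
      - Gemfile.lock
      - yarn.lock
      - .devcontainer/Dockerfile   # assumed location of the devcontainer Dockerfile

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # lets GITHUB_TOKEN push to ghcr.io
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          context: .
          file: .devcontainer/Dockerfile   # assumed path
          push: true
          tags: ghcr.io/${{ github.repository }}/convene-devcontainer:latest
```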

I think that covers most of my questions. I was mainly concerned whether there was a reason not to build only when certain files change, but your question 2 makes me think that should be fine.

My current biggest issue is that when I start testing this, I'm going to have to test the workflow not just on "master" but on this branch as well. Just a thing I need to remember to remove before finalizing the PR.

Actual question:

  1. Do we want to make a convene account to host this container (on whichever registry we go with)? I can start with my own until we get the kinks worked out, but I figure it eventually needs to move to a more shared account or something. I'm not sure how communities typically handle this on DockerHub; I haven't looked into it yet. Just wanted to get someone's thoughts.
zspencer commented 1 year ago

It looks like GitHub Packages provides a registry for publishing Docker images... I wonder if we could start out by trying to publish to that?

daltonrpruitt commented 1 year ago

Ah, the info I had read earlier said the free tier was limited to 500 MB of storage, so I didn't consider it much further. Reading the link you posted, I think the limit I saw may have applied to private packages only.

I will look more into this!

zspencer commented 1 year ago

How big is our dev container image?! 😱

daltonrpruitt commented 1 year ago

I'll have to check later, but I would guess 0.4–1.5 GB. I'm not sure how to check except via the Docker CLI or the desktop app.
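Once the image is built locally, the Docker CLI can report the size directly; the repository name below is an illustrative guess.

```shell
# List the built image with its size (repository name is an assumption).
docker image ls convene-devcontainer --format '{{.Repository}}:{{.Tag}}  {{.Size}}'
```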

daltonrpruitt commented 1 year ago

> How big is our dev container image?! 😱

Based on a build I just tried, 2.15 GB...

daltonrpruitt commented 1 year ago

SUCCESS! (at least on my fork for now)

This was fairly simple. The documentation doesn't seem to be fully up to date anymore, and part of what I am doing is apparently already deprecated, but it's working for now with the versions of the actions I am using.

The next step is to update the image used in our Codespace to `convene-devcontainer:latest` and see how that goes. I will again have to test it on my own version for now, but 🤞
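For context, pointing the Codespace at the published image would look roughly like the snippet below, assuming the devcontainer's app service is defined via a compose file; the service name and registry path are assumptions, not taken from the repo.

```yaml
# .devcontainer/docker-compose.yml (illustrative layout)
services:
  app:
    # Pull the prebuilt devcontainer image instead of rebuilding on every create.
    # Registry path and tag are assumptions.
    image: ghcr.io/zinc-collective/convene/convene-devcontainer:latest
    # build:                        # previous behavior, shown for comparison
    #   context: ..
    #   dockerfile: .devcontainer/Dockerfile
```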

zspencer commented 1 year ago

Very exciting!!!!

daltonrpruitt commented 1 year ago

See the changes involving symlinks, which attempt to fix the data persistence problems across Codespace restarts.

The problem seems to have been that anything outside of /workspaces is yeeted on Codespace stop (to minimize storage requirements). This change should force the db containers to save their data within the /workspaces directory.

This may increase storage costs (or decrease the amount of active development time available in a month) depending on the db sizes and how long the Codespace is kept alive (not active, but the full time between its first start and its actual deletion/removal).
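A rough sketch of that relocation, assuming a named Docker volume for Postgres; the volume name and target directory are illustrative.

```shell
# Move a named volume's backing directory under /workspaces (which Codespaces
# persists) and leave a symlink at the original location. Names are assumptions.
VOLUME=convene_postgres-data
sudo mkdir -p /workspaces/.persisted-volumes
sudo cp -a "/var/lib/docker/volumes/$VOLUME" /workspaces/.persisted-volumes/
sudo rm -rf "/var/lib/docker/volumes/$VOLUME"
sudo ln -s "/workspaces/.persisted-volumes/$VOLUME" "/var/lib/docker/volumes/$VOLUME"
```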

zspencer commented 1 year ago

ooooo! When do we get the new fancy?! Is there anything that is blocking merge?

daltonrpruitt commented 1 year ago

Currently in the process of testing data persistence between startups (not creation). Can't seem to get it to behave the way I want, so I took a break. Hoping to look at it more this afternoon.

If anyone knows more about the inner workings of the redis and Postgres containers, please feel free to look at the devcontainer startup script and see if I'm doing something obviously wrong.

daltonrpruitt commented 1 year ago

Unfortunately, I'm still playing around with permissions on the data files used by the redis container. Not sure what the problem is exactly, so I've got more work to do.

Ultimately, this PR is "finished" as intended: it publishes and uses our own devcontainer image (though the image currently comes from my repo...). So we could accept it as-is and polish the other stuff afterwards. However, as it currently stands, startup is definitely broken (due to some recent changes), so I would advise against that. I could branch off a few commits back and force-push a working version of this branch so we have something we can temporarily use, but I'd prefer to finish this string of work before calling the PR "done".

zspencer commented 1 year ago

No worries. If you want, I can try to take a look at it with ya, or I can let it ride! (I don't really know much about Codespaces, so I don't know how helpful I will be, but sometimes a second pair of unix-eyes can help.)

daltonrpruitt commented 1 year ago

Currently still dealing with a permissions issue. I think I may actually have to read some documentation on how the redis container does things on startup. It may be a docker thing, but I'd have to research it to say more. Something about user 999 being the owner of the redis volume's _data folder caught my eye, so that's what I'll look into. Hoping to look at it during downtime on vacation. Will try to keep things updated here.

Also, I tried to separate out the data-persistence-on-startup changes from the use-our-own-devcontainer changes, but git didn't behave how I expected, so I'll have to try that again later (we could just drop the image publishing, as it doesn't add much, but I'll have to do some timing to confirm).

zspencer commented 1 year ago

It's a very exciting patch!!! Thank you!

daltonrpruitt commented 1 year ago

Just wanted to provide some kind of update.

Currently I have kind of "fixed" the permissions issue on the redis container's volume directory in /workspaces. I could actually see a Space I created persist between stopping and starting the same Codespace. Progress!

However, this was mostly trial-and-error plus searching StackOverflow and GitHub issues, and the fix is a bit flaky (I originally set ownership to one uid, which worked, then retested and had to set it to a slightly different one: 999 vs 1001). I have opened several sites to read up on actual usage tips/tricks for the redis/postgres containers, but I have not done that deep dive yet and it is bedtime.
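For the record, the fix amounted to something like the command below; which uid is right depends on the user the redis image runs as (the 999 and 1001 mentioned above are consistent with the official and Bitnami-based images, respectively), and the path is an illustrative guess.

```shell
# Give the redis container's runtime user ownership of the persisted volume directory.
# uid 1001 is the Bitnami default; uid 999 is the official redis image's user. Path is assumed.
sudo chown -R 1001:1001 /workspaces/.persisted-volumes/convene_redis-data
```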

zspencer commented 1 year ago

Ugh permissioning is such a challenge... But it's exciting that you're seeing progress!

Part of the pain of this kind of change is that it very much has to be run from beginning to end in order to really validate it; so I appreciate you taking the time to keep hammering away at it.

daltonrpruitt commented 1 year ago

For some reason, the permission problem I was having (the redis container couldn't access the docker volume and so would fail to start) is no longer an issue. The data does persist! And using the new self-published devcontainer makes startup much faster (anecdotally); I still have to update it to use the convene-published version instead of the one in my packages.

However, another permissions issue cropped up on a specific file (dump.rdb in the redis data volume). Once, in redis's log file during bin/setup:

Failed opening the RDB file dump.rdb (in server root dir /bitnami/redis/data) for saving: Permission denied

Every few seconds during bin/run:

MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Redis logs for details about the RDB error

I am now taking a more big-picture approach and symlinking the entire /var/lib/docker/volumes directory into /workspaces (as opposed to the individual volume dirs), in the hope that this coarser approach sidesteps the file-level permission problems. Testing this at the moment.
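Roughly, the coarser approach looks like the sketch below; stopping the Docker daemon before moving its data is an assumption, as are the paths.

```shell
# Relocate Docker's entire volumes directory under /workspaces and symlink it back.
sudo service docker stop              # assumed: pause the daemon before moving its data
sudo mv /var/lib/docker/volumes /workspaces/.docker-volumes
sudo ln -s /workspaces/.docker-volumes /var/lib/docker/volumes
sudo service docker start
```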

daltonrpruitt commented 1 year ago

Alrighty. It seems to be working now! No more permissions issues that I can find.

Three problems still remain:

  1. The actual unit tests still fail for some reason... (most important to me, but not yet fully investigated)
  2. I still need to switch the container used to the convene-published one instead of mine (we could leave that for now for stability?)
  3. The next time the devcontainer is built (or anyone uses the old setup without our own devcontainer image), Ruby setup will not work, as the ruby-3.1 base image now ships 3.1.4 while we specify 3.1.3 in .ruby-version. I ran into this issue when testing on main.
zspencer commented 1 year ago

Thank you!

Re: failing unit tests: CI should be the arbiter of test pass/fail; as long as CI is green, we can work on "why does this particular test fail in the devcontainer?!" on a case-by-case basis.

Re: moving to the convene-namespaced image: I believe you'll want to push this branch to the zinc-collective/convene repo to make that work. You should be able to do that, as you have contributor status, which grants you push rights.

Re: the Ruby image: I'll make a quick PR to bump our .ruby-version to 3.1.4, since we want to always be on the latest 3.1 but there isn't a way to specify that in the .ruby-version or .tool-versions syntax (AFAIK; maybe @anaulin knows a way!).
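For the branch move, something along these lines should work; the remote and branch names are illustrative, not taken from the PR.

```shell
# Push the feature branch to the upstream repo so its Actions can publish the
# image under the convene namespace. Remote and branch names are assumptions.
git remote add upstream git@github.com:zinc-collective/convene.git
git push upstream HEAD:devcontainer-image
```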