Closed Bjarke42 closed 8 months ago
The new volume should also be used in the other environments development
and development_gpd
looks good to me!
The read_only: true
could even be removed I guess
Thanks for the contribution.
I wonder if it would make sense to integrate a generic overrides
method here in docker-migrid
and to all use that ... @Bjarke42 ?
I'm asking because we currently use slightly modified docker-compose-X.yml
files linked from the docker-migrid-ucphhpc
repo for our own minor adjustments for our sites e.g. to allow podman use, and already had vgrid volume adjustments overlapping with this PR there.
Thanks for the contribution. I wonder if it would make sense to integrate a generic
overrides
method here indocker-migrid
and to all use that ... @Bjarke42 ? I'm asking because we currently use slightly modifieddocker-compose-X.yml
files linked from thedocker-migrid-ucphhpc
repo for our own minor adjustments for our sites e.g. to allow podman use, and already had vgrid volume adjustments overlapping with this PR there.
this is the docker-compose.override vi currently use: docker-compose.override.au.yml.j2 Its a nice way of not having too many instances of the same thing, but just as you are talking about having one docker-compose. But i do not know your requirements, so if you only change stuff in the already existing parameters, then you could benefint from it alot
For the record, this PR broke the Test CI as you can see here: https://github.com/ucphhpc/docker-migrid/actions/runs/8345752910/job/22841385858?pr=45#step:6:35
I cannot see though why it was not run for this PR. I open a Issue for that.
I can confirm that using the docker-compose_production.yml
from the merged revision 9e5d09dfd67749f6470386f615341bad647901be breaks docker-compose up
on my dev erda instance with the same error as the CI builder reported in the link @benibr posted above:
Creating migrid ... error
ERROR: for migrid Cannot create container for service migrid: failed to mount local volume: mount ./state/vgrid_files_writable:/var/lib/docker/volumes/docker-migrid-erda_vgrid_files_readonly/_data, flags: 0x1000, data: ro: no such file or directory
...
It runs fine using the previous reversion b79a88238ab52d1212487e1a86a4bb134968cf7f of that same compose file.
The initial crash could be removed by removing the space in the volume mount options and switching to the new docker compose
as the default compose command rather than the old docker-compose
.
Even then we kept hitting random mount errors e.g. in the Github Actions CI job for such nested volume remounts. This looks like a potential race in docker itself as @benibr reported upstream in https://github.com/docker/compose/issues/11663
Still, AFAICT there is no point at all in having this vgrid_files_readonly
bind mount as a full-blown docker top-level volume as we just need a live read-only copy of the vgrid_files_writable
dir. The same applies for the mig_system_run
shared cache space referenced in issue #40. I understood from PR 43 that it was changed into top-level volumes for easing docker-compose overrides at AU, which sounds valid enough. So I've reworked them into plain bind mounts but tried to provide similar flexibility with env variables to allow overriding the actual source path and using the long format to hopefully ease any remaining templating needs.
The result is now available in the docker-compose_production_bind.yml
file and I'd like your feedback @Bjarke42 and Anders. Will that or can that be made to work with your site setup?
It finally brings consistency back in the build everywhere else so I'd really like to have it replace docker-compose_production.yml
.
@jonasbardino can you please add a PR for this so that we can see the diff and have a discussion there?
Please do not commit testing configs to the master branch. We have so many unused config files in Migrid and it just makes things confusing. Also it's very complicated to make a test deployment when you have to change a file somewhere in the middle of it. Use git branches plz :-)
@jonasbardino yes, please continue this in a new PR. Its alot easier for us to see the diff and then relate to what you changed.
You're probably right that it would have been better to either make this as a branch in the first place or to initiate a full revert of this PR with a follow-up retry if needed. Either would introduce breakage or extra work somewhere. As you know it developed in several steps during that bug hunt, because Github Actions are tricky to simulate/debug and I was hoping we could quickly resolve the matter and just replace the existing production conf. In hindsight that was maybe too optimistic, since we all have other urgent matters to tend to.
Anyway, I've made a PR #55 to do what I intend to do. Namely, replace the current docker-compose_production.yml
with the proposed new docker-compose_production_bind.yml
. Please have a look and perhaps review to confirm or raise any remaining issues with it.
I hope it makes it easier, although I'm not really sure it is less pain to test from a branch instead... I'll let you decide on that :-)
The way the vgrid_files_readonly was implemented does not honor the way we use docker-compose.override in ansibl-docker-migrid, so now i have changed it so i can override the mountpoint correctly.