Closed yarikoptic closed 1 year ago
Dear @yarikoptic,
This would be amazing and super useful! Happy to help with this after the holidays :)
Correct, we build docker containers using the GitHub Actions you identified. Then we store them in the GitHub container registry and in the docker registry. Another action converts these docker images to singularity and uploads them to the object storage in Ashburn. Since I live on the other side of the internet (Australia :p) this object storage is then mirrored to other locations (Frankfurt, Sydney, Brisbane). We then pull these singularity files using aria2c from all locations simultaneously so that everyone gets a good download speed no matter where they are.
container_pull="aria2c https://objectstorage.us-ashburn-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o/$container https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o/$container https://objectstorage.ap-sydney-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o/$container https://swift.rc.nectar.org.au/v1/AUTH_dead991e1fa847e3afcca2d3a7041f5d/neurodesk/$container"
To avoid people having to download the whole singularity container we do another trick: We unpack the singularity images in a sandbox and store these on a CVMFS Stratum 0 Server hosted in Brisbane. This server then mirrors to Stratum 1 Servers distributed in the US, Canada, Europe and Australia and people can directly mount this in user space and directly access all containers :)
Have a wonderful holiday season, and I am very much looking forward to helping get our containers in datalad run :)
Cheers Steffen
Dear @yarikoptic,
Just wondering what I can do to help get our containers into datalad :) Can datalad map the different download sources of our containers and download them from multiple sources, or would it help to have one domain that gets routed based on DNS to the closest download location?
Cheers Steffen
Hi @stebo85, thanks for the great talk you gave yesterday -- exciting to see how many "moving parts" are in place, and the live-demo gods favor you so much that it all went so smoothly. I think the time is ripe to get back to this little subproject, and maybe @asmacdo would also join to help eventually. Meanwhile let me answer your question:
> Can datalad map the different download sources of our containers and download them from multiple sources
yes, git-annex (used by datalad) can add an arbitrary number of URLs for any given file. Since a recent version we can also give priorities to different sources (ref: https://git-annex.branchable.com/todo/Allow_for_URLs_prioritization_WITHIN___40__web__41___remote/#comment-3ed32f70bcdd682c3d3d41f5541f9593 and therein, sample repos with such a setting: https://github.com/dandisets/).
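To illustrate the "arbitrary number of URLs" point, here is a minimal sketch that only builds the git-annex command lines and does not run them. It relies on `git annex addurl --file=<path> <url>`, which for an already existing annexed file records the URL as one more place the content can be downloaded from. The image path and the example.org mirror URLs are hypothetical placeholders, and the helper name `addurl_commands` is made up for this example.

```python
import shlex


def addurl_commands(path: str, urls: list[str]) -> list[list[str]]:
    """Build one `git annex addurl` call per mirror URL for a single file."""
    # Repeating addurl with the same --file records every URL as an
    # additional source for that file.
    return [["git", "annex", "addurl", f"--file={path}", url] for url in urls]


# Hypothetical image path and mirror URLs, for illustration only.
image = "images/neurodesk/example.simg"
mirrors = [
    "https://mirror-us.example.org/example.simg",
    "https://mirror-eu.example.org/example.simg",
]

for cmd in addurl_commands(image, mirrors):
    # shlex.join produces a copy-pasteable shell line for inspection.
    print(shlex.join(cmd))
```

The commands would need to be run inside the dataset (for example with `subprocess.run(cmd, check=True)`); URL priorities are configured separately and are not covered by this sketch.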
> or would it help to have one domain that gets routed based on DNS to the closest download location?
wouldn't hurt, since it would make it dynamic (no need to hard-assign costs), but it is not required.
What would help me get started is probably a pointer to some data structure listing all singularity containers, with some URLs for where to get them from.
As for docker -- we never figured out a way to store images with references to individual layers: https://github.com/datalad/datalad-container/issues/98 . Maybe we should look into it again at some point.
Dear @yarikoptic,
Thank you :) I am very excited to make progress on this - would be so cool to have our containers available through datalad =)
Then let's start simple:
This list contains all current and tested container releases (automatically updated by our CD): https://github.com/NeuroDesk/neurocommand/blob/main/cvmfs/log.txt (we also added the categories, because we use this file to build our application search: https://www.neurodesk.org/applications/)
so for example the first entry is: container=afni_21.2.00_20210714
The container name from the list then only needs ".simg" appended, and the URLs for the 3 main mirror servers (AMERICAS, EUROPE, APAC) become:
https://objectstorage.us-ashburn-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o/${container}.simg
https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o/${container}.simg
https://objectstorage.ap-sydney-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o/${container}.simg
so, for example this URL should get you the fastest way to download our singularity container in the US: https://objectstorage.us-ashburn-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o/afni_21.2.00_20210714.simg
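The mapping from a container name to its mirror URLs could be automated roughly like this. It is a minimal sketch: the three base URLs are copied from this comment, while the helper name `mirror_urls` is made up for illustration.

```python
# Base URLs of the three main mirrors, as listed in this comment.
MIRRORS = [
    "https://objectstorage.us-ashburn-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o",
    "https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o",
    "https://objectstorage.ap-sydney-1.oraclecloud.com/n/sd63xuke79z3/b/neurodesk/o",
]


def mirror_urls(container: str) -> list[str]:
    """Return the download URL of `container` on every mirror."""
    # The file name is the container name from log.txt plus ".simg".
    return [f"{base}/{container}.simg" for base in MIRRORS]


for url in mirror_urls("afni_21.2.00_20210714"):
    print(url)
```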
That should be good to automate on your end, right? Would you need anything else?
Thank you so much :) Steffen
ok, pushed them all in a branch to my fork (with the git-annex branch pushed too), if you are interested in having a peek/check: https://github.com/yarikoptic/containers/tree/enh-neurodesk-preview (images under https://github.com/yarikoptic/containers/tree/enh-neurodesk-preview/images/neurodesk). TODOs:
I didn't bother about metadata annotations at all yet :-/
Dear @yarikoptic,
That's great :)
one thing I noticed is that the versioning of the containers seems to be truncated: e.g.
neurodesk-qsmxt--1.1.simg neurodesk-qsmxt--1.3.simg
These are actually 4 containers:
qsmxt_1.1.9_20211219 categories:phase processing, quantitative imaging,structural imaging,workflows,
qsmxt_1.1.10_20220302 categories:phase processing, quantitative imaging,structural imaging,workflows,
qsmxt_1.1.13_20221018 categories:phase processing, quantitative imaging,structural imaging,workflows,
qsmxt_1.3.2_20230204 categories:phase processing,quantitative imaging,structural imaging, workflows,
But it looks like it is truncated after the first 4 characters and therefore 1.1.9 1.1.10 and 1.1.13 are collapsed to neurodesk-qsmxt--1.1.simg
could that be or am I misunderstanding the versioning in datalad?
spinalcordtoolbox is affected by this as well: neurodesk-spinalcordtoolbox--5.simg
should be:
spinalcordtoolbox_5.4_20211208 categories:spine,
spinalcordtoolbox_5.5_20220419 categories:spine,
spinalcordtoolbox_5.6_20220503 categories:spine,
spinalcordtoolbox_5.7_20220803 categories:spine,
spinalcordtoolbox_5.8_20230117 categories: spine,
it looks like all toolboxes with multiple versions where the version string is 3 digits are affected?
wow, that is a fun bug! the culprit is this line: https://github.com/ReproNim/containers/blob/master/scripts/create_singularities#L164 and in particular the use of pathlib.Path's .with_suffix, which (now obvious in hindsight) would take the last number as the suffix (extension) and replace it with .simg. Yikes, so I screwed up pretty much all containers, uff... Let's redo ;) the easiest way for me would really be just to fix and redo, so I do not mind another 300GB of downloads ;) I will update here whenever it is done
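The truncation can be reproduced in isolation. This is a minimal sketch, assuming the image name was assembled as `neurodesk-<tool>--<version>` before the extension was attached; the helper names `buggy_image_name` and `fixed_image_name` are hypothetical and are not taken from create_singularities.

```python
from pathlib import Path


def buggy_image_name(tool: str, version: str) -> str:
    # with_suffix() treats everything after the last dot as the extension,
    # so the final version component is replaced instead of being kept.
    return Path(f"neurodesk-{tool}--{version}").with_suffix(".simg").name


def fixed_image_name(tool: str, version: str) -> str:
    # Appending the extension as plain text leaves the version untouched.
    return f"neurodesk-{tool}--{version}.simg"


for version in ("1.1.9", "1.1.10", "1.1.13", "1.3.2"):
    print(buggy_image_name("qsmxt", version), fixed_image_name("qsmxt", version))
```

With the buggy variant, 1.1.9, 1.1.10 and 1.1.13 all map to neurodesk-qsmxt--1.1.simg, which matches the collapsed file names reported above.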
ok, glad we spotted this early on :)
yeap -- thanks for the review. Lesson to myself: I am too old to code late in the evening ;) I have now pushed the fixed state, along with various other fixes.
please have another look
Dear @yarikoptic,
quick question: where are we with this? Is it all working and can I document this on the Neurodesk website? Does it need anything from me where I can help?
Kind regards Steffen
oh, right -- totally forgot this one was still open. I think "we are done" and we are getting "upgrades" like d3486762991b265cbd6d2e141c207f670c61762c . I would assume that it is all working to the degree it could work here. Would be nice to get some example to solidify it all. Could you point me to some prototypical analysis using one of the containers?
meanwhile the issue I think could be closed really.
Dear @yarikoptic,
I tried using a container with datalad containers-run
datalad containers-run -n neurodesk-bart
[INFO ] Making sure inputs are available (this may take some time)
get(error): images/neurodesk/neurodesk-bart--0.7.00.simg (file) [not available; (Note that these git remotes have annex-ignore set: origin)]
[INFO ] == Command start (output follows) =====
ERROR : Unknown image format/type: images/neurodesk/neurodesk-bart--0.7.00.simg
ABORT : Retval = 255
[INFO ] == Command exit (modification check follows) =====
run(error): /home/jovyan/test/containers (dataset) [./scripts/singularity_cmd run images/neu...]
action summary:
get (error: 1)
run (error: 1)
save (notneeded: 1)
other containers work - but the neurodesk containers throw this error. Any idea?
Maaan, you are even a bigger bug magnet than I am! Thank you for that ;)
I am staring at the code which should have added URLs -- https://github.com/ReproNim/containers/blob/master/scripts/create_singularities#L125 -- and do not get (without debugging) how that could have happened, besides (just thinking aloud) that git-annex managed to skip some (invalid?) ones somehow... I will debug, and might add it as a test of some kind
uff -- long friday, filed an issue against git-annex -- https://git-annex.branchable.com/bugs/registerurl_does_not_register_if_external_remote/?updated
wow - what a bug :/ Sorry to have attracted that one - it looks difficult to debug because to me there is currently no logical pattern why some work and some don't. I hope someone can point us to the root cause of this or any pointers in the right direction.
never enough tests... and here there is none for availability...
ok -- worked around via actions performed in https://github.com/ReproNim/containers/blob/HEAD/scripts/utils/get-rid-of-datalad-remote.sh which should mitigate this issue.
Now I get empty output from
$> git annex find --not --in web --and --not --in datalad-public
which I now made into a "test" within f406b16e84a68d94ba143e019198915aede3e4a4
Please check again if all is kosher now on a fresh clone, or just after smth like datalad update
Dear @yarikoptic,
That works perfectly now :) Thank you so much for fixing that!
One little question before I document this on the neurodesk website:
Currently it only seems possible to pull the latest version of each container.
Running:
datalad containers-run -n neurodesk-afni
downloads images/neurodesk/neurodesk-afni--23.0.07.simg
but it doesn't seem possible to run something like:
datalad containers-run -n images/neurodesk/neurodesk-afni--22.3.07.simg
How would a user access a specific version?
Thank you Steffen
> Currently it only seems possible to pull the latest version of each container.
this is just a default behavior ;)
User can
neurodesk-afni-22.3.05
which would have a copy of that original definition but pointing to that other version image. Just look into .datalad/config where those definitions are - we hope it is kinda self-explanatory ;) I hope this is clear'ish enough. Feedback (and contributions) are always welcome to improve things.
Dear @yarikoptic
cool - that default behavior makes a lot of sense :) And it's great versions can be changed in addition (and frozen!)
I tested the freeze_versions script and found a little bug (sorry - but I think you are right about that bug magnet hypothesis):
scripts/freeze_versions neurodesk-romeo=3.2.4
topd_rel:
I: neurodesk-romeo -> 3.2.4
ERROR: There is no ./images/neurodesk/neurodesk-romeo--3.2.4.sing . Available images for the app are:
./images/neurodesk/neurodesk-romeo--3.2.4.simg
./images/neurodesk/neurodesk-romeo--3.2.7.simg
./images/neurodesk/neurodesk-romeo--3.2.8.simg
It assumes .sing as file ending?
However, changing the versions in .datalad/config worked like a charm and I now documented that on our website: https://www.neurodesk.org/docs/neurocontainers/datalad/
Thank you Steffen
Dear @yarikoptic,
Your freeze_versions fix is perfect :) It works nicely now to change the versions
scripts/freeze_versions neurodesk-romeo=3.2.4
topd_rel:
I: neurodesk-romeo -> 3.2.4
I documented this option as well on our website: https://www.neurodesk.org/docs/neurocontainers/datalad/#option-2-change-version-using-freeze_versions-script
Thank you Steffen
Great, thank you for the feedback! Let's again consider this overall issue addressed ;-)
images/neurodesk/ and just map image filenames to match what we have for others
Just FYI @stebo85 - the Mr. NeuroDesk