We can probably add support for passing block devices via `--builds-dir` and `--cache-dir`. Might be a bit tricky to map such mounts inside the VM, especially if both `--builds-dir` and `--cache-dir` are specified.
Hello Justin 👋
Have you considered using a distributed cache instead?
It's just as snappy as a mounted directory or block device, but without the mounted-directory bugs and the extra complexity of managing block devices (namely, the Tart process running with elevated privileges, filesystem corruption due to concurrent access from multiple jobs, etc.).
Here's a quick example snippet that, once put in `~/.gitlab-runner/config.toml`, will cache everything to a MinIO server serving `/path/to/a/directory` (assuming that it's accessible at `minio.local`):

    [[runners]]
      [runners.cache]
        Type = "s3"
        [runners.cache.s3]
          ServerAddress = "minio.local:9000"
          AccessKey = "minioadmin"
          SecretKey = "minioadmin"
          BucketName = "gitlab-cache"
          Insecure = true

MinIO itself can be easily installed with `brew install minio`, and you can either self-host (don't forget to configure proper security) or use cheap S3-compatible object storages like Cloudflare R2, Backblaze B2, etc.
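For completeness, a minimal sketch of running such a MinIO instance on the host (the data directory is illustrative, and the default `minioadmin` credentials should be replaced for anything non-local):

```sh
# Install MinIO and serve a local directory over the S3 API on port 9000;
# the default root credentials are minioadmin/minioadmin.
brew install minio
minio server /path/to/a/directory --address ":9000"
```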
Hi @edigaryev,
Many thanks for your reply, and for the suggestion.
Our use case involves building repositories that are hundreds of GBs in size, so using a pre-existing pre-cloned repo for each build is a hard requirement. Since the S3 caching feature up/downloads zips of the cached files on each run, unfortunately it would be prohibitively slow for us.
Reusing a repo is currently trivial on a bare-metal macOS GitLab runner via `[[runners]] builds_dir`. If it weren't for the bugs in virtiofs, this would also be easily achieved using the existing `--builds-dir` feature in this executor to mount the repo directory from the host.
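For reference, a minimal sketch of that bare-metal setup in `config.toml` (the path is illustrative):

```toml
[[runners]]
  # Reuse a fixed builds directory across jobs on a bare-metal runner.
  executor = "shell"
  builds_dir = "/Users/ci/builds"
```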
So, we were hoping that we might have better luck using a block device to store repos and have it mounted in the build job. Ultimately we are looking to achieve build isolation for macOS builds (multiple Xcode versions, multiple OS versions, etc) as an improvement on our current bare metal solution.
We are also looking into some experimental early use of Docker for this purpose, which is how I was originally pointed in this direction by @fkorotkov.
Hi @jlsalmon,
Thanks for clarifying this.
Assuming your goals (build isolation, but not resource utilization), do I understand correctly that it would be acceptable for you to only be able to run a single Tart VM when using the block device option?
Otherwise I don't see how more than one Tart VM would be able to access the same block device without causing filesystem corruption and other undesirable side-effects.
@edigaryev it would be acceptable to run only a single Tart VM in the first instance, yes. I suppose it might be possible later on to support two VMs, either using two block devices (e.g. two partitions or two external drives) or using unique build root directories per VM on a single block device. But one VM would already be a big achievement 🎉
Another thing I have in mind is that you don't need actual block device access to be able to mount a filesystem into a Tart VM.
You could simply do `truncate -s 50GB disk.img` and mount it similarly to a block device using `--disk disk.img`.
This disk image could reside anywhere you want (e.g. on fast storage attached to a Mac) and does not require privilege escalation via `sudo` or running Tart as `root`.
The only downside is that this disk image needs to be properly formatted as APFS/HFS+/etc., which is something that `diskutil` seems to struggle with, as it expects a real disk device as input.
However, this disk image can be easily formatted from within a Tart VM and then re-used many times for subsequent CI runs.
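A sketch of that flow, assuming a VM named `builds-vm` (a hypothetical name) and that the image shows up as `disk2` inside the guest:

```sh
# On the host: create a 50 GB sparse file to act as the disk image.
truncate -s 50GB disk.img

# Attach it to the VM as an additional disk and boot.
tart run --disk disk.img builds-vm

# Inside the guest, one time only: format the attached disk as APFS.
# Confirm the device identifier with `diskutil list` before erasing!
diskutil eraseDisk APFS BuildsDisk disk2
```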
Would that be acceptable for you? If so, we can simply add a `--disk` argument to the `prepare` stage (similarly to the `--dir` argument) and that would do the trick, without requiring the `sudo` privilege escalation mumbo-jumbo.
> it would be acceptable to run only a single tart VM in the first instance, yes
Thanks for clarifying this too!
Please also check https://github.com/cirruslabs/gitlab-tart-executor/issues/60#issuecomment-1945625682 and let me know what you think.
If that works for you, we can even go further and clone that golden disk image for each Tart VM invocation (since APFS supports fast file copies using CoW). This would allow more than one Tart VM to safely run from a given golden disk image; the only downside is that changes to the cloned disk image won't be propagated back to the golden image.
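For illustration, such a clone could be made with `cp -c`, which asks macOS to use `clonefile(2)` (file names are hypothetical):

```sh
# Instant copy-on-write clone on APFS: the clone shares blocks with the
# golden image until either side is modified.
cp -c golden-disk.img vm-1-disk.img
```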
@edigaryev thanks for the tip about using disk images, I'd be happy to give that a shot! I would need changes to the disk image(s) to be persisted across runs, so I'm not sure about the image cloning part, but I'm sure there are options there 🙂
@jlsalmon please check out the new `0.11.0` release, it now allows you to specify `--disk` arguments in the `prepare` stage, which in your case should point to a disk file on your host system.
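For illustration, here's a sketch of what that could look like in the runner's `config.toml`, assuming the usual custom-executor wiring and an illustrative disk path:

```toml
[[runners]]
  executor = "custom"
  [runners.custom]
    config_exec = "gitlab-tart-executor"
    config_args = ["config"]
    prepare_exec = "gitlab-tart-executor"
    # Attach the host-side disk image to the VM during the prepare stage.
    prepare_args = ["prepare", "--disk", "/Users/ci/disks/builds.img"]
    run_exec = "gitlab-tart-executor"
    run_args = ["run"]
    cleanup_exec = "gitlab-tart-executor"
    cleanup_args = ["cleanup"]
```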
Another option to actually mount the block device is to change its access permissions:

    sudo chown $USER /dev/disk42

This way no `sudo`/privilege escalation is needed, and you can use the same `--disk` argument that you'd use to mount an additional disk image.

However, this approach is more error-prone (in other words, it requires some scripting around) than the disk image one, because macOS has no `/dev/disk/by-id` symlinks, so you have to find the actual device node for your disk each time it's re-attached (e.g. `/dev/disk4` may become `/dev/disk3`).
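As a sketch of the scripting that involves, assuming a volume named `BuildsDisk` (a hypothetical name), the device node could be looked up on each run like this:

```sh
#!/bin/sh
# Resolve the current device node for a named volume, since macOS
# /dev/diskN identifiers can change every time the disk is re-attached.
VOLUME="BuildsDisk"  # hypothetical volume name
DEV=$(diskutil info "$VOLUME" | awk '/Device Node:/ {print $3}')
# Give the current user ownership of the device node so Tart can
# open it without running as root.
sudo chown "$USER" "$DEV"
```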
I've also just realized that we need a way to tell the GitLab Tart Executor whether the `--builds-dir` and `--cache-dir` are on the host or in the guest to make this all work, so re-opening.
Thanks for this @edigaryev, I just tried it out and it's working great. I'm currently assessing the performance of a few different disk image types (sparse bundles, sparse images and regular disk images). I'll also try out your `truncate` suggestion. Initial performance numbers suggest at least one of the types will be good enough for our use case 🎉
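For reference, the three image types under test can be created with `hdiutil` along these lines (size, volume name and file names are illustrative):

```sh
# Sparse bundle: a growable directory of band files (builds.sparsebundle).
hdiutil create -size 50g -fs APFS -volname Builds -type SPARSEBUNDLE builds
# Sparse image: a single growable image file (builds.sparseimage).
hdiutil create -size 50g -fs APFS -volname Builds -type SPARSE builds
# Regular read/write disk image, fully allocated up front (builds.dmg).
hdiutil create -size 50g -fs APFS -volname Builds -type UDIF builds
```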
@jlsalmon it would be great if you could share your experience once it's working for you. It seems like a great piece of engineering you're working on!
So I found that the only method which has reasonable performance for my use case is directly attaching an APFS volume (either by physically partitioning the host disk, or using an external APFS-formatted storage device). I managed to achieve 60% of the host-native performance with this method on a 2020 M1 Mac mini running macOS 14.2.
File-based methods (sparsebundle, DMG, raw APFS-formatted file) were just too slow. The best was the sparsebundle, which achieved 25% of the host-native performance.
In my experience, it doesn't seem possible to get near-native performance when disk I/O is a dominant factor of the workload. I also tried XcodeBenchmark and could only reach a maximum of 65% of the host-native performance (again using an APFS volume attachment).
FYI @fkorotkov
Thank you for the data points! Can I ask how you ran XcodeBenchmark in a VM? I can't reproduce the 65% figure.
I use an M1 Mac mini with 16GB of memory, and without isolation `sh benchmark.sh` runs in 240-250 seconds. Inside a VM that matches the host resources (`tart set --cpu 8 --memory 16384 <VM-NAME>`) I'm getting around 260-270 seconds for the same script.
@fkorotkov on my 2020 M1 Mac mini with 8 CPUs/16GB memory, without isolation `sh benchmark.sh` takes between 160-170 seconds:

    $ git clone https://github.com/devMEremenko/XcodeBenchmark
    $ cd XcodeBenchmark
    $ sh benchmark.sh
    Preparing environment
    Running XcodeBenchmark...
    Please do not use your Mac while XcodeBenchmark is in progress
    ...snip...
    ** BUILD SUCCEEDED ** [168.887 sec]
    System Version: 14.2.1
    Xcode 15.2
    Hardware Overview
    Model Name: Mac mini
    Model Identifier: Macmini9,1
    Total Number of Cores: 8 (4 performance and 4 efficiency)
    Memory: 16 GB

Doing the same thing inside a fresh VM with matched host resources (not using any disk attachments) takes between 250-260 seconds:

    $ tart clone ghcr.io/cirruslabs/macos-sonoma-xcode:15.2 test
    $ tart set --cpu 8 --memory 16384 test
    $ tart run --no-graphics test &
    $ ssh admin@$(tart ip test)
    admin@admins-Virtual-Machine ~ % git clone https://github.com/devMEremenko/XcodeBenchmark
    admin@admins-Virtual-Machine ~ % cd XcodeBenchmark
    admin@admins-Virtual-Machine XcodeBenchmark % sh benchmark.sh
    Preparing environment
    Running XcodeBenchmark...
    Please do not use your Mac while XcodeBenchmark is in progress
    ...snip...
    ** BUILD SUCCEEDED ** [254.395 sec]
    System Version: 14.3
    Xcode 15.2
    Hardware Overview
    Model Name: Apple Virtual Machine 1
    Model Identifier: VirtualMac2,1
    Total Number of Cores: 8
    Memory: 16 GB

I did many runs on both host and guest. I get roughly 66% of host performance on the guest. I'm not sure why my host results are faster than yours. It's worth noting that this Mac mini is booted from a Samsung T5 external SSD, and is not using the internal drive. But I would expect that to make the benchmark run slower, so that makes it even more confusing 🤔
Would it be possible to add support to the GitLab executor for using block devices? Specifically, I would like to try using a block device as the builds/cache dirs (since virtiofs mounts via `--builds-dir` and `--cache-dir` don't currently work due to cirruslabs/tart#567). I'm not sure how this would work with the requirement to run Tart as root when attaching block devices, though.