buildstream-migration / bst-staging

GNU Lesser General Public License v2.1

Zero-copy local cache #1347

Open Cynical-Optimist opened 4 years ago

Cynical-Optimist commented 4 years ago

See original issue on GitLab

In GitLab by [Gitlab user @lle-bout] on Jul 1, 2020, 09:22

Background

Over at freedesktop-sdk, we noticed a major chunk of the build time is spent on either network I/O for remote caching, or disk I/O for copying back and forth with the local cache server.

Task description

Suggested implementation: ideally, use https://virtio-fs.gitlab.io/ to share a cache folder between Docker containers (which both freedesktop-sdk and GitLab CI use), and handle concurrent access to that folder with POSIX file system locking.
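To make the locking part of the suggestion concrete, here is a minimal sketch of guarding a shared cache folder with a POSIX advisory lock. The `cache_lock` helper and the `.lock` file name are hypothetical and not part of BuildStream; whether such locks are honoured across containers depends on the shared file system and has not been verified here.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def cache_lock(cache_dir, exclusive=True):
    """Hold a POSIX advisory lock on a lock file inside cache_dir.

    The ".lock" file name is an assumption made for this sketch.
    """
    lock_path = os.path.join(cache_dir, ".lock")
    # O_RDWR so that both shared (read) and exclusive (write) locks are allowed.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until the lock can be acquired.
        fcntl.lockf(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)
```

A writer would wrap cache mutations in `with cache_lock(path):`, while readers could use `exclusive=False` so they do not block one another. These locks are advisory, so every process touching the folder has to cooperate.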

Acceptance Criteria

This is a medium-sized task that could be implemented in the short to mid term and could provide huge build performance improvements for projects like freedesktop-sdk that make heavy, continuous use of BuildStream.


Cynical-Optimist commented 4 years ago

In GitLab by [Gitlab user @lle-bout] on Jul 1, 2020, 09:29

changed the description

Cynical-Optimist commented 4 years ago

In GitLab by [Gitlab user @lle-bout] on Jul 1, 2020, 09:35

It seems that there already was progress towards that goal here: https://gitlab.com/BuildGrid/buildbox/buildbox-fuse

Cynical-Optimist commented 4 years ago

In GitLab by [Gitlab user @sstriker] on Jul 3, 2020, 00:43

Personally I think this issue is proposing a solution a bit too quickly.

we noticed a major chunk of the build time is spent on either network I/O for remote caching, or disk I/O for copying back and forth with the local cache server.

[Gitlab user @lle-bout], thanks for reporting this issue. Do you have any profiling data as to where/when time is being spent?

As you note, there is buildbox-fuse, which may be used by buildbox-casd for staging if so configured. The current implementation in the master branch already uses buildbox-run, which can leverage this functionality.

There will be other opportunities for optimization - please provide detailed data on the patterns you are seeing.

Cynical-Optimist commented 4 years ago

In GitLab by [Gitlab user @lle-bout] on Aug 13, 2020, 21:53

[Gitlab user @sstriker] It's okay, I was just proposing something. I need to collect more precise data, but current guesses based on monitoring tools such as netdata, iotop, and htop suggest that copying data in and out of the local cache server, as well as the remote cache server, is very expensive and wastes a lot of IOPS. Zero-copy local caching is one part of this; the other is fetching lazily from caches, which isn't done currently. Even if there's nothing to build, it has to copy all the cached data (from the local cache, or from the remote one if the local cache doesn't have it) to declare the build a success.
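As an illustration of what "zero-copy" could mean for a local cache, the sketch below stages a cached blob with a hard link instead of copying its bytes. This is only one possible technique, not a description of how BuildStream or buildbox-casd actually works; `add_to_cache` and `stage_zero_copy` are hypothetical names, and the approach assumes the cache and the staging directory live on the same file system.

```python
import hashlib
import os
import shutil


def add_to_cache(cache_dir, src_path):
    """Store a file under its SHA-256 digest and return the digest."""
    h = hashlib.sha256()
    with open(src_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    digest = h.hexdigest()
    dest = os.path.join(cache_dir, digest)
    if not os.path.exists(dest):
        # The only full copy happens once, when the blob enters the cache.
        shutil.copyfile(src_path, dest)
    return digest


def stage_zero_copy(cache_dir, digest, dest_path):
    """Expose a cached blob at dest_path without copying its contents."""
    # A hard link adds a directory entry for the same inode: no data I/O.
    os.link(os.path.join(cache_dir, digest), dest_path)
```

Because the staged path and the cache entry share one inode, anything that writes to the staged file would also modify the cached blob, so staged files have to be treated as read-only.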

Cynical-Optimist commented 4 years ago

In GitLab by [Gitlab user @sstriker] on Aug 14, 2020, 00:17

[Gitlab user @lle-bout] #1272, #1273, #1274, #1275 and actually only downloading blobs when necessary for the operation should help with this.

I've been aiming to have that be the default behavior in combination with RE. That is, only deal with Trees, or file references if you will, and not with the actual file content itself when not executing the build locally.
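A rough sketch of that idea, under the assumption that an artifact can be described as a list of file references (path plus content digest): deciding what to download then only needs the references, and blobs are fetched solely for the paths an operation actually reads. `FileRef` and `blobs_to_fetch` are hypothetical names for illustration and do not correspond to the BuildStream or Remote Execution API types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRef:
    """A file reference: where the file goes and which blob holds its content."""
    path: str
    digest: str


def blobs_to_fetch(tree, local_digests, needed_paths=()):
    """Return the digests that must be downloaded to read needed_paths.

    Files that no operation needs, or whose blobs are already present
    locally, cause no download at all.
    """
    wanted = set(needed_paths)
    return sorted(
        {
            ref.digest
            for ref in tree
            if ref.path in wanted and ref.digest not in local_digests
        }
    )
```

With an empty `needed_paths`, as when nothing is built locally, the result is an empty list, so an already cached build could be confirmed from the references alone.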