Open Quackdoc opened 4 months ago
I tried to compile the schedulers on my system, but my machine was quickly rendered unusable. No matter what I do, it uses all cores, and it looks like it's hard-coded to make_jobs = nproc. I'm not sure what all is compiled with the make_jobs flag, but that was the only thing I saw on a quick glance through.
If you are on Arch Linux, you can consider fetching it from the CachyOS repository for now. Compiling the scx-scheds is pretty demanding, yes.
Yeah, I plan to keep investigating it for various reasons. I hard-coded make_jobs to 8, which alleviated the problem a bit, but not much. Does CachyOS build from the AUR makepkg? It seems the packages have been misconfigured: they don't set --buildtype release (arch-meson defaults to --buildtype plain), which causes the Rust code to be compiled without the --release flag.
We use Arch's standard meson solutions. If you have your own, share them and then we'll take a look at it.
All you need to do, as far as I can tell right now, is add --buildtype release to all arch-meson calls. arch-meson is a simple exec script (https://gitlab.archlinux.org/archlinux/packaging/packages/meson/-/blob/main/arch-meson?ref_type=heads); the issue here is the --buildtype plain \ line. But with how meson parses arguments, it's enough to simply run arch-meson . build --buildtype release. Note that --buildtype release is more or less the same thing as -Dbuildtype=release, but --buildtype conflicts with -Dbuildtype and meson will throw an error. Why? No idea.
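As a concrete sketch, the override in a PKGBUILD would look like the following (the source directory name here is an assumption; meson's argument parser takes the last occurrence of --buildtype, so the appended flag wins over arch-meson's built-in --buildtype plain):

```shell
# Hypothetical PKGBUILD build() function; directory name "scx" is assumed.
build() {
  cd "$srcdir/scx"
  # The trailing --buildtype release overrides arch-meson's --buildtype plain,
  # so cargo is invoked with --release and the binaries are optimized.
  arch-meson . build --buildtype release
  meson compile -C build
}
```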
I've updated PKGBUILDs.
@Quackdoc
Build targets in project: 19
sched_ext schedulers 0.1.7
User defined options
auto_features : enabled
buildtype : release
libexecdir : lib
prefix : /usr
sbindir : bin
wrap_mode : nodownload
python.bytecompile: 1
b_pie : true
Everything looks correct and the package is smaller by about 20 MB.
So, the problem is the interaction between meson and cargo. meson thinks cargo is a regular command that takes up at most one CPU (like e.g. clang or gcc), but cargo takes up all CPUs in the machine. meson ends up running cargo for each Rust scheduler in parallel, so we end up with a mess. It's something we need to improve, but we haven't found a good solution in meson.
Is it possible to configure cargo to limit concurrency? I tried to compile this repo (with the PKGBUILD) and hit an 80 load average... on a 6-core machine. It also almost caused an OOM.
Cargo concurrency can be set, for sure. However, I'm wondering if cargo build is being run in parallel with the other compiles? I haven't looked into the build scripts yet, but if it is, running it in sequence would help a lot, especially considering that cargo build already threads quite well on its own.
This is just anecdotal evidence, but I've seen duplicate rustc processes in htop. My guess is that each scheduler is compiled in parallel. More evidence of this is the 80 load average being close to 12*6 (72).
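For reference, cargo does expose a jobs limit in several standard ways; these are stock cargo mechanisms, not anything specific to this repo:

```shell
# Cap cargo at 2 parallel compile jobs via the command-line flag:
cargo build --release --jobs 2

# Or via the environment, which is picked up by any cargo invocation,
# including ones spawned by meson custom targets:
CARGO_BUILD_JOBS=2 meson compile -C build

# Or persistently, in .cargo/config.toml:
#   [build]
#   jobs = 2
```

Note that none of these stop meson from launching several cargo processes at once; they only cap each individual cargo invocation.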
I have an idea to fix this issue, but I haven't tried it and it might not be optimal:
It might be possible to store the custom_target for each Rust scheduler in a variable, make each scheduler depend on the previously defined one, and add a final dummy custom target that depends on the last. This should create a chain of custom targets that compile in sequence.
Here is a patch to do exactly that:
diff --git a/scheds/rust/meson.build b/scheds/rust/meson.build
index 52b227a..01eeaa3 100644
--- a/scheds/rust/meson.build
+++ b/scheds/rust/meson.build
@@ -1,5 +1,14 @@
+sched = []
+
subdir('scx_layered')
subdir('scx_rusty')
subdir('scx_rustland')
subdir('scx_rlfifo')
subdir('scx_lavd')
+
+custom_target('rust_scheds',
+ input: 'meson.build',
+ output: '@PLAINNAME@.__PHONY__',
+ command: ['touch', '@PLAINNAME@.__PHONY__'],
+ build_by_default: true,
+ depends: sched)
diff --git a/scheds/rust/scx_lavd/meson.build b/scheds/rust/scx_lavd/meson.build
index 3604564..1f6ee8a 100644
--- a/scheds/rust/scx_lavd/meson.build
+++ b/scheds/rust/scx_lavd/meson.build
@@ -1,7 +1,8 @@
-custom_target('scx_lavd',
+sched = custom_target('scx_lavd',
output: '@PLAINNAME@.__PHONY__',
input: 'Cargo.toml',
command: [cargo, 'build', '--manifest-path=@INPUT@', '--target-dir=@OUTDIR@',
cargo_build_args],
env: cargo_env,
- build_by_default: true)
+ build_always_stale: true,
+ depends: [sched])
diff --git a/scheds/rust/scx_layered/meson.build b/scheds/rust/scx_layered/meson.build
index 039f258..f006a10 100644
--- a/scheds/rust/scx_layered/meson.build
+++ b/scheds/rust/scx_layered/meson.build
@@ -1,8 +1,8 @@
-custom_target('scx_layered',
+sched = custom_target('scx_layered',
output: '@PLAINNAME@.__PHONY__',
input: 'Cargo.toml',
command: [cargo, 'build', '--manifest-path=@INPUT@', '--target-dir=@OUTDIR@',
cargo_build_args],
env: cargo_env,
- depends: [libbpf, bpftool_target],
- build_by_default: true)
+ build_always_stale: true,
+ depends: [libbpf, bpftool_target, sched])
diff --git a/scheds/rust/scx_rlfifo/meson.build b/scheds/rust/scx_rlfifo/meson.build
index 24d3b64..fd1ffb9 100644
--- a/scheds/rust/scx_rlfifo/meson.build
+++ b/scheds/rust/scx_rlfifo/meson.build
@@ -1,8 +1,8 @@
-custom_target('scx_rlfifo',
+sched = custom_target('scx_rlfifo',
output: '@PLAINNAME@.__PHONY__',
input: 'Cargo.toml',
command: [cargo, 'build', '--manifest-path=@INPUT@', '--target-dir=@OUTDIR@',
cargo_build_args],
env: cargo_env,
- depends: [libbpf, bpftool_target],
- build_by_default: true)
+ build_always_stale: true,
+ depends: [libbpf, bpftool_target, sched])
diff --git a/scheds/rust/scx_rustland/meson.build b/scheds/rust/scx_rustland/meson.build
index 9d7c29c..a243771 100644
--- a/scheds/rust/scx_rustland/meson.build
+++ b/scheds/rust/scx_rustland/meson.build
@@ -1,8 +1,8 @@
-custom_target('scx_rustland',
+sched = custom_target('scx_rustland',
output: '@PLAINNAME@.__PHONY__',
input: 'Cargo.toml',
command: [cargo, 'build', '--manifest-path=@INPUT@', '--target-dir=@OUTDIR@',
cargo_build_args],
env: cargo_env,
- depends: [libbpf, bpftool_target],
- build_by_default: true)
+ build_always_stale: true,
+ depends: [libbpf, bpftool_target, sched])
diff --git a/scheds/rust/scx_rusty/meson.build b/scheds/rust/scx_rusty/meson.build
index 0510756..b896ce7 100644
--- a/scheds/rust/scx_rusty/meson.build
+++ b/scheds/rust/scx_rusty/meson.build
@@ -1,8 +1,8 @@
-custom_target('scx_rusty',
+sched = custom_target('scx_rusty',
output: '@PLAINNAME@.__PHONY__',
input: 'Cargo.toml',
command: [cargo, 'build', '--manifest-path=@INPUT@', '--target-dir=@OUTDIR@',
cargo_build_args],
env: cargo_env,
- depends: [libbpf, bpftool_target],
- build_by_default: true)
+ build_always_stale: true,
+ depends: [libbpf, bpftool_target, sched])
@takase1121 did you test whether this actually works? If so, do you mind creating a PR?
I'll test it ASAP. Lemme check @htejun
@takase1121 Could you send a PR, please? I tested your patch and it seems that everything works OK. The load during compilation is significantly reduced and the compilation time increased only slightly.
@takase1121 I see a potential issue with this, trying to build a single scheduler also triggers the build of all the others, for example (from a fresh repo):
> meson compile -C build scx_rustland
INFO: autodetecting backend as ninja
INFO: calculating backend command to run: /usr/bin/ninja -C /home/arighi/src/scx/build scheds/rust/scx_rustland/Cargo.toml.__PHONY__
ninja: Entering directory `/home/arighi/src/scx/build'
[2/5] Generating scheds/rust/scx_layered/scx_layered with a custom command (wrapped by meson to set env)
...
So building a single target is now much slower than before. I'm wondering if it'd be possible to make the chain of dependencies optional by passing a special build option or something similar...
That's an issue I've just realized now. I don't think there's a lot that can be done about it. Another issue is that the build process technically no longer obeys the meson job count, since cargo just does whatever it wants.
I think that limiting cargo concurrency is the way to go. We can revert my PR and implement the other approach. I don't have time right now to fix this, and it might take me another week to come up with something and test it. Maybe someone else can do it.
cargo has --jobs support, but even with that, cargo is actually being called multiple times in parallel, so that's at least part of the solution.
I don't think we should revert this change; preventing almost a fork-bomb when running meson compile is already an improvement.
But I'm not really convinced that limiting cargo concurrency is the way to go, again, that would slow down the build of a single scheduler.
IMHO the best would be to limit meson concurrency instead (without adding extra dependencies) and rely on the "backend" concurrency level of cargo. In this way we won't slow down the build of single targets and we could still prevent the massive fork-bomb build.
Basically we should convince meson to behave like make: spawn a single build task by default, unless a certain level of parallelism is specified (-j N).
I don't really understand why meson does it the other way around... and apparently there's no way to change this default behavior in the meson files. Apparently the only way is to explicitly add -j 1 to the command line... or maybe there is a way and I haven't found it.
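For completeness, capping the build-system side of the parallelism from the command line looks like this (standard meson and ninja flags):

```shell
# Tell meson's backend to run at most one job at a time,
# leaving all parallelism to cargo itself:
meson compile -C build -j 1

# Equivalent when invoking the ninja backend directly:
ninja -C build -j 1
```

The downside, as noted above, is that this has to be passed on every invocation; it can't be set as a default inside the meson files.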
Well, there is always https://mesonbuild.com/Release-notes-for-1-3-0.html#automatic-fallback-to-cmake-and-cargo-subproject , it's probably not good, but just saying...
The main problem here is that we are working with a meta build system. It's not really up to meson to set concurrency, because ninja is the one running everything. I don't think there's a way to disable this at all. Maybe cmake supports it with job pools (ref), but it'll be a miracle if meson ever gets something similar.
Another workaround I can see is to add each scheduler to a list and then iterate over the list to create the custom_target chain with the [last_scheduler, current_scheduler] dependency, but this will require a bit more tinkering because apparently you can't get a custom_target's name. This is a workable solution, but it might be a bit uglier than the current one.
For example, scx_lavd/meson.build:
# revert to original deps, keep the build_always_stale
scheds += custom_target(...)
rust/meson.build:
# you'll need input and output files here, since they're required for ALL custom_targets.
last_dummy = []
foreach sched : scheds
    # figure out how to get target_name, or store the names in another list
    last_dummy = custom_target('wrapper_' + target_name, build_always_stale: true, depends: [last_dummy, sched])
endforeach
custom_target('finally_all_sched', ..., depends: [last_dummy])
Well, good luck with that.