vincentdephily / emlop

EMerge LOg Parser
GNU General Public License v3.0

Scaling with MAKEOPTS -jN -lM? #53

Open APN-Pucky opened 1 month ago

APN-Pucky commented 1 month ago

Is it possible to read the MAKEOPTS from the log and scale the predictions by the number of cores? Sometimes I use -j1 -l1 in the background and sometimes I use -j113 -l16 with distcc, resulting in different run times per build. Is it possible to separate them? The same goes for builds that have a binpkg and are therefore very quick.

emlop a shows many >1000% due to different install modes on my system.

vincentdephily commented 4 weeks ago

I wish I could say yes, and I'm always looking for ways to improve the prediction, but there are a lot of hurdles:

The only thing I can think of in your case is that you could configure a different emerge.log when you use distcc, and then tell emlop to use the best log for each case. If you try something like that, I'd love to hear about your experience, and maybe get some stats from emlop accuracy.

One thing that emlop could infer is whether parallel merges (from a single emerge command, or multiple) are ongoing, by checking whether events for different ebuilds are interleaved. Again, I'm not sure how usable that data would be for predicting build times, but gathering the data is the first step.
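
The interleaving check described above could be sketched roughly as follows. This is a minimal illustration, not emlop's code: the function name find_overlapping_merges is hypothetical, and the two regular expressions assume that merge start and completion lines in emerge.log look like the ones in the comments.

```python
import re

# Assumed emerge.log line shapes (start and completion of a merge):
#   1700000000:  >>> emerge (1 of 2) sys-apps/foo-1.0 to /
#   1700000100:  ::: completed emerge (1 of 2) sys-apps/foo-1.0 to /
START = re.compile(r"^(\d+):\s+>>> emerge \(\d+ of \d+\) (\S+) to ")
END = re.compile(r"^(\d+):\s+::: completed emerge \(\d+ of \d+\) (\S+) to ")


def find_overlapping_merges(lines):
    """Map each ebuild to the set of ebuilds whose merge overlapped with it.

    Hypothetical helper: an empty set means the merge ran alone.
    """
    active = set()   # ebuilds started but not yet completed
    overlaps = {}    # ebuild -> ebuilds seen in progress at the same time
    for line in lines:
        started = START.match(line)
        if started:
            pkg = started.group(2)
            overlaps.setdefault(pkg, set())
            for other in active:
                overlaps[pkg].add(other)
                overlaps[other].add(pkg)
            active.add(pkg)
            continue
        completed = END.match(line)
        if completed:
            active.discard(completed.group(2))
    return overlaps
```

A merge whose set is non-empty shared the machine with at least one other merge, so its duration could be flagged or weighted differently. Whether that actually improves predictions is the open question raised above.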

APN-Pucky commented 2 weeks ago

Thanks for the detailed answer. I have some ideas for trying to get that information, but very little time. emlop predict shows the current line of the output log, and sometimes there is something like gcc -jN in it; maybe that could be used? I haven't checked whether binary packages look different in the logs in a way that would let emlop infer that.

> The speedup you get from parallel compiles is anything but linear, it's unclear how emlop would use that info if it was available.

Simplest would be to just track build times separately per parallelization level and use the closest match for predictions. I guess most people have either a single-thread background compile or a fast all-core compile.
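
A rough sketch of that "closest match" idea, under a big assumption: the history dictionary below is keyed by (package, jobs), but emerge.log does not record the -jN level as far as this thread has established, so that key is hypothetical data that would have to come from somewhere else. The predict function and its arguments are made up for illustration.

```python
from statistics import median


def predict(history, package, jobs):
    """Predict a build time from the closest recorded parallelism level.

    history: {(package, jobs): [durations in seconds]}  (hypothetical data;
             the jobs level is NOT something emerge.log is known to provide)
    Returns None when the package has no recorded builds at all.
    """
    levels = [j for (p, j), durations in history.items()
              if p == package and durations]
    if not levels:
        return None
    # Pick the recorded -jN closest to the requested one; on a tie, prefer
    # the lower level, which gives the more pessimistic estimate.
    closest = min(levels, key=lambda j: (abs(j - jobs), j))
    return median(history[(package, closest)])
```

With builds recorded only at -j1 and -j16, a request for -j12 would fall back to the -j16 samples. No scaling factor is applied, which sidesteps the question of how build time scales between levels.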

kakra commented 2 weeks ago

I don't think that should be tried, it just won't ever scale correctly. You cannot expect a small package to be able to run 113 gcc processes in parallel, and even with big packages, it is unlikely to scale to that number due to inter-file dependencies. What should a scale factor be? Just measure a -j113 build and multiply by 113 for a -j1 build? That's actually not how it works. Additionally, you're using -l16, which makes things even more difficult.
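
To put a number on why multiplying by 113 does not work, here is a small illustration using Amdahl's law. That model is brought in here purely as an illustration and is not something emlop or portage uses; the parallel_fraction value is an assumption that differs per package and is not recorded anywhere, and the model ignores -l load limits and distcc overhead entirely.

```python
def amdahl_speedup(parallel_fraction, jobs):
    """Ideal speedup under Amdahl's law for a build where only
    `parallel_fraction` of the work can be spread over `jobs` processes.

    Illustrative model only; real builds also depend on load limits,
    I/O, and dependency ordering between files.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / jobs)
```

Even if 90% of a build were perfectly parallel, amdahl_speedup(0.9, 113) stays below 10, because the serial 10% (configure, linking, install) dominates once enough jobs are available.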

Also, some compiler processes do actually use multithreading, e.g. lto phases. This completely works against MAKEOPTS trying to run processes in parallel.

You should probably also look into something like EMERGE_DEFAULT_OPTS="--jobs=5 --load-average 8" so you can do parallel packages. Due to how many build systems work (long configure phase, low parallelism due to inter-file deps), there's great potential to save overall build time. I usually keep those numbers a little lower than my MAKEOPTS. It can overwhelm RAM usage if used wrongly.

You probably get better results by using different emerge.log files per MAKEOPTS configuration.

But I'd also prefer if emlop could tell binpkg merges and full merges apart. I'm using binpkg to cache packages so if a downgrade is needed, it simply re-uses the previous build. That leads to vastly shorter "build times". But luckily, emlop uses a median filter so this extreme "noise" is likely to be filtered away from predictions.
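
A quick illustration of why a median shrugs off those binpkg outliers while a plain average does not. The durations are made up, and this only shows the general statistical idea, not emlop's actual averaging code.

```python
from statistics import mean, median

# Made-up merge durations in seconds for one package: five full builds
# of about 30 minutes, plus two binpkg installs (12 s and 9 s).
durations = [1810, 1795, 12, 1840, 9, 1822, 1801]


def estimate_build_time(samples):
    """Median of past durations; a minority of outliers barely moves it."""
    return median(samples)


robust = estimate_build_time(durations)  # stays among the full-build times
naive = mean(durations)                  # dragged down by the binpkg merges
```

This only holds while binpkg merges are a minority of the samples. Once they make up more than half of the history, the median flips to the short durations instead.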

APN-Pucky commented 2 weeks ago

Scaling != linear scaling, and if I understand correctly it can be predicted per package: some can do -j100, others not. Nonetheless, I think tracking each of -j1, -j2, ..., -jN separately would be nice and probably more precise. Sorry for my short answer.

vincentdephily commented 2 weeks ago

Getting the level of parallelism for an ongoing merge isn't too hard: emlop could look at CPU utilization of the emerge processes (let's ignore the distcc usecase for now). It's also easy enough to get lots of info (USE, CFLAGS...) about the currently-installed package by looking into /var/db/pkg/, and with some work we could find the same info about an ongoing merge.
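
For the installed-package side, reading that info could look something like the sketch below. It assumes each installed package has a directory under /var/db/pkg/ containing small text files named after the variable (USE, CFLAGS, ...). The helper name read_build_info and its parameters are hypothetical.

```python
from pathlib import Path


def read_build_info(category_package_version, keys=("USE", "CFLAGS"),
                    vdb_root="/var/db/pkg"):
    """Return {key: value} for an installed package, e.g.
    read_build_info("dev-lang/python-3.12.0").

    Assumes one plain-text file per key inside the package's directory;
    keys without a file are simply left out of the result.
    """
    pkg_dir = Path(vdb_root) / category_package_version
    info = {}
    for key in keys:
        path = pkg_dir / key
        if path.is_file():
            info[key] = path.read_text().strip()
    return info
```

Note that this only describes the version that is installed right now. It says nothing about how earlier merges of the same package were built, which is the historical gap discussed next.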

The hard/impossible part is getting historical information, which is what predictions are based on. Was the python emerge from a month ago done with distcc? With parallel emerges? With binpkg? With USE=pgo? I don't know. I'd be happy to be proven wrong: can you find more useful historical info in your emerge.log or other places?

Have a look at 52204954417f38828e1b53e18329274f2b693c2e and def6d4215267d0575519a96a7d956c25d7267173. I was hoping to figure out portage-level parallelism, but I've given up on that for now because the error rate was too high. Again, I'd be happy to be proven wrong, feel free to pick up that branch and make it work.

Thank you both for this discussion. Even if it turns out we can't implement them, it's good to brainstorm ideas.