freebsd / poudriere

Port/Package build and test system
https://github.com/freebsd/poudriere/wiki
BSD 2-Clause "Simplified" License

build perf: gather_distfiles redundant and not using pre-build mk cache #1035

Closed bdrewery closed 5 months ago

bdrewery commented 1 year ago

cc @mjguzik

bdrewery commented 1 year ago
14:53:38 <@        mjg>| 0              0  32669  32522 install -pS -m 0644 /poudriere/distfiles//rexima-1.4.tar.gz /poudriere/data/.m/head-default/01/portdistfiles//rexima-1.4.tar.gz
14:53:43 <@        mjg>| this is the sucker i'm talking about
14:53:54 <@        mjg>| extra points taken for //
14:54:25 <@        mjg>| wait it gets worse
14:54:26 <@        mjg>| 0              0  32669  32522 install -pS -m 0644 /poudriere/distfiles//rexima-1.4.tar.gz /poudriere/data/.m/head-default/01/portdistfiles//rexima-1.4.tar.gz
14:54:29 <@        mjg>| 0              0  32751  32522 install -pS -m 0644 /poudriere/data/.m/head-default/01/portdistfiles//rexima-1.4.tar.gz /poudriere/distfiles//rexima-1.4.tar.gz
14:54:32 <@        mjg>| the file is going back?
14:54:43 <@        mjg>| note in the run the distfile was already present  before the build
14:55:12 <@        mjg>| so it is liek 'leave it be mofo' situation
14:59:22 <@        mjg>| this would definitely explain stalls from fsync
15:01:01 <@        mjg>| lemme respin rexima build one more time
15:04:41 <@mjg> bdrewery: confirmed!
15:05:02 <@mjg> bdrewery: see for yourself
15:05:08 <@        mjg>| 1. /usr/local/share/dtrace-toolkit/execsnoop -J > execs
15:05:16 <@        mjg>| 2. poudriere bulk -j head audio/rexima
15:05:33 <@        mjg>| 3. grep install execs
15:05:42 <@        mjg>| 0              0  39159  38948 install -pS -m 0644 /poudriere/distfiles//rexima-1.4.tar.gz /poudriere/data/.m/head-default/01/portdistfiles//rexima-1.4.tar.gz
15:05:45 <@        mjg>| 0              0  39241  38948 install -pS -m 0644 /poudriere/data/.m/head-default/01/portdistfiles//rexima-1.4.tar.gz /poudriere/distfiles//rexima-1.4.tar.gz
15:06:02 <@        mjg>| the distfile was there all along
15:07:24 <@mjg> bdrewery: https://dpaste.com/HRP9MQSFB
[[[
grep rexima-1.4.tar.gz execs
0              0  39159  38948 install -pS -m 0644 /poudriere/distfiles//rexima-1.4.tar.gz /poudriere/data/.m/head-default/01/portdistfiles//rexima-1.4.tar.gz
38             0  39164  39162 /bin/sh /usr/ports/Mk/Scripts/do-fetch.sh rexima-1.4.tar.gz
38             0  39191  39189 /bin/sh /usr/ports/Mk/Scripts/do-fetch.sh rexima-1.4.tar.gz
38             0  39213  39189 /bin/sh /usr/ports/Mk/Scripts/checksum.sh rexima-1.4.tar.gz
38             0  39226  39213 awk -v alg=SHA256 -v file=rexima-1.4.tar.gz $1 == alg && $2 == "(" file ")" {print $4} /usr/ports/audio/rexima/distinfo
0              0  39241  38948 install -pS -m 0644 /poudriere/data/.m/head-default/01/portdistfiles//rexima-1.4.tar.gz /poudriere/distfiles//rexima-1.4.tar.gz
37         65534  39250  39248 /bin/sh /usr/ports/Mk/Scripts/do-fetch.sh rexima-1.4.tar.gz
37         65534  39273  39248 /bin/sh /usr/ports/Mk/Scripts/checksum.sh rexima-1.4.tar.gz
37         65534  39286  39273 awk -v alg=SHA256 -v file=rexima-1.4.tar.gz $1 == alg && $2 == "(" file ")" {print $4} /usr/ports/audio/rexima/distinfo
37         65534  39291  39290 /usr/bin/tar -xf /portdistfiles//rexima-1.4.tar.gz --no-same-owner --no-same-permissions
37         65534  39308  39300 /bin/sh -e -c echo "_LICENSE_DISTFILES=rexima-1.4.tar.gz" >> /wrkdirs/usr/ports/audio/rexima/work/.license-catalog.mk
]]]
15:07:27 <@        mjg>| not gud!
15:21:33 <@        mjg>| $ grep gettext poudriere-execsnoop-20h | grep install | head
15:21:33 <@        mjg>| 0              0  32079   8846 install -pS -m 0644 /poudriere/distfiles//gettext-0.21.1.tar.xz /poudriere/data/.m/head-default/01/portdistfiles//gettext-0.21.1.tar.xz
15:21:35 <@        mjg>| 0              0  46212  23027 install -pS -m 0644 /poudriere/distfiles//gettext-0.21.1.tar.xz /poudriere/data/.m/head-default/17/portdistfiles//gettext-0.21.1.tar.xz
15:21:38 <@        mjg>| 0              0  76442   8846 install -pS -m 0644 /poudriere/data/.m/head-default/01/portdistfiles//gettext-0.21.1.tar.xz /poudriere/distfiles//gettext-0.21.1.tar.xz
15:21:41 <@        mjg>| 0              0  76697  23027 install -pS -m 0644 /poudriere/data/.m/head-default/17/portdistfiles//gettext-0.21.1.tar.xz /poudriere/distfiles//gettext-0.21.1.tar.xz
15:22:00 <@        mjg>| i think fixing this will be a massive i/o win
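The round trip visible in the log above can also be checked mechanically. The following is only a sketch, not part of poudriere or the DTrace toolkit: `find_round_trips` is a hypothetical helper name, and it assumes raw execsnoop output (without the IRC prefixes) in which the command name is the fifth field and the source and destination paths are the last two fields, as in the paste above.

```shell
#!/bin/sh
# Hypothetical helper: report install(1) copies that were later reversed.
# Reads execsnoop-style lines on stdin. Assumes the command name is the
# fifth field and the source and destination are the last two fields.
find_round_trips() {
    awk '
    $5 == "install" && NF >= 7 {
        src = $(NF - 1)
        dst = $NF
        # If the opposite copy (dst -> src) was seen earlier,
        # this line sends the file back where it came from.
        if ((dst, src) in seen)
            print "round trip: " dst " -> " src " -> " dst
        seen[src, dst] = 1
    }'
}
```

With the capture from step 1 saved as `execs`, it would be run as `find_round_trips < execs`. Paths are compared as plain strings, so the doubled `//` only matches when both lines spell it the same way, which they do in the log above.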
mjguzik commented 1 year ago

So I wrote a bad patch just to check the difference:

gather_distfiles:

    if [ -f "${to}/${sub}/${d}" ]; then
        msg "not copying into ${to}/${sub}/${d}, already exists"
    else
        install -pS -m 0644 "${from}/${sub}/${d}" \
            "${to}/${sub}/${d}" || return 1
    fi

With this in place, the I/O problems almost disappeared. ZFS is now 2% of off-CPU time, whereas previously it was about half, along with an artificial delay it injected to let storage keep up.

that said, please fix correctly, kthx :)
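For anyone who wants to try the idea outside poudriere, here is a minimal standalone sketch of the same existence check. It is not the actual fix: `copy_distfile_if_needed` is a hypothetical name, the `cmp -s` content comparison is an extra assumption that the quick patch above does not make, and `install -p -m 0644` is used without `-S` so the sketch also runs with non-BSD `install` implementations.

```shell
#!/bin/sh
# Hypothetical helper, not poudriere code: copy a distfile only when the
# destination is missing or differs from the source.
copy_distfile_if_needed() {
    src=$1
    dst=$2
    # Assumption beyond the original patch: also compare contents, so a
    # stale or truncated destination still gets replaced.
    if [ -f "${dst}" ] && cmp -s "${src}" "${dst}"; then
        echo "skip: ${dst} already up to date"
        return 0
    fi
    # Stand-in for the `install -pS -m 0644` call quoted in the issue.
    install -p -m 0644 "${src}" "${dst}" || return 1
    echo "copied: ${src} -> ${dst}"
}
```

The comparison trades a read of both files for the avoided write, so whether it is worth keeping over the plain `-f` test would need measuring on a real build.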

mjguzik commented 11 months ago

@bapt I wrote the following, which should be a bare-minimum hack: https://people.freebsd.org/~mjg/patches/poudriere-spurious-install.diff (note the optional debug)

It built over 8k ports with no problems and avoided a lot of copies.

I would say this should go in if no actual fix lands in the foreseeable future.