Closed · wommy closed this issue 1 year ago
That's weird. The DEBIAN_RELEASE arg has a default, so I'm not sure why it would fail. I have never used the Debian convenience script (thanks for the tip), so I'll give it a go to see if I can reproduce this problem.
Just so you know, the build is still failing for the Proxmox kernel master branch (kernel 6.2), so I'm working on it (I'm down to uploading artifacts now). I'll probably re-run it for kernel 5.15 as well to make sure it still works.
Edit: BTW, you might want to consider running the build in a VM on your host, not on the host itself. I like to keep my host just for VMs and LXCs and run everything else off of VMs or LXCs.
@wommy, what docker did you end up installing? This is what I have:
$ docker -v
Docker version 20.10.21, build 20.10.21-0ubuntu1~22.04.3
Also, I would install docker properly with apt-get install docker.io to see if that helps (you might need to undo what the convenience script did first).
Lastly, do a git pull to get the latest changes to the build script.
Docker version 24.0.1, build 6802122
Yeah, go with apt-get install docker.io then.
That's weird. The DEBIAN_RELEASE arg has a default, so I'm not sure why it would fail. I have never used the Debian convenience script (thanks for the tip), so I'll give it a go to see if I can reproduce this problem.
Just so you know, the build is still failing for the Proxmox kernel master branch (kernel 6.2), so I'm working on it (I'm down to uploading artifacts now). I'll probably re-run it for kernel 5.15 as well to make sure it still works.
Edit: BTW, you might want to consider running the build in a VM on your host, not on the host itself. I like to keep my host just for VMs and LXCs and run everything else off of VMs or LXCs.
yeah this is what i did at first, but then i installed docker onto the host bc to me, that's what locally means
i think i'm going to just reinstall pve and then do this in a proper vm
it seems like you're working with an old version of docker
https://docs.docker.com/engine/install/ubuntu/#uninstall-old-versions
Older versions of Docker went by the names of docker, docker.io, or docker-engine; you might also have installations of containerd or runc. Uninstall any such older versions before attempting to install a new version:
I'm using whatever is available on Ubuntu 22.04, Debian 11 and Proxmox 7.4. It should be fine. I'm not doing anything fancy with it. I'd rather use the one that came with those distributions as it makes things simpler.
wom@kernel:~/pve-kernel-builder$ ./build.sh
Preparing container...
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/build?buildargs=%7B%22DEBIAN_RELEASE%22%3A%22%22%2C%22REPO_BRANCH%22%3A%22%22%2C%22REPO_URL%22%3A%22%22%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&shmsize=0&t=pve-kernel-build&target=&ulimits=null&version=1": dial unix /var/run/docker.sock: connect: permission denied
Building PVE kernel...
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create": dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.
wom@kernel:~/pve-kernel-builder$ sudo ./build.sh
Preparing container...
Sending build context to Docker daemon 312.3kB
Step 1/15 : ARG DEBIAN_RELEASE=bullseye
Step 2/15 : FROM debian:${DEBIAN_RELEASE}-slim
invalid reference format
Building PVE kernel...
Unable to find image 'pve-kernel-build:latest' locally
docker: Error response from daemon: pull access denied for pve-kernel-build, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
these are both the vm with
wom@kernel:~/pve-kernel-builder$ docker -v
Docker version 20.10.21, build 20.10.21-0ubuntu1~22.04.3
gonna try the container next
You need to add your user wom to the docker group so that it can talk to the Docker daemon:
usermod -aG docker wom
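A sketch of the full sequence (assuming sudo access; group changes only apply to new login sessions, which is easy to miss):

```shell
# Add user 'wom' (the user from the log above) to the docker group.
sudo usermod -aG docker wom

# Group membership is evaluated at login, so either log out and back
# in, or start a shell with the new group applied:
newgrp docker

# Verify it took effect:
id -nG | grep -qw docker && echo "docker group active"
```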
wom@kernel:~$ cd
.cache/ pve-kernel-builder/ .ssh/
wom@kernel:~$ cd pve-kernel-builder/
wom@kernel:~/pve-kernel-builder$ ./build.sh
Preparing container...
Sending build context to Docker daemon 312.3kB
Step 1/15 : ARG DEBIAN_RELEASE=bullseye
Step 2/15 : FROM debian:${DEBIAN_RELEASE}-slim
invalid reference format
Building PVE kernel...
Unable to find image 'pve-kernel-build:latest' locally
docker: Error response from daemon: pull access denied for pve-kernel-build, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
wom@kernel:~/pve-kernel-builder$ sudo ./build.sh
[sudo] password for wom:
Preparing container...
Sending build context to Docker daemon 312.3kB
Step 1/15 : ARG DEBIAN_RELEASE=bullseye
Step 2/15 : FROM debian:${DEBIAN_RELEASE}-slim
invalid reference format
Building PVE kernel...
Unable to find image 'pve-kernel-build:latest' locally
docker: Error response from daemon: pull access denied for pve-kernel-build, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
wom@kernel:~/pve-kernel-builder$
nope, it's trying to pull pve-kernel-build from docker hub
That's not it. You haven't been able to build a docker image yet:
[...]
Step 2/15 : FROM debian:${DEBIAN_RELEASE}-slim
invalid reference format
Not sure what this 'invalid reference format' is about.
You shouldn't need to use sudo for this either. Everything should be doable from a regular user account.
When docker build works, you'll end up with a local image called pve-kernel-build:latest. That's the image that will be used for the build itself.
That also made me realize that I've been using env vars all this time. Sorry about that. I'll work on a change to fix build.sh to use parameters instead.
If you want to try to get past this for now, just set these variables in your env directly:
export DEBIAN_RELEASE=bullseye
export REPO_URL=git://git.proxmox.com/git/pve-kernel.git
export REPO_BRANCH=master
Use branch master for kernel 6.2.16 or branch pve-kernel-5.15 for kernel 5.15.107. See if you can generate the docker image after that.
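Putting the interim workaround together (build.sh read these from the environment at the time, per the comment above; run from the repo root):

```shell
# Workaround: export the build args so build.sh picks them up from
# the environment, then run the build.
export DEBIAN_RELEASE=bullseye
export REPO_URL=git://git.proxmox.com/git/pve-kernel.git
export REPO_BRANCH=master    # or pve-kernel-5.15 for kernel 5.15.107
./build.sh
```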
Thanks!
@wommy, do a git pull to get the updates to build.sh. Let me know if you get the image built. I have to leave, but I'll be back online in a few hours.
a7b86d44cbf8e896e6a597a0a29c55692ec29a2b should fix this issue.
my disk keeps filling up, it's at 180GB. this is the fourth time i've run it. i thought docker was supposed to standardize differences
this clean script doesn't work
is there an option to use more memory? it uses way less than 10GB, close to 5 or 6
Docker caches intermediate images as you go, so if you're developing or having trouble getting what you want, it keeps accumulating things. Try these:
To see the images in your system: docker images
To remove an image: docker rmi <hash or name>
To see which containers are running: docker ps (or docker ps -a to show all containers, running or not)
To stop a container: docker stop <name | hash>
To remove a container: docker rm <name | hash>
Clean the containers and images you don't need as you go.
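If you'd rather not go image-by-image, there's also a one-shot cleanup. A sketch (destructive: it assumes you don't need any stopped containers, unused images, or build cache):

```shell
# Remove all stopped containers, unused networks, dangling build
# cache and (with -a) every image not used by a running container.
docker system prune -a
```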
this clean script doesn't work. is there an option to use more memory? it uses way less than 10GB, close to 5 or 6
I'm confused. Are you able to run it? What doesn't work? And what do you want to use more memory, the image creation or the compilation itself?
I'm not limiting the docker container in any way, so it's supposed to use all the cores and memory available in the system. Compilation here takes about 30-35min on a 32-core machine (vs. 2h40m in a 2-core GitHub worker VM). I haven't checked memory usage.
I'm filling up the physical disk, it's generating bogus files
the first time I run it, some error happens, and then the second time it says the disk is full. I could put the whole kernel in ram
you said to give it 40GB but i gave it 60 and it filled up, and now it's at 180 and it's giving me a 'disk full, no more space' error
My image is 100GB and that's enough. GitHub's is 80GB but with 31GB free; after cleaning it up a bit, it's also enough. All you need is ~40GB free.
What are the bogus files you're referring to? What's generating them? docker? The build process?
overlayfs junk, i deleted the vm - redoing it in a container first, then a vm
rm -rf /build/pve-kernel/build/ubuntu-kernel_tmp
# finalize
/sbin/depmod -b debian/pve-kernel-5.15.107-2-pve-relaxablermrr/ 5.15.107-2-pve-relaxablermrr
touch .headers_compile_mark
cp ubuntu-kernel/include/generated/compile.h debian/pve-headers-5.15.107-2-pve-relaxablermrr/usr/src/linux-headers-5.15.107-2-pve-relaxablermrr/include/generated/compile.h
install -m 0644 ubuntu-kernel/Module.symvers debian/pve-headers-5.15.107-2-pve-relaxablermrr/usr/src/linux-headers-5.15.107-2-pve-relaxablermrr
mkdir -p debian/pve-headers-5.15.107-2-pve-relaxablermrr/lib/modules/5.15.107-2-pve-relaxablermrr
ln -sf /usr/src/linux-headers-5.15.107-2-pve-relaxablermrr debian/pve-headers-5.15.107-2-pve-relaxablermrr/lib/modules/5.15.107-2-pve-relaxablermrr/build
touch .headers_install_mark
# Autogenerate blacklist for watchdog devices (see README)
install -m 0755 -d debian/pve-kernel-5.15.107-2-pve-relaxablermrr/lib/modprobe.d
ls debian/pve-kernel-5.15.107-2-pve-relaxablermrr/lib/modules/5.15.107-2-pve-relaxablermrr/kernel/drivers/watchdog/ > watchdog-blacklist.tmp
echo ipmi_watchdog.ko >> watchdog-blacklist.tmp
cat watchdog-blacklist.tmp|sed -e 's/^/blacklist /' -e 's/.ko$//'|sort -u > debian/pve-kernel-5.15.107-2-pve-relaxablermrr/lib/modprobe.d/blacklist_pve-kernel-5.15.107-2-pve-relaxablermrr.conf
rm -f debian/pve-kernel-5.15.107-2-pve-relaxablermrr/lib/modules/5.15.107-2-pve-relaxablermrr/source
rm -f debian/pve-kernel-5.15.107-2-pve-relaxablermrr/lib/modules/5.15.107-2-pve-relaxablermrr/build
touch .install_mark
dh_installdocs -A debian/copyright debian/SOURCE
dh_installchangelogs
dh_installman
dh_strip_nondeterminism
dh_compress
dh_fixperms
debian/rules fwcheck abicheck
make[2]: Entering directory '/build/pve-kernel/build'
make[2]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
debian/scripts/find-firmware.pl debian/pve-kernel-5.15.107-2-pve-relaxablermrr/lib/modules/5.15.107-2-pve-relaxablermrr >fwlist.tmp
mv fwlist.tmp fwlist-5.15.107-2-pve-relaxablermrr
checking fwlist for changes since last built firmware package..
if this check fails, add fwlist-5.15.107-2-pve-relaxablermrr to the pve-firmware repository and upload a new firmware package together with the 5.15.107-2-pve-relaxablermrr kernel
sort fwlist-previous | uniq > fwlist-previous.sorted
sort fwlist-5.15.107-2-pve-relaxablermrr | uniq > fwlist-5.15.107-2-pve-relaxablermrr.sorted
diff -up -N fwlist-previous.sorted fwlist-5.15.107-2-pve-relaxablermrr.sorted > fwlist.diff
rm fwlist.diff fwlist-previous.sorted fwlist-5.15.107-2-pve-relaxablermrr.sorted
done, no need to rebuild pve-firmware
debian/scripts/abi-generate debian/pve-headers-5.15.107-2-pve-relaxablermrr/usr/src/linux-headers-5.15.107-2-pve-relaxablermrr/Module.symvers abi-5.15.107-2-pve-relaxablermrr 5.15.107-2-pve-relaxablermrr
debian/scripts/abi-check abi-5.15.107-2-pve-relaxablermrr abi-prev-*
II: Checking ABI...
II: Different ABI's, running in no-fail mode
Reading symbols/modules to ignore...read 0 symbols/modules.
Reading new symbols (5.15.107-2-pve-relaxablermrr)...read 26321 symbols.
Reading old symbols...read 26321 symbols.
II: Checking for missing symbols in new ABI...found 0 missing symbols
II: Checking for new symbols in new ABI...found 0 new symbols
II: Checking for changes to ABI...
II: Done
make[2]: Leaving directory '/build/pve-kernel/build'
dh_strip -Npve-headers-5.15.107-2-pve-relaxablermrr -Npve-kernel-libc-dev
dh_makeshlibs
dh_shlibdeps
dh_installdeb
dh_gencontrol
dpkg-gencontrol: warning: package pve-headers-5.15.107-2-pve-relaxablermrr: substitution variable ${shlibs:Depends} unused, but is defined
dh_md5sums
dh_builddeb
dpkg-deb: building package 'linux-tools-5.15' in '../linux-tools-5.15_5.15.107-2_amd64.deb'.
dpkg-deb: building package 'pve-kernel-libc-dev' in '../pve-kernel-libc-dev_5.15.107-2_amd64.deb'.
dpkg-deb: building package 'linux-tools-5.15-dbgsym' in '../linux-tools-5.15-dbgsym_5.15.107-2_amd64.deb'.
dpkg-deb: building package 'pve-headers-5.15.107-2-pve-relaxablermrr' in '../pve-headers-5.15.107-2-pve-relaxablermrr_5.15.107-2_amd64.deb'.
dpkg-deb: building package 'pve-kernel-5.15.107-2-pve-relaxablermrr' in '../pve-kernel-5.15.107-2-pve-relaxablermrr_5.15.107-2_amd64.deb'.
make[1]: Leaving directory '/build/pve-kernel/build'
dpkg-genbuildinfo --build=binary
dpkg-genchanges --build=binary >../pve-kernel_5.15.107-2_amd64.changes
dpkg-genchanges: info: binary-only upload (no source code included)
dpkg-source --after-build .
dpkg-buildpackage: info: binary-only upload (no source included)
lintian pve-kernel-5.15.107-2-pve-relaxablermrr_5.15.107-2_amd64.deb
#lintian pve-headers-5.15.107-2-pve-relaxablermrr_5.15.107-2_amd64.deb
lintian linux-tools-5.15_5.15.107-2_amd64.deb
Exporting artifacts...
+ echo Exporting artifacts...
+ mkdir -p /build/output/artifacts
mkdir: cannot create directory /build/output/artifacts: Permission denied
+ cp linux-tools-5.15-dbgsym_5.15.107-2_amd64.deb linux-tools-5.15_5.15.107-2_amd64.deb pve-headers-5.15.107-2-pve-relaxablermrr_5.15.107-2_amd64.deb pve-kernel-5.15.107-2-pve-relaxablermrr_5.15.107-2_amd64.deb pve-kernel-libc-dev_5.15.107-2_amd64.deb /build/output/artifacts
cp: target /build/output/artifacts is not a directory
Exporting abi files from build to /build/output...
+ for d in build pve-kernel-*
+ [[ -d build ]]
+ echo Exporting abi files from build to /build/output...
+ cp build/abi-5.15.107-2-pve-relaxablermrr build/abi-blacklist build/abi-prev-5.15.107-2-pve /build/output
cp: cannot create regular file /build/output/abi-5.15.107-2-pve-relaxablermrr: Permission denied
cp: cannot create regular file /build/output/abi-blacklist: Permission denied
cp: cannot create regular file /build/output/abi-prev-5.15.107-2-pve: Permission denied
+ for d in build pve-kernel-*
+ [[ -d pve-kernel-5.15.107-2-pve-relaxablermrr_5.15.107-2_amd64.deb ]]
+ for d in build pve-kernel-*
+ [[ -d pve-kernel-libc-dev_5.15.107-2_amd64.deb ]]
like is there a debug line that can just dump all my errors to a log so i can share that?
It looks like you got a build. I was getting the same error. Did you pick up 2e8eb77a226d3935b68c3cec8e884e684bed1815 by any chance?
yeah i did, and this still happened, i think it needs to be reinserted or something 🤦🤦
also might be container things, it's unprivileged but i gave it keyctl, if you have any advice on options, i can run it back
i'm going to do the same with a vm here shortly before i head out for lunch, let it run while i'm gone
srsly is there a way to log errors to some kind of txt file
i also noticed some warnings during some step of the kernel build process, something like 1044 was longer than 1024, some C stuff i think
oh shit wait, did https://github.com/brunokc/pve-kernel-builder/commit/2e8eb77a226d3935b68c3cec8e884e684bed1815 make it into the dockerfile?
also might be container things, it's unprivileged but i gave it keyctl, if you have any advice on options, i can run it back
I'm doing my best to keep the containers unprivileged so they won't need any special setup.
srsly is there a way to log errors to some kind of txt file
a container's logs (or its stdout) can be accessed via docker logs <name | id>
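For the local build.sh runs, a generic shell pattern (not something built into the script) captures everything to a file while still printing it:

```shell
# Capture stdout and stderr of the build into build.log while still
# watching the output scroll by on the terminal.
./build.sh 2>&1 | tee build.log
```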
i also noticed some warnings during some step of the kernel build process, something like 1044 was longer than 1024, some C stuff i think
I saw that as well. They are coming from the sources. These are just warnings though, so I would not worry about them.
oh shit wait, did 2e8eb77 make it into the dockerfile?
Hmm, never mind. That commit went into the workflow itself, so it won't affect local builds via build.sh.
also have you tried act? for local github actions
I saw someone mentioning act, but I haven't tried it. Someone mentioned it was not exactly the same, and that threw me off a little bit.
At this point, the workflows seem to be working fine in GitHub VMs -- got two releases out yesterday by running the workflow twice, once per branch.
I want to make them work in self-hosted VMs as well, but I want that to work by having the runner in a container (I don't want to spin up a new VM for every workflow run, and I don't want workflows messing with the VM config). That's a bit more challenging since it involves a container for the runner, and if the workflow uses containers (which it does), that means creating new containers from within the runner container. I have that part working, but I got hit with file permission issues (like the workflow issue above) where a user in the runner container has to match a user in the build container 🤦. I'll chip at it slowly.
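For the permission mismatch, one common approach (a sketch of a general docker technique, not what the repo does today) is to run the build container as the invoking host user, so files written to the bind-mounted output directory come out owned by that user:

```shell
# Hypothetical invocation: map the host UID/GID into the container
# so artifacts under the mounted output dir aren't root-owned.
docker run --rm \
    --user "$(id -u):$(id -g)" \
    -v "$PWD/output:/build/output" \
    pve-kernel-build
```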
What have you been trying? Local builds or builds via workflow?
What have you been trying? Local builds or builds via workflow?
i've been spinning up containers, git clone, cd, and ./build.sh. i haven't spun up an actual vm in a few months, i forgot how cumbersome it is, so i'm working on a ubuntu-2204-server-cloudimg w/ cloud-init that i can just clone and ./build.sh
(I don't want to spin a new VM for every workflow run and I don't want workflows messing with the VM config)
this cloud server thing i'm setting up sounds exactly perfect for this, but i feel you, it's a pain. i was trying to get it up before lunch, but now it's hours later
I want to make them work in self-hosted VMs as well, but I want that to work by having the runner in a container (I don't want to spin up a new VM for every workflow run, and I don't want workflows messing with the VM config). That's a bit more challenging since it involves a container for the runner, and if the workflow uses containers (which it does), that means creating new containers from within the runner container. I have that part working, but I got hit with file permission issues (like the workflow issue above) where a user in the runner container has to match a user in the build container 🤦. I'll chip at it slowly.
i semi follow; i dunno, i really like the steps, but i don't see why they aren't broken up more? to me, they're just broken up enough that it's non-linear, yet i feel they could be broken up more, some type of uncanny valley of modules
i don't see why the workflow doesn't use the build script, or the build script the workflow, but then again i have no clue what i'm talking about in this regard specifically, but i know debugging nightmares
i installed act through github-cli, couldn't get it to work otherwise, and then it mega complained about the workflows, about them being named similarly or something. but then i read some docs and everyone does that and chains the workflows. i kinda get it, but only tangentially. i'd be interested in helping troubleshoot some of this too
all that aside,
i've been adding in my apt cache proxy to speed up builds by not fetching the same shit over and over. i'm wondering if there's something similar for docker, some kind of proxy cache outside the ct/vm that i could point to that would persist when i nuked it
the uncanny module valley, this is what i'm talking about: the script is very procedural, but i don't see why each step couldn't be split out, tested in isolation, and then chained. i could rant on this more after i get back
also, i have been doing this all over NFS as i just reinstalled PVE on a 64GB mSD card i had laying around, as i only just got the server a week or two ago. i still haven't gotten around to mucking with the zpool; with this 'new' server, i need to do a big hardware reorganization: a bunch of drives, pcie cards
ugh, when i bought this server, i didn't think i'd be compiling the kernel just to play some video games 🤣🤣😮‍💨😮‍💨
What have you been trying? Local builds or builds via workflow?
i've been spinning up containers, git clone, cd, and ./build.sh. i haven't spun up an actual vm in a few months, i forgot how cumbersome it is, so i'm working on a ubuntu-2204-server-cloudimg w/ cloud-init that i can just clone and ./build.sh
Alright. I've been concentrating on making the workflows work. Now that they do, I'll get build.sh in good shape.
i semi follow; i dunno, i really like the steps, but i don't see why they aren't broken up more? to me, they're just broken up enough that it's non-linear, yet i feel they could be broken up more, some type of uncanny valley of modules
That's probably my fault, at least in part. I'm no docker specialist, so I'm basing my stuff on examples and what makes sense to me at this time. The build is basically divided into a few macro steps.
Look at the steps in build-pve-kernel-container.yml and see if you can follow along: look under steps: and follow each step listed under the - name: entries.
i don't see why the workflow doesn't use the build script, or the build script the workflow, but then again i have no clue what i'm talking about in this regard specifically, but i know debugging nightmares
Yeah, I understand your point. The workflow thing is just for GitHub, though, so it doesn't get used when building locally. The part that is common, the kernel build itself, is re-used in the form of a docker image. The image is built the same way for workflows and local builds, which means the kernel build happens the same way in workflows and locally (build.sh).
i installed act through github-cli, couldn't get it to work otherwise, and then it mega complained about the workflows, about them being named similarly or something. but then i read some docs and everyone does that and chains the workflows. i kinda get it, but only tangentially. i'd be interested in helping troubleshoot some of this too
Again, probably my fault here too. I started in one direction, changed course and didn't clean up properly. Now that the main workflow is working I'll take a pass and clean things up a bit.
The chaining of workflows is used for re-usability, so you don't duplicate logic as much. It does make things a bit harder to follow though.
Hopefully it will be clearer once I remove the cruft and rename things.
all that aside,
i've been adding in my apt cache proxy to speed up builds by not fetching the same shit over and over. i'm wondering if there's something similar for docker, some kind of proxy cache outside the ct/vm that i could point to that would persist when i nuked it
That's an interesting idea. It could at least speed up development (it won't help regular runs as much, I think). I'm not aware of a docker image cache other than the one it builds locally (docker will cache an intermediate image after every step during the docker build phase). That's only useful if you're playing with your Dockerfile, though. Worth investigating to see if that cache can be set on another machine or container so it can persist.
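For pulls specifically, Docker does support a pull-through cache: run a registry mirror somewhere persistent and point the daemon at it. A sketch of the client-side /etc/docker/daemon.json, assuming a pull-through registry mirror is already running at registry-cache.local:5000 (a hypothetical host):

```json
{
  "registry-mirrors": ["http://registry-cache.local:5000"]
}
```

That caches upstream pulls like debian:bullseye-slim across container/VM rebuilds; it won't cache the locally built pve-kernel-build layers, which only ever live in the local build cache.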
the uncanny module valley, this is what i'm talking about: the script is very procedural, but i don't see why each step couldn't be split out, tested in isolation, and then chained. i could rant on this more after i get back
I assume by 'chaining' you mean having workflows call other workflows. That is a bit more cumbersome than I anticipated, and it carries a cost, so I'm only doing it when necessary. Besides, some of the steps in a workflow provide data to the next, and that's easier done when the steps are in the same workflow.
ugh, when i bought this server, i didn't think i'd be compiling the kernel just to play some video games 🤣🤣😮‍💨😮‍💨
Ah, the joys of passthrough in Proxmox 🤣🤣🤣
i installed docker on a fresh pve install via the debian convenience script
then i cloned the repo and tried to build it via the build script