vmware / photon

Minimal Linux container host
https://vmware.github.io/photon

Make of Photon OS image fails #1370

Closed dcasota closed 1 year ago

dcasota commented 1 year ago

Describe the bug

Hi,

Building a Photon OS 4.0 image fails with:

Fails Some packages failed:
['nss-3.72']
Failed during building packages

Building a Photon OS 3.0 image fails with:

Could not find spec for createrepo
Could not find spec for linux-dtb-ls1012afrwy
Could not find spec for linux-dtb-rpi
Could not find spec for u-boot

Can you provide a recipe to make it work?

Reproduction steps

  1. Provision Photon OS photon-hw13_uefi-3.0-9355405.ova
  2. Make sure the disk has enough capacity, e.g. 64GB.
  3. Run the following script. The first part works flawlessly.

    # repo
    if ! grep -q "packages.vmware.com/photon" /etc/yum.repos.d/photon.repo; then
      cd /etc/yum.repos.d/
      sed -i 's/dl.bintray.com\/vmware/packages.vmware.com\/photon\/$releasever/g' photon.repo photon-updates.repo photon-extras.repo photon-debuginfo.repo
    fi
    # install packages
    tdnf distro-sync -y
    tdnf install -y kpartx git bc build-essential createrepo_c texinfo wget python3-pip tar dosfstools cdrkit linux-secure-devel
    reboot

    This part modifies a parameter of kernel 4.19.261-1.ph3-secure. Works.

    # modify CONFIG_MEDIA_SUPPORT
    cp /usr/src/linux-headers-4.19.261-1.ph3-secure/.config /usr/src/linux-headers-4.19.261-1.ph3-secure/.config.0
    sed "s/# CONFIG_MEDIA_SUPPORT is not set/CONFIG_MEDIA_SUPPORT=y/" /usr/src/linux-headers-4.19.261-1.ph3-secure/.config.0 > /usr/src/linux-headers-4.19.261-1.ph3-secure/.config

    This part installs the latest Photon OS installer. Works.

    pip3 install docker==2.3.0
    cd /
    pip3 install git+https://github.com/vmware/photon-os-installer.git

    This is the code snippet to build Photon OS 4.0. make fails at the nss-3.72 package.

    git clone -b 4.0 https://github.com/vmware/photon.git
    cd /photon
    make -j4 image IMG_NAME=iso
    # Fails Some packages failed:
    # ['nss-3.72']
    # Failed during building packages

The similar code snippet to build Photon OS 3.0 fails with "Could not find spec for" createrepo, linux-dtb-ls1012afrwy, linux-dtb-rpi3 and u-boot.

    # git clone -b 3.0 https://github.com/vmware/photon.git
    # cd /photon
    # make -j4 image IMG_NAME=iso

Expected behavior

A working, up-to-date recipe for building a Photon OS ISO image.

Additional context

No response

Vasavisirnapalli commented 1 year ago

@gpiyush-dev

sshedi commented 1 year ago

@dcasota can you share the nss.log from stage/LOGS/nss-<version> directory?

Also, can you share the build logs from tty when you build with make nss THREADS=4 LOGLEVEL=debug ?
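The log location mentioned above can be derived from an entry in the failure list. The following is a minimal sketch, not part of the Photon build system: `pkg_log_path` is a hypothetical helper, and it assumes that the directory under stage/LOGS is named exactly like the `<name>-<version>` entry (e.g. nss-3.72) and that the log file is `<name>.log`.

```shell
# Sketch: derive the expected log path for a failed package.
# Assumption: the directory under stage/LOGS matches the "<name>-<version>"
# entry from the failure list and the log file is "<name>.log".
pkg_log_path() {
    logs_dir=$1
    pkg=$2                 # e.g. nss-3.72
    name=${pkg%-*}         # strip the trailing "-<version>", e.g. nss
    printf '%s/%s/%s.log\n' "$logs_dir" "$pkg" "$name"
}

# Usage: show the tail of the log for a failed package, if it exists.
log=$(pkg_log_path stage/LOGS nss-3.72)
if [ -f "$log" ]; then
    tail -n 50 "$log"
fi
```

If the real directory name carries an extra release suffix, the path has to be adjusted by hand.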

dcasota commented 1 year ago

noooh, stupid me .. I've forgotten to increase RAM ... nss.log

btw. awesome update of photon-os-installer content !

sshedi commented 1 year ago

No problem. Yes, it's an OOM error; please increase the RAM size and things should go back to normal. Have a nice weekend. Thanks.
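To spot this kind of failure quickly, the build logs can be searched for typical out-of-memory messages. This is only a sketch: `find_oom_logs` is a hypothetical helper, and the pattern list is an assumption that is certainly not exhaustive.

```shell
# Sketch: list log files under a directory that contain typical
# out-of-memory messages. The patterns are assumptions, not a complete list.
find_oom_logs() {
    logs_dir=$1
    # -r recurse, -l print file names only, -E extended regex, -i ignore case.
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -rlEi 'out of memory|cannot allocate memory|virtual memory exhausted|killed signal' \
        "$logs_dir" 2>/dev/null || true
}

# Usage: prints nothing if stage/LOGS does not exist or has no matches.
find_oom_logs stage/LOGS
```

The kernel log (e.g. `dmesg`) may also show whether the OOM killer was involved.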

dcasota commented 1 year ago

Hi @Vasavisirnapalli @sshedi,

With the current docs alone, it is difficult to configure a provisioned Photon OS host environment for building your own ISO of the latest Photon OS.

The OOM issue is gone, but there were other issues depending on the host environment and the specific point in time of the build. They all seem fixable; I'm still running some build tests though.

My suggestion is to add a sub chapter "Building your home ISO for Photon OS latest". I've therefore added pull request #1376. Could someone review it and point out defects?

I didn't add a description of how to configure a host environment with the latest build system, e.g. Ubuntu 22.x, etc. To keep the two apart, the pull requests #1373 ++ were modified to start with "On Ubuntu,".

Hope it helps. Daniel

sshedi commented 1 year ago

Thanks @dcasota for the suggestions. We will take them; we need to figure out a few things, but your changes in the other PRs LGTM.

dcasota commented 1 year ago

Strange, during the building process nodejs-18.10.0 fails. nodejs.log output extract:

[2718/3775] cd ../../tools/v8_gypfiles; /usr/src/photon/BUILD/node-18.10.0/out/Release/mksnapshot --turbo_instruction_scheduling "--target_os=linux" "--target_arch=x64" --startup_src /usr/src/photon/BUILD/node-18.10.0/out/Release/obj/tools/v8_gypfiles/v8_snapshot.gen/snapshot.cc --embedded_variant Default --embedded_src /usr/src/photon/BUILD/node-18.10.0/out/Release/obj/tools/v8_gypfiles/v8_snapshot.gen/embedded.S --no-native-code-counters
FAILED: obj/tools/v8_gypfiles/v8_snapshot.gen/snapshot.cc obj/tools/v8_gypfiles/v8_snapshot.gen/embedded.S
cd ../../tools/v8_gypfiles; /usr/src/photon/BUILD/node-18.10.0/out/Release/mksnapshot --turbo_instruction_scheduling "--target_os=linux" "--target_arch=x64" --startup_src /usr/src/photon/BUILD/node-18.10.0/out/Release/obj/tools/v8_gypfiles/v8_snapshot.gen/snapshot.cc --embedded_variant Default --embedded_src /usr/src/photon/BUILD/node-18.10.0/out/Release/obj/tools/v8_gypfiles/v8_snapshot.gen/embedded.S --no-native-code-counters

<--- Last few GCs --->

<--- JS stacktrace --->

#
# Fatal javascript OOM in MemoryChunk allocation failed during deserialization.
#

/bin/sh: line 1: 28474 Trace/breakpoint trap   (core dumped) /usr/src/photon/BUILD/node-18.10.0/out/Release/mksnapshot --turbo_instruction_scheduling "--target_os=linux" "--target_arch=x64" --startup_src /usr/src/photon/BUILD/node-18.10.0/out/Release/obj/tools/v8_gypfiles/v8_snapshot.gen/snapshot.cc --embedded_variant Default --embedded_src /usr/src/photon/BUILD/node-18.10.0/out/Release/obj/tools/v8_gypfiles/v8_snapshot.gen/embedded.S --no-native-code-counters

[...]

[2725/3775] cc -MMD -MF obj/deps/openssl/openssl/crypto/cms/openssl.cms_env.o.d -DV8_DEPRECATION_WARNINGS -DV8_IMMINENT_DEPRECATION_WARNINGS -D_GLIBCXX_USE_CXX11_ABI=1 -DNODE_OPENSSL_CONF_NAME=nodejs_conf -DNODE_OPENSSL_HAS_QUIC -D__STDC_FORMAT_MACROS -DOPENSSL_NO_PINSHARED -DOPENSSL_THREADS -DOPENSSL_NO_HW -DOPENSSL_API_COMPAT=0x10100001L -DSTATIC_LEGACY -DNDEBUG -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_BUILDING_OPENSSL -DAES_ASM -DBSAES_ASM -DCMLL_ASM -DECP_NISTZ256_ASM -DGHASH_ASM -DKECCAK1600_ASM -DMD5_ASM -DOPENSSL_BN_ASM_GF2m -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DPADLOCK_ASM -DPOLY1305_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DX25519_ASM -DOPENSSL_PIC '-DMODULESDIR="/usr/src/photon/BUILD/node-18.10.0/out/out/Release/obj/lib/openssl-modules"' '-DOPENSSLDIR="/etc/ssl"' '-DENGINESDIR="/dev/null"' -DTERMIOS -I../../deps/openssl/openssl -I../../deps/openssl/openssl/include -I../../deps/openssl/openssl/crypto -I../../deps/openssl/openssl/crypto/include -I../../deps/openssl/openssl/crypto/modes -I../../deps/openssl/openssl/crypto/ec/curve448 -I../../deps/openssl/openssl/crypto/ec/curve448/arch_32 -I../../deps/openssl/openssl/providers/common/include -I../../deps/openssl/openssl/providers/implementations/include -I../../deps/openssl/config -I../../deps/openssl/config/archs/linux-x86_64/asm -I../../deps/openssl/config/archs/linux-x86_64/asm/include -I../../deps/openssl/config/archs/linux-x86_64/asm/crypto -I../../deps/openssl/config/archs/linux-x86_64/asm/crypto/include/internal -I../../deps/openssl/config/archs/linux-x86_64/asm/providers/common/include -pthread -Wall -Wextra -Wno-unused-parameter -m64 -Wa,--noexecstack -Wall -O3 -pthread -m64 -Wall -O3 -Wno-missing-field-initializers -Wno-old-style-declaration -O3 -fno-omit-frame-pointer   -c ../../deps/openssl/openssl/crypto/cms/cms_env.c -o obj/deps/openssl/openssl/crypto/cms/openssl.cms_env.o
ninja: build stopped: subcommand failed.
error: Bad exit status from /var/tmp/rpm-tmp.gljjMB (%build)

Any advice on the mksnapshot issue?

dcasota commented 1 year ago

FYI: the build of a Photon 4.0 x86_64 (generic flavor) ISO finishes (photon-4.0-80a510a39.iso), but there are issues. It is not clear whether they can be ignored:

Build system is Photon OS 4.0 5.10.152-2.ph4 on an Azure virtual machine Standard E4s v3 offer with 100GB disk space.

Main.log SpecDeps.log Serializable Spec objects.log SpecData.log PackageInfo.log PackageManager.log

The ISO has been copied with Rufus 3.20 to a USB stick. Booting from the USB media stops with

error: bad shim signature.
error: you need to load the kernel first.

Press any key to continue...

The cmdline configuration shows up as follows: [image]

dcasota commented 1 year ago

Same result "bad shim signature" after make of Photon 4.0 x86_64 photon-4.0-8e2e32a39.iso and photon-4.0-66e4f0774.iso.

Is applying the recipe FAQ necessary?

This time I didn't start from scratch, but updated the build source.

    cd /photon
    git fetch
    git merge origin/4.0
    make -j8 image IMG_NAME=iso

dcasota commented 1 year ago

Any update on this?

Same result "bad shim signature" after building Photon 4.0 x86_64 photon-4.0-37786894d.iso, which included the following incremental set of packages.

List of packages yet to be built...
{'linux-secure-5.10.152', 'libxkbcommon-1.4.1', 'font-util-1.3.2', 'libXfixes-5.0.3', 'util-macros-1.19.0', 'graphene-1.10.8', 'sysdig-0.27.0', 'at-spi2-core-2.45.91', 'libXinerama-1.1.5', 'GConf-3.2.5', 'cairo-1.17.2', 'libXfont2-2.0.3', 'libXdamage-1.1.5', 'libglvnd-1.4.0', 'libepoxy-1.5.10', 'atk-2.38.0', 'ktap-0.4', 'xorg-applications-7.7', 'libXi-1.7.4', 'mesa-22.1.1', 'libXScrnSaver-1.2.3', 'linux-aws-5.10.152', 'libXtst-1.2.3', 'systemd-247.11', 'linux-rt-5.10.152', 'proto-7.7', 'pango-1.41.1', 'gdk-pixbuf-2.42.0', 'libXcursor-1.2.1', 'gtk3-3.23.3', 'libXcomposite-0.4.5', 'ansible-posix-1.4.0', 'cloud-init-22.4.2', 'linux-5.10.152', 'xorg-fonts-7.7', 'gst-plugins-bad-1.17.1', 'gstreamer-plugins-base-1.17.1', 'shared-mime-info-2.2', 'ansible-2.12.7', 'gstreamer-1.17.1', 'ansible-community-general-5.6.0', 'linux-esx-5.10.152', 'libfontenc-1.1.2', 'rabbitmq-server-3.11.0', 'cups-2.2.7', 'falco-0.30.0', 'fribidi-1.0.9'}

dcasota commented 1 year ago

The llvm build temporarily uses ~70GB of disk space, and at the end of the ISO build a message appeared on screen about splitting the llvm .rpm package at 4GB (??). This doesn't happen anymore thanks to https://github.com/vmware/photon/commit/218634e3615ce80a94ac19a1b8694505537c0166 @michellew-vmware @gpiyush-dev !
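Given the disk usage noted above, a free-space check before starting the build can save a long failed run. This is a minimal sketch: `has_free_space` is a hypothetical helper, and the 70 GiB threshold simply mirrors the llvm figure from this comment rather than a documented requirement.

```shell
# Sketch: succeed only if the filesystem holding DIR has at least MIN_GIB free.
has_free_space() {
    dir=$1
    min_gib=$2
    # df -Pk prints POSIX-format output in 1 KiB blocks;
    # field 4 of the second line is the available space.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $((min_gib * 1024 * 1024)) ]
}

# Usage: warn before starting a build in the current directory.
if ! has_free_space . 70; then
    echo "warning: less than 70 GiB free in $(pwd)" >&2
fi
```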

edited: December 4th 2022: "bad shim signature" happens on the dev branch as well (photon-4.0-fa8025b15.iso). How can this be fixed? A positive detail: the console output during make is much more comprehensive with a higher number of workers, e.g. make -j64 image IMG_NAME=iso.

dcasota commented 1 year ago

Currently I'm stuck on the 'bad shim signature' issue.

Not on 4.0, but on the origin/dev branch, making an ISO fails while building some packages (openjdk11, guile3, nodejs, powershell, ...), see dev-build.txt. A recipe for solving this would be great as well!

dcasota commented 1 year ago

Why does make of these packages fail? It happens on a Ph4 Rev 2 machine for origin/5.0 as well. ['openjdk11-11.0.12', 'openjdk17-17.0.5', 'docker-19.03.15', 'guile3-3.0.8', 'powershell-7.3.0', 'nodejs-18.10.0']

michellew-vmware commented 1 year ago

Could you please share some information with us?

  1. which commit do you use to build the iso?
  2. the whole build command you use?
  3. as for your local build vm, which kind of os do you use?

Thanks, Michelle

dcasota commented 1 year ago

Hi Michelle,

I'm testing several configurations.

Here is the information for the latest incremental test build run:

  1. https://github.com/vmware/photon/commit/db473d664e4708eda8dbe85f9adbdd1449e62bc4
  2. make -j8 image IMG_NAME=iso THREADS=8
  3. Linux ph01 5.10.159-3.ph4-secure #1-photon SMP Sat Jan 14 03:01:31 UTC 2023 x86_64 GNU/Linux, running in an Azure VM Standard_e4s_v3 (4 vCPUs, 32 GiB RAM), genv2

    cd /photon
    git fetch
    git merge origin/5.0
    make -j8 image IMG_NAME=iso THREADS=8

Standard_e4s_v3_genv2_5.10.159-3.ph4-secure_db473d6_202301171943.txt

michellew-vmware commented 1 year ago

I've notified our devs to take a look. Will give you an update later.

sshedi commented 1 year ago

@dcasota use this and try make image IMG_NAME=iso THREADS=2

How much RAM and how many CPU cores have you assigned to your build machine? My suggestion is to use the maximum available number of vCPUs and 64GB of vRAM.

"Why does make of these packages fail?" This info is insufficient. You should also share the logs for the failed packages from the stage/LOGS directory.

Yes, there are a few packages, like k8s and openjdk, that fail intermittently, but restarting the build will fix it.
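Restarting after an intermittent failure can be automated with a small wrapper. This is a sketch under stated assumptions: `retry` is a hypothetical helper, not something the Photon build system provides, and it blindly reruns the command without checking why it failed.

```shell
# Sketch: rerun a command up to MAX times until it succeeds.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed: $*" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage (not executed here):
#   retry 3 make image IMG_NAME=iso THREADS=2
```

A failure that repeats on every attempt is probably not intermittent, and the package logs are the next place to look.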

sshedi commented 1 year ago

Or don't use threads at all if you are building on a resource-limited system. Change threads: 1 here https://github.com/vmware/photon/blob/4.0/build-config.json#L6
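One way to turn this advice into a starting value for THREADS is to cap the thread count by both CPU count and memory. This is only a sketch: `suggest_threads` is a hypothetical helper, and the default budget of 8 GiB per thread is an assumption for illustration, not a figure from the Photon docs or from this thread.

```shell
# Sketch: pick a THREADS value limited by CPU count and by memory.
# Assumption: GIB_PER_THREAD (default 8) is an illustrative budget only.
suggest_threads() {
    cpus=$1
    mem_gib=$2
    gib_per_thread=${3:-8}
    by_mem=$((mem_gib / gib_per_thread))
    n=$cpus
    if [ "$by_mem" -lt "$n" ]; then n=$by_mem; fi
    if [ "$n" -lt 1 ]; then n=1; fi     # never go below one thread
    echo "$n"
}

# Usage on Linux (not executed here):
#   mem_gib=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo)
#   make image IMG_NAME=iso THREADS="$(suggest_threads "$(nproc)" "$mem_gib")"
```

If packages still fail with the suggested value, lowering it further, down to 1, follows the recommendation above.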

dcasota commented 1 year ago

Thanks @michellew-vmware @sshedi

Here are the logs:

openjdk11-11.0.12 : openjdk11.log build-openjdk11-11.0.12.log

openjdk17-17.0.5 : openjdk17.log build-openjdk17-17.0.5.log

docker-19.03.15 : docker.log build-docker-19.03.15.log

guile3-3.0.8 : guile3.log build-guile3-3.0.8.log

powershell-7.3.0 : powershell.log build-powershell-7.3.0.log

nodejs-18.10.0 : nodejs.log build-nodejs-18.10.0.log

The Azure x86_64 system mentioned has been upgraded to Standard_E8s_v3 (8 vCPUs, 64 GiB RAM). Reran with make -j1 image IMG_NAME=iso THREADS=1. Same issue(s).

If I've understood you correctly, starting with little RAM and multiple threads, then adding RAM and lowering the thread count as needed, is not a supported approach, because it raises the probability of package builds that remain unfixable even after a reboot and restart. Successfully building 942 packages just means that those packages are thread-safe. THREADS=4 is the recommended limit because the problem with higher thread counts is well known. Hence there are 5 package thread issues and 1 configuration detection issue ("non-fixable packages build even after reboot+restart").

I'll start arm64 build test and share the findings.

edited: Spotted https://github.com/vmware/photon/commit/b066371b783c6905790aa5cc53d92c092335b7a6 and https://github.com/vmware/photon/commit/5c7cb9c984e0c57fe04f6bdd86dd46e8ad885326 . Many thanks for hanging in there on this !

dcasota commented 1 year ago

On the Azure x86_64 build system mentioned, I did a git merge of https://github.com/vmware/photon/commit/243bc11a35020dc4cc8de917ec968ede9e5258d0 and ran tdnf remove -y linux-secure. Here are some findings from the rerun.

The build on a Rpi4b is of course somewhat slow. An Azure arm64 dpls_v5 quota has been requested (Central India is on hold, no answer yet from EastUS, not available in some regions). Azure Ubuntu 20.04 arm64 gen2 + secure boot spotted. No tests on AWS or GCE so far.

dcasota commented 1 year ago

x86_64 iso

Wohoo! The first x86_64 Photon OS 5.0 beta ISO build completed! Started photon-5.0-243bc11a3.iso in ESXi 8.0:

ph5eula

Installation choice: 1) Photon Minimal 2) Photon Developer 3) Photon OSTree Host 4) Photon Real Time -> 'Photon Real Time' selected

~After the installation reboot, the 4.0 splash screen reappears, followed by a black screen (no login prompt). The ISO was built without https://github.com/vmware/photon/commit/3c7972e74f4ba971a19caed4bc15a8de80ef4b9e. Is this the reason?~ A rebuild including the commit (photon-5.0-3c7972e74.iso) solved the issue.

aarch64 iso

Rpi4b aarch64 iso build still running.

Failed packages: autogen-5.18.16, zlib-1.2.11, perl-5.36.0, gc-8.2.2, cmake-3.25.1

Azure vhd

Building the vhd fails. At first "vixDiskLib.h" was missing. I downloaded the VDDK and copied the files to /photon/tools/src/vixDiskUtil. Now it fails with the following message.

root@ph01 [ /photon ]# make -j1 image IMG_NAME=azure THREADS=1
Sanity check for all json files ...
Checking all python code is compilable ...

Creating staging folder and subitems...
make[1]: Entering directory '/photon/tools/src/vixDiskUtil'
g++ -o ../../bin/vixdiskutil -I/usr/include -L/usr/lib/vmware vixDiskUtil.cpp -ldl -lvixDiskLib -lvixMntapi -lvixDiskLibVim -lpthread -lssl -lfuse
/bin/ld: cannot find -lvixDiskLib
/bin/ld: cannot find -lvixMntapi
/bin/ld: cannot find -lvixDiskLibVim
/bin/ld: cannot find -lssl
/bin/ld: cannot find -lfuse
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:10: vix-disklib-util] Error 1
make[1]: Leaving directory '/photon/tools/src/vixDiskUtil'
ERROR: make -C /photon/tools/src/vixDiskUtil failed
Traceback (most recent call last):
  File "/photon/./build.py", line 1537, in main
    buildImage.build_image()
  File "/photon/./build.py", line 1117, in build_image
    build_vixdiskutil()
  File "/photon/./build.py", line 126, in build_vixdiskutil
    runShellCmd(f"make -C {photonDir}/tools/src/vixDiskUtil")
  File "/photon/support/package-builder/CommandUtils.py", line 41, in runShellCmd
    raise Exception(f"ERROR: {cmd} failed")
Exception: ERROR: make -C /photon/tools/src/vixDiskUtil failed
make: *** [Makefile:6: image] Error 1
root@ph01 [ /photon ]#

dcasota commented 1 year ago

close ticket.

Building a Photon OS image fails quite often, and this is normal. In fact, the 5.0 beta builds on x86_64 and arm64 (2nd stage with 126 packages) worked well. For the Photon OS team this is simply daily business, given the many possible version bumps and security patches.

dcasota commented 1 year ago

FYI: the slipstreamed ISO photon-minimal-4.0-1242e024e.iso (https://github.com/vmware/photon/commit/1242e024e13863cb3a43d978270625f758dda0b3) is double the size (~851MB) and fails with a photon installer 2.1 exception at /bin/photon-installer, line 33.