nubificus / bima

bima: OCI image builder for non-container software
Apache License 2.0

Extract target CPU architecture from unikernel binary #7

Closed gntouts closed 9 months ago

gntouts commented 1 year ago

Currently, we rely on a non-standard ARCH instruction provided by the user to determine the CPU architecture for the container image.

This could be extracted by reading the ELF header of the unikernel binary, using GetBinaryArchitecture().

To do that, we should iterate over every COPY instruction and find the one whose destination matches the file defined by the com.urunc.unikernel.binary LABEL.
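A minimal sketch of that lookup, assuming a simplified line-based parse of the Containerfile. `findBinarySource` is a hypothetical helper, not something bima provides. It ignores line continuations, COPY flags, the JSON form of COPY, and directory destinations, so it only illustrates the matching step:

```go
package main

import (
	"bufio"
	"path"
	"strings"
)

// binaryLabel is the label key named in this issue.
const binaryLabel = "com.urunc.unikernel.binary"

// findBinarySource (hypothetical helper) scans Containerfile-style text and
// returns the source path of the COPY instruction whose destination equals
// the value of the binaryLabel LABEL. Parsing is deliberately simplified:
// only "COPY <src> <dst>" and "LABEL key=value ..." on a single line.
func findBinarySource(containerfile string) (string, bool) {
	var target string
	var copies [][2]string // (source, destination) pairs, in file order

	sc := bufio.NewScanner(strings.NewReader(containerfile))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		switch strings.ToUpper(fields[0]) {
		case "LABEL":
			for _, kv := range fields[1:] {
				k, v, ok := strings.Cut(kv, "=")
				if ok && strings.Trim(k, `"`) == binaryLabel {
					target = strings.Trim(v, `"`)
				}
			}
		case "COPY":
			if len(fields) == 3 {
				copies = append(copies, [2]string{fields[1], fields[2]})
			}
		}
	}

	if target == "" {
		return "", false // label missing: nothing to match against
	}
	for _, c := range copies {
		if path.Clean(c[1]) == path.Clean(target) {
			return c[0], true
		}
	}
	return "", false
}
```

The returned source path is what would then be handed to GetBinaryArchitecture(). A real implementation would more likely walk the instructions already parsed by the builder than re-parse the text.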

ananos commented 1 year ago

Good idea, and implementation.

We will have to handle a few corner cases, such as a Unikraft Nginx image (not sure why it shows as 32bit):

$ file app-nginx_kvm-x86_64
app-nginx_kvm-x86_64: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped
$ file app-redis_kvm-x86_64
app-redis_kvm-x86_64: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped

but:

# file /store/tmp_uni_binaries/app-redis_kvm-x86_64 
/store/tmp_uni_binaries/app-redis_kvm-x86_64: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

This must have something to do with the target chosen in the Unikraft config. In any case, let's assume that all x86 architectures are 64-bit (amd64) and all ARM architectures are aarch64 (arm64). I assume it's as simple as:

    case elf.EM_386:
        return "amd64"
    case elf.EM_X86_64:
        return "amd64"
    case elf.EM_ARM:
        return "arm64"
    case elf.EM_AARCH64:
        return "arm64"

ananos commented 9 months ago

It seems we cannot detect Unikraft aarch64 binaries correctly (see https://github.com/nubificus/bima/pull/32). The format used in some cases appears to be PE/COFF rather than ELF, so we need to account for this in the code.

cmainas commented 9 months ago

It seems bima expects the binary to be in ELF format, which is not always the case (see Unikraft on aarch64, as pointed out by @ananos). As a result, we should first determine the format of the executable file and then extract the CPU arch according to that format. To get the format of the binary we can use the filetype package. Then, we need to find a way to read Portable Executable (PE) format files to extract the CPU arch of the binary. I will have an implementation soon.