containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

linuxserver.io Custom Shell Script leaves trailing spaces during apk add command #20778

Closed Zanathoz closed 10 months ago

Zanathoz commented 10 months ago

Issue Description

I'm trying to launch a custom script with the Nextcloud linuxserver.io container to install the dlib, pdlib and bzip2 APKs at launch to support the Face Recognition application, and Podman appears to be leaving a trailing space at the end of the shell lines which may be causing errors with it running.

As a workaround I am able to manually run these commands and get the components installed in the container, and restart the php-fpm service to get Face Recognition working properly.

Steps to reproduce the issue

Create a custom script to install prerequisite components during container launch:

echo "*** Executing Custom Script face_recognition.sh ***"
echo "*** Installing php8-pdlib (facerecognition dependency) ***"
apk add dlib
apk add php82-pdlib
echo "*** installing bzip (facerecognition dependency) ***"
apk add bzip2-dev
echo "*** facerecognition script complete ***"

Attach shared volume with launchscript.sh as -v /your_volume:/custom-cont-init.d:ro

Launch Container and receive errors

Describe the results you received

It appears Podman is appending a trailing space which is throwing off the apk commands (logs show in reverse timestamp from how I copied over in cockpit):

[ls.io-init] done.
[custom-init] face_recognition.sh: exited 127
*** facerecognition script complete ***
ERROR: unable to select packages:
**required by: world[bzip2-dev ]**
bzip2-dev (no such package):
ERROR: unable to select packages:
*** installing bzip  ***
**required by: world[php82-pdlib ]**
php82-pdlib (no such package):
ERROR: unable to select packages:
**required by: world[dlib ]**
dlib (no such package):
ERROR: unable to select packages:
fetch http://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
*** Installing php8-pdlib  ***
*** Executing Custom Script face_recognition.sh ***
[custom-init] face_recognition.sh: executing...
[custom-init] Files found, executing

Describe the results you expected

The APK components to be installed properly during container launch.

podman info output

host:
  arch: amd64
  buildahVersion: 1.32.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 94.92
    systemPercent: 2.17
    userPercent: 2.91
  cpus: 8
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: server
    version: "38"
  eventLogger: journald
  freeLocks: 1785
  hostname: DJ-ContainerStation
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.5.11-200.fc38.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1255976960
  memTotal: 12526350336
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.8.0-1.fc38.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.8.0
    package: netavark-1.8.0-2.fc38.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.8.0
  ociRuntime:
    name: crun
    package: crun-1.11-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.11
      commit: 11f8d3dc9fc4bb8a0adcff5ba8bd340f24612701
      rundir: /run/user/0/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20231004.gf851084-1.fc38.x86_64
    version: |
      pasta 0^20231004.gf851084-1.fc38.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-1.fc38.x86_64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 6181351424
  swapTotal: 8589930496
  uptime: 79h 15m 11.00s (Approximately 3.29 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
  - lscr.io
  - ghcr.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 28
    paused: 0
    running: 27
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 153677201408
  graphRootUsed: 89460838400
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 28
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.7.2
  Built: 1698762611
  BuiltTime: Tue Oct 31 10:30:11 2023
  GitCommit: ""
  GoVersion: go1.20.10
  Os: linux
  OsArch: linux/amd64
  Version: 4.7.2


Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

Yes

Additional environment details

VMware ESXi 7.0U3 with Fedora 38 VM on latest patches and updated Podman

Additional information

None.

giuseppe commented 10 months ago

podman doesn't change the file content when used through a bind mount:

$ sha256sum /tmp/script.sh; podman run --rm -v /tmp/script.sh:/tmp/script.sh fedora sha256sum /tmp/script.sh
efa87b25c29f18e2a44990444d7140e9fe6b81146b841157396284ed61f81e61  /tmp/script.sh
efa87b25c29f18e2a44990444d7140e9fe6b81146b841157396284ed61f81e61  /tmp/script.sh

can you check if the original file already contains the extra trailing spaces?
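One way to run that check is sketched below. It looks for lines that end in a space, a tab, or a carriage return. The carriage-return case is an assumption that is not confirmed anywhere in this thread: a script saved with Windows (CRLF) line endings can look like it has trailing spaces in some log viewers. The function names are hypothetical helpers, not part of Podman or apk.

```shell
#!/bin/sh
# Sketch only: helpers for inspecting line endings in a script file.

# Succeeds (exit 0) if any line ends in a space, tab, or carriage return.
has_trailing_ws() {
    tab=$(printf '\t')
    cr=$(printf '\r')
    # The bracket expression matches one space, tab, or CR just before end of line.
    grep -q "[ ${tab}${cr}]\$" "$1"
}

# Print the file unambiguously: sed's "l" command shows a carriage
# return as \r and marks the end of every line with $.
show_line_ends() {
    sed -n l "$1"
}

# Example use (file name taken from the report):
#   has_trailing_ws face_recognition.sh && show_line_ends face_recognition.sh
```

If the second helper shows `\r$` at the end of lines, converting the file to Unix line endings before mounting it would be the next thing to try.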

edsantiago commented 10 months ago

This is not a podman issue. apk seems to handle trailing whitespace just fine:

$ bin/podman run -it --rm docker.io/linuxserver/nextcloud:latest sh
[blah blah]
# apk update
[blah]
# apk add "dlib    "
ERROR: unable to select packages:
  dlib (no such package):
    required by: world[dlib]

I suspect the trailing whitespace is an artifact of your logging setup (which the reverse-order thing already confirms is weird). It is a red herring. I don't know how apk works or how you add repos, but, you need to add a repo containing the dlib package.
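A minimal sketch of pointing apk at an extra repository for a single install follows. The repository URL is an assumption (Alpine's edge/testing, not verified here to contain these packages), and `apk_add_from` and `DRY_RUN` are hypothetical names introduced for illustration. `--repository` and `--no-cache` are standard apk options. Mixing edge packages into a v3.18 image may produce dependency conflicts, so treat this as a starting point rather than a fix.

```shell
#!/bin/sh
# Assumed repository URL; replace with wherever the packages are actually published.
TESTING_REPO="https://dl-cdn.alpinelinux.org/alpine/edge/testing"

# Install packages using one additional repository.
# With DRY_RUN=1 the command is printed instead of executed.
apk_add_from() {
    repo=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "apk add --no-cache --repository=$repo $*"
    else
        apk add --no-cache --repository="$repo" "$@"
    fi
}

# Example use inside the container:
#   apk_add_from "$TESTING_REPO" dlib php82-pdlib
```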

Zanathoz commented 10 months ago

@giuseppe - I checked the original file when I noticed the trailing spaces in the Cockpit logs (cockpit-podman is the standard Cockpit plugin on Fedora), and they are not present. As edsantiago showed, apk doesn't seem to care about them anyway. The shell file I'm using is already mounted with the -v option to the location linuxserver.io calls out for customization, and even the apk update command is failing for me.

@edsantiago - The logs were copied in reverse order because that is how they populate in the web GUI as they come in; I only mentioned it because I hadn't reordered them. Additionally, the dlib, pdlib and bzip2-dev packages are already present in their public testing repo, which does not need to be added to the container after it's been launched. I did also try appending this repo to the repo file at launch, which resulted in the same issue reported above.

I can try spinning up a brand new Fedora server and try to replicate just to rule out if it's an issue with the VM I'm using. I'll also share your updates to the linuxserver.io team, but they state no support for podman in their discord already so I'm doubtful I'll get anywhere with this one.

edsantiago commented 10 months ago

@Zanathoz thanks for the followup. How do you envision the Podman team helping you? We don't have a reproducer, we don't even have any clues as to what you're doing. The only hint you mention is "Nextcloud linuxserver.io", which I (perhaps wrongly) interpreted as docker.io/linuxserver/nextcloud:latest, which as I show above has no dlib package. You mention "packages are already presented in their public testing repo" but please keep in mind that we have no idea what you're talking about - neither I nor (probably) anyone on the Podman team has any familiarity with this particular container image or its environment. The only thing we can offer you is reassurance that Podman does not insert arbitrary whitespace. I encourage you to pursue the no such package errors.

Zanathoz commented 10 months ago

Thanks @edsantiago - I suppose I wanted to make sure Podman wasn't causing my issue since the linuxserver.io team states they don't officially support Podman. I have many of their containers successfully running already in Podman, but this is the first time I've had to try and utilize their container customization (https://www.linuxserver.io/blog/2019-09-14-customizing-our-containers). I am leaning towards a potential Podman issue however, as I can't even run a simple "apk update" command without it erroring out during container startup. I don't think that's included in my example above but I can easily re-create if needed.

I have also addressed permissions on the script, running it as 777 just to see if that made any difference.

My reason for needing this is that their nextcloud container (docker.io/linuxserver/nextcloud:latest as you mentioned) is built on alpine, and their alpine build does not include dlib, pdlib, or bzip2-dev requirements needed for the Nextcloud Face Recognition application (https://github.com/matiasdelellis/facerecognition/wiki/Installation). Docker installation notes can be found here (https://github.com/matiasdelellis/facerecognition/wiki/Docker).

The dlib, pdlib and bzip2-dev APK files are kept on Alpine's public "edge/testing" repository, which should not need to be added to the repository listing. That can be shown to be true, as I can install them directly in the container with the commands my script is running, or with the commands used in the Docker installation methods mentioned in the Face Recognition GitHub wiki linked above.

Apologies for not being thorough in my initial description, I had burned about two full days in troubleshooting this issue myself before getting it working with my manual fix of installing those packages at each container startup. I'd much rather automate it with the methods mentioned above if possible though!

I do have a discord issue raised with the linuxserver.io team too. I'm hoping between that and this github issue we can figure out what the problem is with getting this script to run in my Fedora 38 podman environment.

Zanathoz commented 9 months ago

Just to update this issue and hopefully help anyone else in this situation, I am using this script to get the job done. I need to change it to run at system startup via crontab or the nextcloud.service I have set up, but in testing this fully automates the installation of the components into the container from the host server.

Nextcloud is dependent on its MariaDB being up or the server won't start properly, hence we restart the DB container prior to starting Nextcloud. I also have my Nextcloud container set to sleep for 5 minutes before starting the container to ensure the DB container has initialized. The DB container initializes fairly quickly, but I have many containers on this server and after rebooting from system updates I wanted to ensure the DB container had enough time to initialize. The 'depends-on' flag hasn't worked reliably for me in the past but the sleep timer has.
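As an alternative to a fixed sleep, a polling loop can wait only as long as needed. The sketch below is a generic helper, not something from the thread: `wait_for` is a hypothetical name, and the probe command in the usage comment is a placeholder that only confirms the container accepts `podman exec`, not that the database is ready. The container name `nextcloud_db` is assumed from the service name in the script.

```shell
#!/bin/sh
# wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-run COMMAND every INTERVAL seconds until it succeeds.
# Gives up with exit status 1 once TIMEOUT seconds of waiting have elapsed.
wait_for() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example use (placeholder probe; swap in a real readiness check):
#   wait_for 300 5 podman exec nextcloud_db true || echo "DB container not ready"
```

The fixed sleep remains the simpler option; the loop mainly helps when startup time varies a lot between reboots.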

I also have a crontab running every 15 minutes to kick off Face Recognition, so I have my crontab flag set to run for 14 minutes (840 seconds in the final line).

I should also mention in case anyone notices that I have the APK files downloaded and shared via volume mount to the container. This is just a dirty job for now. These should be updated to the appropriate public links provided in the Face Recognition github.

#Disable Nextcloud Container
systemctl disable container-nextcloud.service
echo Nextcloud Container Disabled
#Restart Nextcloud_DB Container
systemctl restart container-nextcloud_db.service
echo Nextcloud_DB Container Restarted
#Enable Nextcloud Container Service
systemctl enable container-nextcloud.service
echo Nextcloud Container Enabled
#Start Nextcloud Container
systemctl start container-nextcloud.service
echo Nextcloud Container Started
#Wait (sleep) 10 minutes
echo Waiting 10 minutes
sleep 600
#Install PDLib inside nextcloud container
echo Installing DLib
podman exec -it nextcloud apk add /apk/dlib-19.24.2-r0.apk
echo Installing PDLib
podman exec -it nextcloud apk add /apk/php82-pdlib-1.1.0-r0.apk
echo Installing bzip2-dev
podman exec -it nextcloud apk add bzip2-dev
echo Installing blas
podman exec -it nextcloud apk add blas
#Restart PHP-FPM within Nextcloud Container
echo Restarting PHP in Nextcloud Container
pkill -o -USR2 php-fpm
echo PHP Restarted in Nextcloud Container
#Start Face Recognition
echo Start Background Job
podman exec -it nextcloud occ face:background_job -t 840
echo Script Completed Successfully!!
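Note that the script above prints its final success message even if an earlier command failed. A minimal sketch of one way to stop at the first failure is shown here. `step` and `install_face_deps` are hypothetical helper names. The container name and package paths are copied from the script. The `-it` flags are dropped on the assumption that no interactive terminal is needed when the script is run unattended.

```shell
#!/bin/sh
# Run one labelled command; report and return non-zero if it fails.
step() {
    label=$1
    shift
    echo "==> $label"
    if ! "$@"; then
        echo "FAILED: $label" >&2
        return 1
    fi
}

# Chain the installs with && so the first failure stops the sequence.
install_face_deps() {
    step "Installing DLib" podman exec nextcloud apk add /apk/dlib-19.24.2-r0.apk &&
    step "Installing PDLib" podman exec nextcloud apk add /apk/php82-pdlib-1.1.0-r0.apk &&
    step "Installing bzip2-dev" podman exec nextcloud apk add bzip2-dev &&
    step "Installing blas" podman exec nextcloud apk add blas
}

# Example use:
#   install_face_deps && echo "Script Completed Successfully!!"
```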