flatcar / Flatcar

Flatcar project repository for issue tracking, project documentation, etc.
https://www.flatcar.org/
Apache License 2.0

New Package Request: snmpd #1105

Open jhaprins opened 1 year ago

jhaprins commented 1 year ago

Package name and purpose: We want to use Flatcar to build the underlay of a new K3s cluster. To monitor the underlay we need a way to get metrics from it. My first instinct would be to monitor with SNMP, but snmpd is not installed or available.

Impact of adding this package to the Flatcar OS image: snmpd is a service that is well understood by the Linux administrator community. Though old, the software is still well maintained and included in all relevant operating systems.

The package improves on the following core values:

The package will increase the image size by: 1 MBytes.

How might this package increase the attack surface:

Benefits of adding this package: This package provides a standard way to monitor the telemetry of the physical machine.

Additional information [ Please add any information that does not fit into any of the above sections here ]

jepio commented 1 year ago

Have you tried running snmpd in a container?

jhaprins commented 1 year ago

snmpd in a container does not have access to the /proc and /sys filesystems of the physical system. There have been some hacks in the past where people patched net-snmp to mount the host /proc filesystem in the container and then modified the net-snmp code to access the mounted /proc filesystem, but this is more a hack than a real solution (https://github.com/digiwhite1980/snmpd).

jepio commented 1 year ago

Ok, then this might be a good candidate for an optional sysext.

cc @krishjainx @pothos

krishjainx commented 1 year ago

Hi @jhaprins. I've been working on a tool to generate systemd-sysext images (https://github.com/flatcar/scripts/commit/eb2f3d543dd6cee7d2520306d152fbf674cb2efc). You should be able to install snmpd with the following workflow:

In the SDK container

./build_packages
./build_image
git clone --depth 5 https://www.github.com/gentoo/gentoo
cp -R ./gentoo/net-analyzer/net-snmp  ../third_party/portage-stable/net-analyzer/net-snmp
emerge-amd64-usr  net-analyzer/net-snmp # replace with arm64-usr if compiling for that architecture
sudo ./build_sysext  --board=amd64-usr --image_builddir=images snmpd net-analyzer/net-snmp

In your build directory you should see snmpd.raw. You could upload this to a remote server and then do something like this for your butane config:

variant: flatcar
version: 1.0.0
storage:
  files:
    - path: /etc/extensions/snmpd.raw
      mode: 0644
      contents:
        source: https://myserver.com/snmpd.raw

Then transpile it to an Ignition configuration (https://www.flatcar.org/docs/latest/provisioning/config-transpiler/).
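
As a rough sketch of that transpile step: the Flatcar documentation linked above describes running Butane from its container image. The helper name `transpile_butane` and the file names in the usage line are placeholders that do not appear in this thread, and the sketch assumes Docker is available on the machine where you prepare the config.

```shell
#!/usr/bin/env bash
# Sketch: transpile a Butane YAML file into an Ignition JSON file.
# Assumes Docker is installed; transpile_butane is a hypothetical helper name.
transpile_butane() {
  local butane_yaml="$1" ignition_json="$2"
  # Butane reads the YAML config on stdin and writes Ignition JSON to stdout.
  docker run --rm -i quay.io/coreos/butane:latest < "$butane_yaml" > "$ignition_json"
}
```

It could be called as `transpile_butane snmpd.yaml snmpd.ign`, after which the resulting JSON file is what you pass to the machine as its Ignition config.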

pothos commented 1 year ago

snmpd in a container does not have access to the /proc and /sys filesystem of the physical system

Would something like --privileged --pid=host --security-opt="seccomp=unconfined" -v /proc:/proc -v /sys:/sys work? (I think the -v options may not be needed.)

For the systemd-sysext image built in the above way you would have to rebuild it for every Flatcar release, so you have to disable auto-updates to be able to update both at the same time.
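
To illustrate why the rebuild is needed: systemd-sysext only merges an image whose extension-release file matches the host's os-release. The sketch below compares `ID` and `VERSION_ID` between the two files. It assumes the image pins `VERSION_ID` (which the rebuild requirement suggests) and ignores the `SYSEXT_LEVEL` and `ID=_any` alternatives that systemd also supports. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: check whether a sysext's extension-release file matches the host.
# release_field and sysext_matches_host are hypothetical helper names.

# Print the value of KEY from an os-release style file (KEY=value, optional quotes).
release_field() {
  local key="$1" file="$2"
  sed -n "s/^${key}=//p" "$file" | head -n 1 | tr -d "\"'"
}

# Return 0 if ID and VERSION_ID are identical in both files, 1 otherwise.
sysext_matches_host() {
  local host_release="$1" ext_release="$2"
  local key
  for key in ID VERSION_ID; do
    if [ "$(release_field "$key" "$host_release")" != "$(release_field "$key" "$ext_release")" ]; then
      echo "mismatch on ${key}" >&2
      return 1
    fi
  done
  return 0
}
```

With the image mounted somewhere (the mount point `/mnt/snmpd` is only an example), a call would look like `sysext_matches_host /etc/os-release /mnt/snmpd/usr/lib/extension-release.d/extension-release.snmpd`.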

We don't yet have a good way to build independent systemd-sysext images with a prefix like /opt or /usr/local, but some examples with static binaries are in https://github.com/flatcar/sysext-bakery

jhaprins commented 1 year ago

krishjainx,

When trying to build, I get a few errors on the first command. Is this normal?

sdk@flatcar-sdk-all-3510_0_0_os-stable-3510_2_4-nightly-20230706-21 ~/trunk/src/scripts $ ./build_packages
INFO update_chroot: Setting up portage...
find: ‘/mnt/host/source/config/portage/repos’: No such file or directory
INFO update_chroot: Setting up crossdev...
INFO update_chroot: Updating chroot:
INFO update_chroot: chroot version: 3510.0.0
INFO update_chroot: Flatcar version: 3510.2.4+nightly-20230706-2100
INFO update_chroot: Updating basic system packages

Performing Global Updates
(Could take a couple of minutes if you have a lot of binary packages.)

!!! Error fetching binhost package info from 'https://mirror.release.flatcar-linux.net/sdk/amd64/3510.0.0/toolchain'
!!! HTTP Error 404: Not Found

!!! Error fetching binhost package info from 'https://mirror.release.flatcar-linux.net/sdk/amd64/3510.0.0/pkgs'
!!! HTTP Error 404: Not Found

Jobs: 0 of 0 complete Load avg: 9.9, 12.2, 12.8

 * Switching native-compiler to x86_64-pc-linux-gnu-11 ... [ ok ]
INFO update_chroot: Updating cross x86_64-cros-linux-gnu toolchain

!!! Error fetching binhost package info from 'https://mirror.release.flatcar-linux.net/sdk/amd64/3510.0.0/toolchain'
!!! HTTP Error 404: Not Found

!!! Error fetching binhost package info from 'https://mirror.release.flatcar-linux.net/sdk/amd64/3510.0.0/pkgs'
!!! HTTP Error 404: Not Found

Doing a full bootstrap via crossdev

krishjainx commented 1 year ago

Yup, that is normal, @jhaprins. You could switch to the latest alpha, and then ./build_packages would just pull in the binary packages instead of building them locally.

jhaprins commented 1 year ago

I checked out the latest stable because I wanted to make sure that what I build is compatible with the version running on the hosts. But I'm also wondering what will happen. Do I need to rebuild the systemd-sysext image every time an update is released? Anyway, it is building now and I will see when it is finished. I have all the time.

krishjainx commented 1 year ago

Currently you do have to rebuild the systemd-sysext image every time an update is released (we're looking to change this on the update side). However, a static binary should not need to be rebuilt. I think net-snmp can be built statically. If it isn't already, you could modify the ebuild here: ../third_party/portage-stable/net-analyzer/net-snmp

jhaprins commented 1 year ago

@krishjainx,

It took a few days because of some unexpected hoops I had to jump through, but I'm now halfway through your recipe and failing at the build_image script.

sdk@flatcar-sdk-all-3654_0_0_os-alpha-3654_0_0 ~/trunk/src/scripts $ ./build_image
INFO build_image: Checking build root
INFO build_image: Checking /build/amd64-usr
This system is affected by the following GLSAs:
INFO build_image: Building production image flatcar_production_image.bin
INFO build_image: Using image type base
    start     size  part  contents
        0        1        Hybrid MBR
        1        1        Pri GPT header
        2       32        Pri GPT table
     4096   262144     1  Label: "EFI-SYSTEM"
                          Type: EFI System Partition
                          UUID: 8BDDB111-FEBD-4122-96E0-853D7755003D
                          Attr: Legacy BIOS Bootable
   266240     4096     2  Label: "BIOS-BOOT"
                          Type: BIOS Boot Partition
                          UUID: DD93222E-87F2-4262-9200-1F90610C78B5
   270336  2097152     3  Label: "USR-A"
                          Type: Alias for coreos-rootfs
                          UUID: 7130C94A-213A-4E5A-8E26-6CCE9662F132
                          Attr: priority=1 tries=0 successful=1
  2367488  2097152     4  Label: "USR-B"
                          Type: Alias for coreos-rootfs
                          UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A57C
                          Attr: priority=0 tries=0 successful=0
  4464640   262144     6  Label: "OEM"
                          Type: Alias for linux-data
                          UUID: 4DEA145C-B04A-46A1-BA26-C5E3969FAA8B
  4726784   131072     7  Label: "OEM-CONFIG"
                          Type: CoreOS reserved
                          UUID: FC310987-6FFE-4C5A-A100-FBAAA1C2066F
  4857856  4427776     9  Label: "ROOT"
                          Type: CoreOS auto-resize
                          UUID: 0D767DE9-7E92-428B-BB28-0F6443CEC95E
  9289695       32        Sec GPT table
  9289727        1        Sec GPT header
Formatting partition 1 (EFI-SYSTEM) as vfat
Formatting partition 3 (USR-A) as ext2
Formatting partition 6 (OEM) as btrfs
btrfs-progs v6.0.2
See http://btrfs.wiki.kernel.org for more information.

Performing full device TRIM /dev/loop0 (128.00MiB) ...
NOTE: several default settings have changed in version 5.15, please make sure this does not affect your deployments:

Label:              OEM
UUID:               6c6a8424-09b6-4e92-99ee-adb187a76b85
Node size:          4096
Sector size:        4096
Filesystem size:    128.00MiB
Block group profiles:
  Data+Metadata:    single            8.00MiB
  System:           single            4.00MiB
SSD detected:       yes
Zoned device:       no
Incompat features:  mixed-bg, extref, skinny-metadata, no-holes
Runtime features:   free-space-tree
Checksum:           crc32c
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1   128.00MiB  /dev/loop0

WARNING: failed to open /dev/btrfs-control, skipping device registration: No such file or directory
mount: /tmp/tmpyz8b0k8g: unknown filesystem type 'btrfs'.
       dmesg(1) may have more information after failed mount system call.
Traceback (most recent call last):
  File "/mnt/host/source/src/scripts/build_library/disk_util", line 1127, in <module>
    main(sys.argv)
  File "/mnt/host/source/src/scripts/build_library/disk_util", line 1123, in main
    options.func(options)
  File "/mnt/host/source/src/scripts/build_library/disk_util", line 505, in Format
    FormatPartition(options, part)
  File "/mnt/host/source/src/scripts/build_library/disk_util", line 480, in FormatPartition
    FormatBtrfs(part, loop_dev)
  File "/mnt/host/source/src/scripts/build_library/disk_util", line 374, in FormatBtrfs
    Sudo(['mount', '-t', 'btrfs', device, btrfs_mount])
  File "/mnt/host/source/src/scripts/build_library/disk_util", line 326, in Sudo
    subprocess.check_call(['sudo'] + [str(c) for c in cmd], stdout=null)
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', 'mount', '-t', 'btrfs', '/dev/loop0', '/tmp/tmpyz8b0k8g']' returned non-zero exit status 32.
ERROR build_image: script called: build_image
ERROR build_image: Backtrace: (most recent call is last)
ERROR build_image: file build_image, line 178, called: create_prod_image 'flatcar_production_image.bin' 'base' 'developer' 'coreos-base/coreos'
ERROR build_image: file prod_image_util.sh, line 83, called: start_image 'flatcar_production_image.bin' 'base' '/mnt/host/source/src/build/images/amd64-usr/developer-3654.0.0+2023-07-11-1614-a1/rootfs' 'developer'
ERROR build_image: file build_image_util.sh, line 587, called: die_err_trap '"${BUILD_LIBRARY_DIR}/disk_util" --disk_layout="${disk_layout}" format "${disk_img}"' '1'
ERROR build_image:
ERROR build_image: Command failed:
ERROR build_image: Command '"${BUILD_LIBRARY_DIR}/disk_util" --disk_layout="${disk_layout}" format "${disk_img}"' exited with nonzero code: 1
ERROR build_image: (Note bash sometimes misreports "command not found" as exit code 1 instead of 127)
sdk@flatcar-sdk-all-3654_0_0_os-alpha-3654_0_0 ~/trunk/src/scripts $

Is there a quick fix for this problem?

I'm trying to build the latest alpha now because it has the script that I need to create the sysext.

Jan Hugo

jhaprins commented 1 year ago

Searching a bit on the Interwebs tells me that BTRFS support has been dropped from some Linux distributions (RHEL from 7.4 onwards). I installed AlmaLinux 8.7 on the build system I quickly set up, so this might be the issue.
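
One way to check this on the build host before running build_image, as a minimal sketch: the kernel lists the filesystem types it currently supports in /proc/filesystems. The helper name `kernel_knows_fs` is made up for this example, and the optional second argument exists only so the check can be pointed at a different file.

```shell
#!/usr/bin/env bash
# Sketch: check whether the running kernel lists a filesystem type.
# kernel_knows_fs is a hypothetical helper name.
kernel_knows_fs() {
  local fstype="$1" listing="${2:-/proc/filesystems}"
  # Each line is "[nodev]<TAB>name", so the name is always the last field.
  awk -v fs="$fstype" '$NF == fs { found = 1 } END { exit !found }' "$listing"
}
```

For example, `kernel_knows_fs btrfs || echo "no btrfs support loaded"`. Note that a filesystem built as a module only appears in the list once the module is loaded, so a failed check can still be worth following up with `sudo modprobe btrfs`.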

krishjainx commented 1 year ago

Hmm, would you like to ask me on Matrix (you can use Element)? I am @krishjain there. It would be easier to debug, and I don't want to create noise here.

krishjainx commented 1 year ago

Using a distribution with BTRFS support like Oracle Linux with UEK seems to have solved this. @jhaprins could you please update us on your progress?

jhaprins commented 1 year ago

In the latest Alpha versions of Flatcar I have been able to build the snmpd.raw sysext image. And indeed, the main issue I was having is that I needed a kernel with BTRFS support.

jhaprins commented 1 year ago

Getting the whole sysext toolchain into stable would be really great. To actually start snmpd.service in systemd, I have added the following to my Ignition file for the nodes:

systemd:
  units:
    - name: snmpd.service
      dropins:
        - name: 10-startup.conf
          contents: |
            [Service]
            Type=simple
            Environment=OPTIONS="-LS0-6d"
            EnvironmentFile=-/etc/sysconfig/snmpd
            ExecStart=
            ExecStart=/usr/sbin/snmpd $OPTIONS -f
            ExecReload=/bin/kill -HUP $MAINPID
    - name: multi-user.target
      dropins:
        - name: 10-snmpd.conf
          contents: |
            [Unit]
            Upholds=snmpd.service
storage:
  files:
    - path: /etc/sysconfig/snmpd
      contents:
        inline: |
          OPTIONS="-LS4d -Lf /dev/null -p /var/run/snmpd.pid -a"
    - path: /etc/extensions/snmpd.raw
      mode: 0644
      contents:
        source: http://ks/flatcar/3665/snmpd.raw
    - path: /etc/snmp/snmpd.conf
      mode: 0640
      contents:
        source: http://ks/flatcar/3665/snmpd.conf