voxpupuli / puppet-systemd

Puppet module to manage systemd
https://forge.puppet.com/puppet/systemd
Apache License 2.0
54 stars 142 forks source link

Udev fact exceeds the number of facts limit #314

Closed bitcrush closed 1 year ago

bitcrush commented 1 year ago

The new udev fact introduced in v4.0.0 of the module seems to create a high number of nested facts, which on my nodes exceeds the default limit of 2048 by a lot. Is this actually intended behaviour?

Info: Loading facts
Warning: The current total number of facts: 11210 exceeds the number of facts limit: 2048
jhoblitt commented 1 year ago

Yes, this was anticipated. The fact warning threshold Is fairly arbitrary and my production envs were over the default value without this fact. Is a blurb in the README a needed?

bitcrush commented 1 year ago

Thank you for the clarification. I think a small note in the README would be reasonable as the fact alone seems to exceed the soft limit on most Linux systems.

jhoblitt commented 1 year ago

I'll add a blurb to the docs. Do you have a sense of how many "facts" this change added to your nodes and what the total was prior?

bitcrush commented 1 year ago

Thanks! The total prior adding the fact was 755. So the number added by it would be 10455.

jhoblitt commented 1 year ago

Thank you. If this fact ends up causing problems we can probably do something like add an external fact that can be used as a switch to turn it on and off.

ekohl commented 1 year ago

Should this fact rather be opt-in then?

jhoblitt commented 1 year ago

The pre-merge discussion was a long term goal of to try to move this fact into core facter. I don't consider the agent's fact limit warning a problem per se. I suspect most sites have > 2k facts and have adjusted and/or are ignoring the warning. I'd rather not preemptively disable the fact unless causes real world problems. My largest concern is the fact might be slow (> 1s) on some systems.

ekohl commented 1 year ago

My biggest concern is systems which store facts in their database, such as PuppetDB and Foreman may take a large hit as well.

jhoblitt commented 1 year ago

I increased the max number of facts in foreman to capture all the data. I'm also injecting the fact into the foreman discovery image.

antaflos commented 1 year ago

This fact (feature) should definitely be opt-in. We managed to kill our Foreman instance multiple times with errors related to pushing facts from Puppet servers to Foreman once we updated puppet-systemd to 4.0.0. The udev facts seem to get particularly numerous on bare metal hosts (Dell PowerEdge R6525 in this case) and resulted in scary looking error logs like these in Foreman's production.log:

2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty50::path could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty50::name could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty50::property::DEVNAME could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty50::property::DEVPATH could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty50::property::MAJOR could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty50::property::MINOR could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty50::property::SUBSYSTEM could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty51::path could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty51::name could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty51::property::DEVNAME could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty51::property::DEVPATH could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty51::property::MAJOR could not be imported because of Validation failed: Fact name has already been taken
2023-01-30T11:32:55 [E|app|efebb1b1] Fact udev::path::/devices/virtual/tty/tty51::property::MINOR could not be imported because of Validation failed: Fact name has already been taken

And the push facts script on the Puppet servers also complained:

During fact upload occured an exception: Net::ReadTimeout
Could not send facts to Foreman: Net::ReadTimeout

It also resulted in Foreman adding that particular Dell server multiple times to its list of hosts:

kvmhost11-duplicated-blurred

After downgrading puppet-systemd and manually cleaning up all of the facts and reports collected on the Puppet servers it now seems to run stable again.

So yeah, please make this udev fact feature optional.

jhoblitt commented 1 year ago

@antaflos Ugh. I think we will need to do a major version bump to back it out.

Out of curiosity, what version of foreman are you running? Could you share a udevdb dump from one of the R6525s to add to the test fixtures?

antaflos commented 1 year ago

@jhoblitt We are on Foreman 3.4.1 (on Ubuntu 20.04), so more or less up to date, I think. Of course I can provide a udev dump; do you have a command in mind I should run? Something like udevadm info --export-db?

jhoblitt commented 1 year ago

@antaflos udevadm info -e > foo.txt should be sufficient.

There is still discussion ongoing on slack as to if we are going to disable this fact or remove it completely. Hopefully, we will get a patch release out today or tomorrow.

antaflos commented 1 year ago

Ok @jhoblitt, here you go:

udevdump.txt

jhoblitt commented 1 year ago

The fixed will be included in the #317 release.