Closed Jonathan-Caruana closed 4 months ago
@jrafanie Can you please take a look here?
@Jonathan-Caruana - Do you have any logs from this appliance that you can share? Do you have any idea what these processes were doing? For example, could they be SmartState analysis or Ansible runs?
Interesting report here:
https://github.com/net-ssh/net-ssh/issues/557, fixed in net-ssh 5.0.0 (we're using 4.2.0 in the bug report): https://github.com/net-ssh/net-ssh/pull/580
If we know what the generic worker was doing before it started to create defunct processes, we could determine if it's this issue or something else that's occurring.
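As a quick way to check whether an installed net-ssh predates the release mentioned above, the comparison could be sketched like this. The 5.0.0 and 4.2.0 numbers come from this comment; the helper name `predates_fix?` is made up for illustration, and the sketch says nothing about whether that net-ssh issue is actually the cause here.

```ruby
require "rubygems"

# Release that the linked net-ssh issue is reported as fixed in (from the comment above).
FIXED_IN = Gem::Version.new("5.0.0")

# Hypothetical helper: true when the given version string is older than FIXED_IN.
def predates_fix?(version_string)
  Gem::Version.new(version_string) < FIXED_IN
end

# Inside a running app, the loaded version (if any) can be read from Gem.loaded_specs.
loaded = Gem.loaded_specs["net-ssh"]&.version
puts "loaded net-ssh: #{loaded || 'not loaded in this process'}"
puts "4.2.0 predates the fix: #{predates_fix?('4.2.0')}"
```

`Gem::Version` compares version segments numerically, which avoids the pitfalls of comparing version strings directly.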
@Jonathan-Caruana Does this happen all the time? Can you grep the logs for the process that's creating the defunct processes, both before and after they start appearing, to see what type of work it's processing?
To see how to review the logs: https://www.manageiq.org/docs/reference/latest/troubleshooting/
If you grep just the problematic process id in the logs, and narrow down on just that process, you might be able to determine what it's doing that's creating these defunct processes.
We suspect it's net-ssh during host/vm scanning or perhaps from running ansible playbooks but narrowing it down can help us fix it. Thanks!
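A minimal sketch of that per-process filtering, for anyone who prefers a script over plain grep. It assumes each log line carries the process id as `#<pid>:` inside the bracketed timestamp (for example `#1234:2ab4`); that format, the sample lines, and the helper name `lines_for_pid` are assumptions for illustration, so adjust the pattern if your logs look different.

```ruby
# Hypothetical helper: keep only the lines written by one process id.
# Assumes the PID appears as "#<pid>:" in each line; the leading "#" and
# trailing ":" prevent PID 1234 from matching "#12345:" or "#234:".
def lines_for_pid(lines, pid)
  pattern = Regexp.new("#" + Regexp.escape(pid.to_s) + ":")
  lines.select { |line| line.match?(pattern) }
end

# Made-up sample lines, only to show the shape of the input.
SAMPLE_LOG = [
  "[----] I, [2023-10-01T12:00:00.000001 #1234:2ab4]  INFO -- evm: example message A\n",
  "[----] I, [2023-10-01T12:00:01.000001 #5678:2ab4]  INFO -- evm: example message B\n"
].freeze

lines_for_pid(SAMPLE_LOG, 1234).each { |line| print line }
```

Against a real file, something like `lines_for_pid(File.foreach("/var/www/miq/vmdb/log/evm.log"), pid)` would stream the log without loading it all at once; the path is an assumption based on the app directory shown in the bundler settings.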
Hello @jrafanie @Fryguy
Sorry for the delay.
I think I have found out what happened. A KVM host declared in ManageIQ as a provider has been shut down for weeks. I guess that because it can't be reached, all those processes got stuck and piled up. I'm not sure, but I removed this host from the providers list and restarted evmserverd, and I no longer see this behaviour.
Maybe this is not the desired behaviour when ManageIQ can't reach a provider, but at least I was able to resolve the situation.
Regards,
This issue has been automatically marked as stale because it has not been updated for at least 3 months.
If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.
Closing. If you are able to recreate and provide steps, we're happy to reopen and determine a fix. Thanks!
Hello,
Our monitoring reports too many processes on our ManageIQ instance, and a simple "ps" shows a huge number of "sss_ssh_knownho" child processes. The parent process is "MIQ: MiqGenericWorker id: 1116284, queue: generic".
For now, there are 700 child processes like the ones displayed in the previous capture.
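To put a number on this kind of pile-up, the output of `ps -eo pid,ppid,stat,comm` can be grouped by parent. The sketch below is only an illustration: the helper name and the sample output are made up, and it assumes the Linux convention that a defunct (zombie) process has a STAT value starting with `Z`.

```ruby
# Hypothetical helper: count defunct (zombie) processes per parent PID,
# given the text output of `ps -eo pid,ppid,stat,comm`.
def defunct_children_by_parent(ps_output)
  counts = Hash.new(0)
  ps_output.each_line do |line|
    pid, ppid, stat, _comm = line.split(" ", 4)
    next unless pid&.match?(/\A\d+\z/) # skips the header and blank lines
    counts[ppid.to_i] += 1 if stat.to_s.start_with?("Z")
  end
  counts
end

# Made-up sample in the shape of `ps -eo pid,ppid,stat,comm` output.
SAMPLE_PS = <<~PS
    PID  PPID STAT COMMAND
   1200     1 Sl   ruby
   1301  1200 Z    sss_ssh_knownho <defunct>
   1302  1200 Z    sss_ssh_knownho <defunct>
   1400     1 Ss   sshd
PS

defunct_children_by_parent(SAMPLE_PS).each do |ppid, count|
  puts "parent #{ppid}: #{count} defunct children"
end
```

On a live system the input could come from `` `ps -eo pid,ppid,stat,comm` ``; the parent PID with the highest count is the process to follow in the logs.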
Version
gem env
bundle env
Ignoring ffi-1.15.0 because its extensions are not built. Try: gem pristine ffi --version 1.15.0
Ignoring psych-3.3.1 because its extensions are not built. Try: gem pristine psych --version 3.3.1
Environment
Bundler       2.4.18
  Platforms   ruby, x86_64-linux
Ruby          3.0.4p208 (2022-04-12 revision 3fa771ddedac25560be57f4055f1767e6c810f58) [x86_64-linux]
  Full Path   /usr/bin/ruby
  Config Dir  /etc
RubyGems      3.2.33
  Gem Home    /opt/manageiq/manageiq-gemset
  Gem Path    /opt/manageiq/manageiq-gemset:/usr/share/gems:/usr/local/share/gems
  User Home   /root
  User Path   /root/.local/share/gem/ruby
  Bin Dir     /opt/manageiq/manageiq-gemset/bin
Tools
  Git         2.39.3
  RVM         not installed
  rbenv       not installed
  chruby      not installed
Bundler Build Metadata
Built At          2023-08-02
Git SHA           d2e3d8e3f4
Released Version  true
Bundler settings
build.rugged
  Set for your local app (/var/www/miq/vmdb/.bundle/config): "--with-ssh"
gemfile
  Set via BUNDLE_GEMFILE: "/var/www/miq/vmdb/Gemfile"
jobs
  Set for your local app (/var/www/miq/vmdb/.bundle/config): 4
retry
  Set for your local app (/var/www/miq/vmdb/.bundle/config): 3
with
  Set for your local app (/var/www/miq/vmdb/.bundle/config): [:appliance, :qpid_proton, :systemd]
without
  Set for your local app (/var/www/miq/vmdb/.bundle/config): [:development, :test]
ruby -v
ruby 3.0.4p208 (2022-04-12 revision 3fa771dded) [x86_64-linux]
Regards,