openbmc / bmcweb

A do everything Redfish, KVM, GUI, and DBus webserver for OpenBMC
Apache License 2.0
154 stars 131 forks source link

bmcweb hangs, strace says "too many open files" #195

Closed Kitsok closed 3 years ago

Kitsok commented 3 years ago

Describe the bug After some time in idle, bmcweb starts consuming CPU and stops responding to HTTPS requests. lsof says bmcweb opened /etc/ssl/certs/https/server.pem 1024 times strace says it tries to statx() /etc/ssl/certs/authority in a cycle, each time getting error "Too many open files"

Update: I've re-started bmcweb and started monitoring how it opens /etc/ssl/certs/https/server.pem. In second window I've opened tcpdump to monitor incoming requests. It opens /etc/ssl/certs/https/server.pem without requests to HTTPS each 10-30 seconds.

Environment AST2500, OpenBMC version is 2.10, latest commit is 17f8f05869000eedf2ea53274dc64b9313391595 bmcweb latest commit is 4df1bee0bdc9d71043b51872875d3d22b26ab68f

To Reproduce Steps to reproduce the behavior:

  1. Restart BMC
  2. Wait several hours

Quick fix Restart bmcweb by killing the process.

Kitsok commented 3 years ago

Dig shown several issues. Call stack is: include/hostname_monitor.hpp onPropertyUpdate() -> ensuressl::loadCert() -> BIO_new_file(filePath.c_str(), "rb"); certFileBio is then never closed.

Meanwhile onPropertyUpdate is called each 10-30 seconds and passed 3 changed properties: HostName, DefaultGateway, DefaultGateway6, so the suspect is phosphor-network-manager.

Quick and dirty fix

index 1b04af5..0adbc0c 100644
--- a/include/hostname_monitor.hpp
+++ b/include/hostname_monitor.hpp
@@ -31,6 +31,8 @@ inline void installCertificate(const std::filesystem::path& certPath)
         "xyz.openbmc_project.Certs.Replace", "Replace", certPath.string());
 }

+static std::string _hostname = "";
+
 inline int onPropertyUpdate(sd_bus_message* m, void* /* userdata */,
                             sd_bus_error* retError)
 {
@@ -59,6 +61,14 @@ inline int onPropertyUpdate(sd_bus_message* m, void* /* userdata */,
         return 0;
     }

+    if (_hostname == *hostname)
+    {
+        BMCWEB_LOG_DEBUG << " hostname didn't change, returning\n";
+        return 0;
+    }
+
+    _hostname = *hostname;
+
     BMCWEB_LOG_DEBUG << "Read hostname from signal: " << *hostname;
     const std::string certFile = "/etc/ssl/certs/https/server.pem";
Kitsok commented 3 years ago

More correct fix I suppose it also fixes memory leak

index 24e40a4..42a284a 100644
--- a/include/ssl_key_handler.hpp
+++ b/include/ssl_key_handler.hpp
@@ -199,6 +199,7 @@ inline X509* loadCert(const std::string& filePath)
         X509_free(cert);
         return nullptr;
     }
+    BIO_free(certFileBio);
     return cert;
 }
edtanous commented 3 years ago

@Alan-Kuo-TW Can you please take a look at the above and resolve? This looks like an issue with your commit a82207087d79c4dd85447bdacfd4de91be4e7166

edtanous commented 3 years ago

More correct fix I suppose it also fixes memory leak

index 24e40a4..42a284a 100644
--- a/include/ssl_key_handler.hpp
+++ b/include/ssl_key_handler.hpp
@@ -199,6 +199,7 @@ inline X509* loadCert(const std::string& filePath)
         X509_free(cert);
         return nullptr;
     }
+    BIO_free(certFileBio);
     return cert;
 }

This looks reasonable, if you could get it submitted to gerrit, I'm happy to approve and merge it.

Alan-Kuo-TW commented 3 years ago

More correct fix I suppose it also fixes memory leak

index 24e40a4..42a284a 100644
--- a/include/ssl_key_handler.hpp
+++ b/include/ssl_key_handler.hpp
@@ -199,6 +199,7 @@ inline X509* loadCert(const std::string& filePath)
         X509_free(cert);
         return nullptr;
     }
+    BIO_free(certFileBio);
     return cert;
 }

This looks reasonable, if you could get it submitted to gerrit, I'm happy to approve and merge it.

Thanks for your finding, I will push the new code to resolve it. https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/42330

edtanous commented 3 years ago

The above change has been merged. Please reopen if you are seeing similar issues.