ouster-lidar / ouster-sdk

Ouster, Inc. sample code

Why does `get_metadata()` fail when running in a debugger, but succeed otherwise? #628

Open themightyoarfish opened 5 hours ago

themightyoarfish commented 5 hours ago

Describe your question

This is a very interesting problem that I have never run into before, and I don't even know how to begin debugging it.

Our client code calls get_metadata() several times, since it often returns an empty string for the first n attempts.

    // For unknown reasons, the first attempt at `get_metadata()` (an HTTP API
    // request) often aborts in the SDK and we get an empty string. Trying again
    // right away often helps, so we do just that.
    std::string meta{};
    const size_t num_retries = config.get<int>("fetch_metadata_num_retries", 5);
    const size_t fetch_timeout = config.get<int>("fetch_metadata_timeout_s", 2);
    for (size_t connect_attempt = 0;
         connect_attempt < num_retries && meta.empty();
         connect_attempt++) {
      try {
        LOG_INFO(Logger::SENSOR,
                 __func__ << ": get_metadata attempt number "
                          << connect_attempt + 1);
        meta = sensor::get_metadata(*cli_tmp, fetch_timeout);
      } catch (const std::runtime_error& e) {
        // Log the exception text so the cause of the failure is not lost.
        LOG_INFO(Logger::SENSOR,
                 __func__ << ": Failed fetching sensor metadata: " << e.what());
      } catch (const std::invalid_argument& e) {
        LOG_INFO(Logger::SENSOR,
                 __func__ << ": Failed parsing sensor metadata: " << e.what());
      }
    }

    if (meta.empty()) {
      throw std::runtime_error(
          "Fetching or parsing sensor info json failed. Often, this indicates "
          "a network "
          "communication problem or that the sensor is currently starting");
    }

This mostly works, but when I run the same program under lldb, every attempt fails, no matter how many times I retry.

I recently upgraded the OS, but I don't know whether Apple's new LLVM version could have anything to do with this. How could I begin troubleshooting?

themightyoarfish commented 5 hours ago

I suspect this is an Apple bug that has come to haunt me now that I've upgraded the OS.

themightyoarfish commented 5 hours ago

On a related note, why does get_metadata() so often just return an empty string?
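
One way to narrow this down is to record why each attempt failed, not only that it failed. The following is a minimal sketch, not part of the Ouster SDK: `fetch_with_retries` is a hypothetical helper, and the pause between attempts is an assumption rather than something the SDK requires. It treats an empty string as a failed attempt and prints the message of any exception it catches.

```cpp
#include <chrono>
#include <exception>
#include <functional>
#include <iostream>
#include <string>
#include <thread>

// Hypothetical helper (not part of the Ouster SDK): calls `fetch` up to
// `max_attempts` times and reports the reason for every failed attempt.
std::string fetch_with_retries(const std::function<std::string()>& fetch,
                               int max_attempts,
                               std::chrono::milliseconds delay) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        try {
            std::string result = fetch();
            if (!result.empty()) return result;
            std::cerr << "attempt " << attempt << ": empty result\n";
        } catch (const std::exception& e) {
            // Keep the message: it is the only hint about what went wrong.
            std::cerr << "attempt " << attempt << ": " << e.what() << '\n';
        }
        // Pause before the next attempt, but not after the last one.
        if (attempt < max_attempts) std::this_thread::sleep_for(delay);
    }
    return {};  // every attempt failed
}
```

With the names from the snippet in the first comment, a call could look like `fetch_with_retries([&] { return sensor::get_metadata(*cli_tmp, fetch_timeout); }, 5, std::chrono::milliseconds(500))`.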

themightyoarfish commented 5 hours ago

When I use this code to do the HTTP request directly:

    auto sensor_http =
        ouster::sensor::util::SensorHttp::create("os-122307000738.local", 10);
    auto info = sensor_http->sensor_info(10);

I receive the following when running in the debugger:

    libc++abi: terminating due to uncaught exception of type std::runtime_error: CurlClient::execute_request failed for the url: [http://os-122307000738.local/api/v1/system/firmware] with the error message: Couldn't connect to server

Without the debugger, the program runs fine.

Meanwhile, curl on the command line works too:

    curl --request GET --url http://os-122307000738.local/api/v1/sensor/metadata/sensor_info

It seems that under lldb, I cannot make network connections?

themightyoarfish commented 4 hours ago

It also does not seem to be related to debugging entitlements. Running this has no effect:

    codesign -s - -v -f --entitlements =(echo -n '<?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "https://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
        <dict>
            <key>com.apple.security.get-task-allow</key>
            <true/>
        </dict>
    </plist>') <program>

themightyoarfish commented 4 hours ago

Update: This seems to be a problem with Apple's lldb. Homebrew's lldb (/opt/homebrew/Cellar/llvm/18.1.8/bin/lldb in my case) works 🤡

themightyoarfish commented 4 hours ago

What's really strange is that opening TCP connections from a C++ program works normally under lldb:

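    #include <cstdlib>  // std::system
    #include <iostream>

    int main() {
      // nc exits with status 0 if the TCP connection to port 80 succeeds.
      int x = std::system("nc -z os-122307000738.local 80 > /dev/null 2>&1");
      if (x == 0) {
        std::cout << "success\n";
      } else {
        std::cout << "failed\n";
      }
    }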
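
One caveat about the check above: `system()` runs `nc` in a child process started through the shell, so the connection is not opened by the debugged process itself. A minimal sketch of an in-process check with plain POSIX sockets follows. `probe_tcp` is a hypothetical helper name, and the idea that separating name resolution from the connect step will localize the failure is an assumption, not a confirmed diagnosis.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical diagnostic helper: resolves `host` and attempts a TCP connect
// from inside the calling process, returning a human-readable outcome.
std::string probe_tcp(const std::string& host, const std::string& port) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // accept IPv4 and IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP

    addrinfo* results = nullptr;
    const int rc = getaddrinfo(host.c_str(), port.c_str(), &hints, &results);
    if (rc != 0) {
        return std::string("resolve failed: ") + gai_strerror(rc);
    }

    std::string outcome = "connect failed: no usable address";
    for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        const int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) {
            outcome = std::string("socket failed: ") + std::strerror(errno);
            continue;
        }
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
            close(fd);
            outcome = "connected";
            break;
        }
        // Read errno before close() can overwrite it.
        outcome = std::string("connect failed: ") + std::strerror(errno);
        close(fd);
    }
    freeaddrinfo(results);
    return outcome;
}
```

Printing `probe_tcp("os-122307000738.local", "80")` from `main` and running the binary with and without lldb would show whether the failure happens while resolving the `.local` name or while connecting.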