network-analytics / mdt-dialout-collector

Model-Driven Telemetry - Collecting <multi-vendor> metrics via gRPC dialout
MIT License

Problem connecting to interface in container #27

Closed: sgaragan closed this issue 3 months ago

sgaragan commented 4 months ago

We have already dealt with this after some conversations with Salvatore et al., but we wanted to open an issue to track it formally.

When running pmtelemetryd with the mdt-dialout-collector library, the code cannot bind to a network interface defined by the 'iface' config key when the container is not running as root (which is a security requirement for our environment). Salvatore recommended trying to comment out the following code in src/core/mdt-dialout-core.cc:

    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE,
        iface.c_str(), strlen(iface.c_str())) != 0) {
        spdlog::get("multi-logger")->
            error("[CustomSocketMutator()]: Unable to bind [{}] "
            "on the configured socket(s)", iface);
        std::abort();
    }

Removing this code worked and allowed the daemon to start up without any issues.
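For anyone hitting the same failure, a small standalone probe can show which errno the call returns before patching anything. This is only a sketch: `try_bind_to_device` and `describe_bind_error` are hypothetical names, not part of mdt-dialout-collector. It also rests on an assumption this thread does not confirm, namely that the failure is the usual Linux restriction where SO_BINDTODEVICE needs CAP_NET_RAW (enforced on older kernels, relaxed on newer ones).

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical probe (not collector code): open a TCP socket, attempt
// SO_BINDTODEVICE on it, and return 0 on success or the errno of the
// call that failed.
int try_bind_to_device(const std::string &iface) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return errno;
    }
    int err = 0;
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, iface.c_str(),
                   static_cast<socklen_t>(iface.size())) != 0) {
        err = errno;  // save before close() can overwrite it
    }
    close(fd);
    return err;
}

// Map the common errno values to a short hint; anything else falls
// back to the system's own message.
std::string describe_bind_error(int err) {
    switch (err) {
    case 0:
        return "ok";
    case EPERM:
        return "permission denied (assumed: missing CAP_NET_RAW)";
    case ENODEV:
        return "no such interface";
    default:
        return std::strerror(err);
    }
}
```

Calling `describe_bind_error(try_bind_to_device("eth0"))` inside the non-root container (with whatever interface name 'iface' is set to) should help distinguish a permissions problem from a wrong interface name.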

Could the 'iface' config key be made optional (or ignored) when deploying into a container on Kubernetes/OpenShift? This would probably also require a start argument or config key to indicate a container deployment.

Thanks, Sean

scuzzilla commented 4 months ago

@sgaragan, thanks for your request.

Checking out the with_container branch will allow you to configure an additional option that bypasses the SO_BINDTODEVICE check:

    iface = "dummy";
    so_bindtodevice_check = "false";

    # other options...

I kindly ask you to perform some functional and regression testing before I proceed with merging the new code into the main branch.
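To illustrate what such an option does, here is a minimal sketch of gating the bind behind a flag. It is not the code from the with_container branch: `apply_iface_binding` and `BindOutcome` are hypothetical names, the flag is passed as a `bool` rather than parsed from the `"false"` string in the config, and logging goes to `std::cerr` instead of spdlog to keep the example self-contained.

```cpp
#include <cerrno>
#include <cstring>
#include <iostream>
#include <string>
#include <sys/socket.h>

enum class BindOutcome { Bound, Skipped, Failed };

// Hypothetical helper (assumption, not the collector's implementation):
// bind fd to iface only when the check is enabled.
BindOutcome apply_iface_binding(int fd, const std::string &iface,
                                bool check_enabled) {
    if (!check_enabled) {
        // Skip the privileged call entirely; iface may be a placeholder.
        std::cerr << "[warning] SO_BINDTODEVICE check disabled\n";
        return BindOutcome::Skipped;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, iface.c_str(),
                   static_cast<socklen_t>(iface.size())) != 0) {
        std::cerr << "[error] unable to bind [" << iface
                  << "]: " << std::strerror(errno) << "\n";
        return BindOutcome::Failed;
    }
    return BindOutcome::Bound;
}
```

Returning an outcome instead of calling `std::abort()` leaves the decision to the caller, so a daemon can still treat `Failed` as fatal while a container deployment continues after `Skipped`.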

Thanks, Salvatore.

sgaragan commented 4 months ago

I grabbed the with_container branch, rebuilt the pmtelemetryd daemon, and added the new config option. The server started with the expected log messages:

    [2024-05-24 15:28:03.457] [multi-logger] [debug] constructor: CustomSocketMutator()
    [2024-05-24 15:28:03.457] [multi-logger] [debug] constructor: CustomSocketMutator()
    INFO ( pmtelemetryd-grpc/core ): maximum telemetry peers allowed: 100
    INFO ( pmtelemetryd-grpc/core ): telemetry peers timeout: 300
    [2024-05-24 15:28:03.457] [multi-logger] [warning] [CustomSocketMutator()]: SO_BINDTODEVICE check disabled
    [2024-05-24 15:28:03.457] [multi-logger] [warning] [CustomSocketMutator()]: SO_BINDTODEVICE check disabled
    [2024-05-24 15:28:03.457] [multi-logger] [debug] Socket Type: SOCK_STREAM (TCP)
    [2024-05-24 15:28:03.457] [multi-logger] [debug] Socket Type: SOCK_STREAM (TCP)
    [2024-05-24 15:28:03.457] [multi-logger] [debug] SO_KEEPALIVE: Disabled
    [2024-05-24 15:28:03.457] [multi-logger] [debug] SO_REUSEPORT: Enabled

The gRPC messages are still flowing in as expected.

scuzzilla commented 3 months ago

@sgaragan, many thanks for the testing and feedback. I'm going to merge the new code into main ASAP.

scuzzilla commented 3 months ago

@sgaragan, the new code is now included in the main branch.