FireCracker: random port number calculation should be more robust

schaefi commented 1 year ago

The following code from firecracker-pilot/src/firecracker.rs can be improved as written in the fixme note

pub fn get_exec_port() -> u32 {
    /*!
    Create random execution port
    !*/
    let mut random = rand::thread_rng();
    // FIXME: A more stable version
    // should check for already running socket connections
    // and if the same number is used for an already running one
    // another rand should be called
    let exec_port = random.gen_range(49200..60000);
    exec_port
}

schaefi commented 1 year ago

going to look into this on the next suse hackweek which is coming soon

Ichmed commented 1 year ago

@m-kat was looking into it, we were thinking about writing a kernel module that takes care of assigning the port numbers. Basically just a Semaphor with a get_next_port() method.

schaefi commented 1 year ago

@m-kat was looking into it

well, it felt pretty orphan to me. I remember we discussed how to solve this 2 month ago. As nothing happened by then I assumed the assignee statement is orphan

we were thinking about writing a kernel module that takes care of assigning the port numbers

a kernel module ? That sounds over engineered

Basically just a Semaphor with a get_next_port() method.

An acceptable stable solution could be imho much simpler. Every vsock connection initiated by a firecracker pilot currently creates a socket file named:

let vsock_uds_path = format!(
    "/run/sci_cmd_{}.sock", get_meta_name(program_name)
);

I would change this name to also contain the assigned exec_port

let vsock_uds_path = format!(
    "/run/sci_cmd_{}_{}.sock", get_meta_name(program_name), exec_port
);

Next I would change the method get_exec_port() in a way that it starts with a number X walks through the existing sockets, checks if that number X is already taken and increases the value by X+1 until a free port number was found.

So we get rid of the random port assignment and should be safe in terms of already taken vsock ports used by the firecracker pilot.

Of course there could be other processes in the system that occupies a vsock port and firecracker-pilot doesn't know about them. For such cases I suggest to also implement an option to allow to specify the communication port at call time in a similar fashion as it already exists for the podman-pilot.

Hard coupling the functionality of the firecracker-pilot to a kernel-module that finds an unused vsock port will be a generic solution but introduces a lot of other headaches like you are creating a coupling to the host kernel when the idea was to exactly prevent that with a VM. Also the use case for firecracker-pilot is on embedded systems to isolate apps from the host kernel. That there are other processes on the host running that conflicts with firecracker and its vsocks is imho unlikely and also questionable design. If for some reason different solutions on the host runs that uses vsocks it should also be a task of the integrator of such a system that the port range for the vsocks is not in conflict with other processes.

From that perspective I don't see where the investment in an extra kernel module offers more advantages than disadvantages

my2cents

schaefi commented 1 year ago

You can still go with a kernel module exposing a semaphore and use it if it exists as additional way implemented in get_exec_port(). But I strongly recommend that a first solution should follow the keep-it-simple pragma and there should be by no means a hard coupling to such a kernel module, optional ok and nice but not required

Elektrobit / flake-pilot

FireCracker: random port number calculation should be more robust #114