containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0
23.84k stars 2.42k forks source link

Ordering of `--network` flags is disregarded #12850

Open dpward opened 2 years ago

dpward commented 2 years ago

/kind bug

When creating a container, multiple CNI networks can be connected to it, using this syntax (or alternatives):

podman container create --network=cni-podman2 --network=cni-podman3 --network=cni-podman1

However, the ordering is disregarded when parsing the network flags. This happens because it returns the network names as keys in an unordered map. So the networks are always added to the container in alphabetical order, i.e.

The expected behavior is that

The most obvious way to accomplish this is to add a fourth return value from ParseNetworkFlags(), which is a slice containing the network names in the order they are parsed. (However, there may be other solutions.)

cc: @Luap99

Luap99 commented 2 years ago

Is that actually an issue, there was never a guarantee that the interface name is consistent with the ordering of the network names. That this worked before is more of an implementation detail than a feature.

I don't think we should fix this but if you really need predictable interface names you can use --network net1:interface_name=eth0 and so on.

dpward commented 2 years ago

This has never worked, but I think we should fix this. It's possible to re-assign the interface names as you suggested, but it is still reasonable to expect that the interfaces are added to the container in the order given.

Does the solution I described seem like the best approach here? Rather than changing the existing types?

Luap99 commented 2 years ago

You can already set the interface name in the types.PerNetworkOptions struct so you do not need to alter the function signature at all. But I still do not think we want this. What is your actual use case that would require this?

dpward commented 2 years ago

You can do that - but still if you don't, we would expect the interface ordering on the command line to be preserved. This affects the ifindex as well. (We are running routing software in containers.)

Could you please be a bit more clear about why this is a bad idea, rather than something that simply is not implemented?

Generally speaking, we can infer from other command line options to podman that ordering matters. Notice the handling of --env-file and --hooks-dir.

Luap99 commented 2 years ago

It's not a bad idea I just don't think that guaranteeing this is worth the extra maintenance in the long run. The naming thing can be fixed easily and we could add a test to prevent regressions but this will not fix the ifindex. In the backend we use maps so if we loop over the map the ordering is not deterministic.

dpward commented 2 years ago

It would be very useful to me. I've examined the backend implementation. All we would need to do is to have a slice containing the interface names in the correct order, and iterate over that to obtain the values from the map in the order we want.

Does that sound reasonable if I prepare a PR?

Luap99 commented 2 years ago

Are you talking about only the naming or also the ifindex?

If you need the stable ifindex, what would be the correct order, interface names sorted ascending? I am not sure were you looked to implement this but if you do so you have to do it for both cni and netavark. What happens with podman network connect/disconnect? I still do not think the added complexity is worth the extra maintenance.

Luap99 commented 2 years ago

@mheon WDYT?

dpward commented 2 years ago

I'm talking about just adding the interfaces in the order they appear on the command line:

mheon commented 2 years ago

Would the ability to manually set interface names (e.g. --net cni-podman1:ifname=eth0 --net cni-podman2:ifname=eth1 ...) be acceptable? I'm a little iffy on making ordering in the CLI a critical part of the interface, but I definitely see how consistent interface naming is useful

dpward commented 2 years ago

This wouldn't achieve the same effect (see previous comment).

Here's one example. The ip link command displays interfaces in ifindex order. If we add interfaces to the container in the order they appear on the command line, then this command would list the interfaces that order. For containers with dozens of interfaces, this makes a difference. There is no way to achieve this using --net cni-podman1:ifname=eth0.

(The exception here is host-device networks, which will preserve the existing ifindex from the host.)

dpward commented 2 years ago

As I mentioned earlier, we already make command line ordering critical for other things: --env-file, --hooks-dir, etc. So this wouldn't be new territory.

Luap99 commented 2 years ago

I am not sure how you want to fix this but it is not possible to keep the cli order in the backend since it is stored as a map. You could sort by the interface name but this would still not match the cli ordering if I would use --network net1:interface_name=eth1 --network net2:interface_name=eth0.

Just the naming change is simple if that is enough for you.

mheon commented 2 years ago

I suppose we could automatically add interface_name in order when we first parse into the maps (if it's not explicitly set)

Luap99 commented 2 years ago

@mheon Yes that is easy but he also wants the ifindex to be in the same order and this depends on the order in which the interfaces are created

mheon commented 2 years ago

Ahh. Yeah, that would be more difficult.

dpward commented 2 years ago

In the libpod API, we already have the cni_networks attribute, which is an ordered list of network names to be used in the container. If we specify that, then we can add the networks in that order.

The documentation says that the new Networks attribute is supposed to be used in favor of that, although cni_networks remains for compatibility. I'm actually a bit surprised we wound up with both Networks and network_options here as the result: those are both maps, where the key is the network name, and the value contains options specific to that network. It seems like it would have made more sense to use network_options to carry the attributes found in Networks; or to provide a new type that is designed to replace network_options and not cni_networks. (I realize I haven't been following the development discussions here)

Luap99 commented 2 years ago

The API level doesn't matter here, we are talking about the backend. It is stored as map in the db so there is no way to keep the order. We would need to change the types and db to make this work which is not worth the effort also podman 4.0 is about to be released very soon so there is no time to change this.

cni_networks is for backwards compatibility only network_options is only used for slirp4netns at the moment, I could have used that for the networks and there is no particular reason why I didn't (maybe that would have made more sense in hindsight I don't know)

Note that just looking at ip link output is not an argument for me. We can make the interface naming change but the ifindex things is not a simple as you think.

dpward commented 2 years ago

I realize it's a bit more involved than at first look, but I'd still like to figure out what a workable solution would be here looking ahead to future releases, so that we can add interfaces in command line order.

To provide another example, let's say I have two host-device networks. If attaching the first one fails, then podman won't attempt to attach the second. When these networks are returned to the host, they don't have the same configuration they had originally. Having determinism in the order that interfaces are added to the container would again be useful here. Just renaming the interfaces after the fact overlooks these kinds of details that may matter in certain cases.

daiaji commented 2 years ago

This should be helpful for using soft routing in containers.

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 2 years ago

@Luap99 Is this still an issue?

Luap99 commented 2 years ago

Yes

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 2 years ago

@Luap99 What do you think about doing something like the following patch and retaining the order that the user added the objects.

order.patch.txt

Luap99 commented 2 years ago

I am not a huge fan of adding a new field for this, this will everything more complicated, this can be set by api users so we would need to check for invalid user input. IMO if we want to make it deterministic we should sort by the interface name, then we can apply the names in the frontent in the order which --network was given. But adding the names in the frontent would also conflict with https://github.com/containers/podman/issues/13174

rhatdan commented 2 years ago

Let discuss this on Monday with @mheon at standup.

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 2 years ago

@Luap99 @mheon We never followed up on this. What do you want to do with this one?

mheon commented 2 years ago

I'm OK with fixing but IMO this is low priority. Exact way to implement still requires some discussion.

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 1 year ago

Since this was low priority and no one ever acted on it, I am going to close. Reopen if you want to work on it.

dpward commented 1 year ago

@rhatdan Please keep this open. I still very much have a need for this. My organization has active support agreements from Red Hat, if we need to coordinate this through our TAM.

Luap99 commented 1 year ago

I agree that is a valid feature request, it is just that work for it seems rather high. If you have support agreements then please use them.

livelace commented 2 months ago

it affects my setup, i need specific network interfaces be placed into specific bridges for multus' proper network mapping. i was confident that --network arg is ordered.