k8snetworkplumbingwg / ovs-cni

Open vSwitch CNI plugin
Apache License 2.0
224 stars 71 forks source link

OVS Traffic Mirroring Proposal #227

Closed l-rossetti closed 2 years ago

l-rossetti commented 2 years ago

This is a proposal for supporting the traffic mirroring feature of OVS into ovs-cni plugin. The topic has been initially discussed in issue 219. The goal is to create and manage multiple mirror ports through either ovs-cni or a dedicated cni-plugin in Network Configuration List (Multus) as defined in CNI spec 0.4.0.

Added support for port mirroring
kubevirt-bot commented 2 years ago

Hi @l-rossetti. Thanks for your PR.

I'm waiting for a k8snetworkplumbingwg member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
phoracek commented 2 years ago

Hey @l-rossetti, just wanted to check - is there any new development on this? Any way I can help?

l-rossetti commented 2 years ago

Hey @l-rossetti, just wanted to check - is there any new development on this? Any way I can help?

Hi @phoracek! We just updated the PR based on the discussion. Could please review the proposal again? If it looks ok, we can draft a first implementation of the two plugins: ovs-mirror-producer, ovs-mirror-consumer. If you agree, we can start by focusing only on the cmdAdd, leaving cmdCheck and cmdDel for a second round.

We still have a doubt about the ingress and egress keywords: in the NAD should we consider ingress and egress from the point of view of OVS or the point of view of the Pod? OVS-DB specification states that:

select_dst_port: set of weak reference to Ports
     Ports on which departing packets are selected for mirroring.
select_src_port: set of weak reference to Ports
     Ports on which arriving packets are selected for mirroring.

But here we could take the point of view of the pod if it makes sense.

Moreover, where do you suggest to put the code? Is it ok if we create two public repos in our organization with the initial draft? Do you suggest to start with only one repo for both consumer and producer, or two dedicated repo? Anyway, please, feel free to suggest what you think is more suitable to everyone.

phoracek commented 2 years ago

Apologies for my late reply.

Thanks for the update. I left only some suggestions to fix formatting, but the proposal sounds good to me.

If you find ingress and egress problematic, we can stick to OVS' naming, that works fine.

About the code, for me the ideal process would be following:

  1. You can create a quick PoC, ignoring the code quality (use bash if it helps you get it done faster), just to get a demo running to see if it will all work and get a feeling for it.
  2. Extend this repository with the new functionality. Unlike the PoC, this will require well organized code, e2e tests, documentation. We can split this step to cmdAdd, cmdDel, cmdCheck. Or even start by implementing just a subset of the API. The only condition is that after each merge, we will be able to exercise and test the code.

I think that it would be useful to keep the two new CNIs under this project, to leverage the existing infrastructure. But if you'd prefer to keep it outside, I think that at least keeping both consumer and producer under the same roof would be better over splitting it.

That being said, if you'd prefer a different workflow, I'm totally happy to discuss that.

l-rossetti commented 2 years ago

Hi @phoracek, we ended up with a new repository as a quick PoC containing the implementation for both mirror-producer and mirror-consumer: auroralabs-eu/ovs-cni-mirror There's a minimal implementation for cmdAdd, cmdDel and cmdCheck.

Some notes:

CleanPorts workaround step-by-step

Doubts:

Do you have any more suggestion/idea to improve the implementation/integration?

phoracek commented 2 years ago

This looks great!

About the cleanup issue, would it help if we called the cleanup function on the beginning of every cmdAdd too? To make sure we start with a clean state. We should do that in both OVS CNI and the mirror CNI. If a node was forcefully restarted, DEL would not be called, so we should do it in ADD.

As for why it is not executed even in DEL, we execute the cleanup function only when something does not go as expected. If DEL progresses well, we should remove all the ports explicitly. Only if there is an issue during the DEL command, we resort to best effort cleanup using cleanPorts https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/pkg/plugin/plugin.go#L519. However, that means that leftovers from other contains would not get cleaned up. Perhaps we could unconditionally call cleanPorts on every DEL.

I don't know why CRI is not running DEL, but I would suspect that they rely on ADD being transactional - if we fail ADD, we should clean up after ourselves.

I can't think of a good solution for the convertToArray issue.

I did a high level review and it looks pretty straightforward. I have just a few notes/questions:

l-rossetti commented 2 years ago

About the cleanup issue, would it help if we called the cleanup function on the beginning of every cmdAdd too? To make sure we start with a clean state. We should do that in both OVS CNI and the mirror CNI. If a node was forcefully restarted, DEL would not be called, so we should do it in ADD.

Yes, we agree it is better to always call it at every ADD, knowing the state is always the cleanest possible. The thing is that the clean-up should be invoked only by ovs-cni plugins, because it involves ports/interfaces, while ovs-mirror plugin should treat only mirrors and should attach the ports created by ovs-cni to the mirror itself. Is shouldn't be its responsibility to delete ports/interfaces that it didn't create.

As for why it is not executed even in DEL, we execute the cleanup function only when something does not go as expected. If DEL progresses well, we should remove all the ports explicitly. Only if there is an issue during the DEL command, we resort to best effort cleanup using cleanPorts https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/pkg/plugin/plugin.go#L519. However, that means that leftovers from other contains would not get cleaned up. Perhaps we could unconditionally call cleanPorts on every DEL.

Ok , we understand the logic behind. But yes, as before, we may check the state of the db at each ovs-cni invocation. This would simplify also the logic built around the cleanup caller in ovs-cni. On the other side may seem overkill.

I don't know why CRI is not running DEL, but I would suspect that they rely on ADD being transactional - if we fail ADD, we should clean up after ourselves.

Probably we misunderstood the spec in case of chained plugin: can't find an explicit sentence specifying that a DEL for the upstream plugin (e.g. ovs-cni) of the chain must be called by the CRI in case the downstream one (e.g. ovs-mirror) fails.

On the other hand we found this in the plugin-delegation section of the SPEC: "If, on ADD, a delegated plugin fails, the "upper" plugin should execute again with DEL before returning failure."

So this doubt comes to our minds: are we sure ovs-cni-mirror should be implemented as a dedicated plugin instead of a delegated plugin (as done for the IPAM one)? Cause actually the ovs-cni-mirror will never be called by itself but it will be always in chain with ovs-cni, cause is responsibility of the latter to create the interface/port. The sentence of the spec I quoted above would allow us to avoid the call to the cleanup function in case of ovs-mirror failure. But in that case there are some other impacts to evaluate. What do you think?

Why does the consumer need previous interface? It makes sense for producers, where we want to sniff the traffic of the previous interface. But consumer interface should be not dependent on the previous one, no? I'd expect that I can create a consumer without connecting the container to the bridge https://github.com/auroralabs-eu/ovs-cni-mirror/blob/master/pkg/consumer/consumer.go#L94

It is needed because the consumer port/interface created by ovs-cni must be added as "output_port" in the mirror table. So we need the previousResult to be able to fetch the port reference and add it to the output_port column of the table here.

Is it not possible to have more than one consumer on a mirror? https://github.com/auroralabs-eu/ovs-cni-mirror/blob/master/pkg/consumer/consumer.go#L106

Unfortunately not, using 'output_port', cause OvsDB-Schema manual specifies at page 41 that "output_port" is an "optional weak reference to Port", not a "Set of weak references to Ports".

We see only two different options to have more than one consumer for the same tarffic:

About the latter option we excluded it in our proposal to fulfill the requirements of our project and we don't have enough knowledge about RSPAN. For sure this would be a nice to have improvement for the future.

Ks89 commented 2 years ago

We have a question about UUIDs, because here we are using the mirrorName as done in ovs-cni with portName. However, mirrorName is a value passed by the user via NAD (also with special characters like "-" which is accepted as mirror-name by ovs), so it must be validated and cleaned to be used as "Operation.UUIDName". Instead, in ovs-cni this is not required, because portName is generated from interfaceName.

How could we create a valid UUIDName for mirrors? For example we can create a random string with a fixed length and without special characters. Do you have a better solution? Thanks

l-rossetti commented 2 years ago

Hi @phoracek how are you doing? Any comment about the previous messages?

phoracek commented 2 years ago

Hello,

Ok , we understand the logic behind. But yes, as before, we may check the state of the db at each ovs-cni invocation. This would simplify also the logic built around the cleanup caller in ovs-cni. On the other side may seem overkill.

If this simplifies the code, I don't mind. I don't imagine it would have a terrible performance impact since we access the OVS database using the unix socket.

So this doubt comes to our minds: are we sure ovs-cni-mirror should be implemented as a dedicated plugin instead of a delegated plugin (as done for the IPAM one)? Cause actually the ovs-cni-mirror will never be called by itself but it will be always in chain with ovs-cni, cause is responsibility of the latter to create the interface/port. The sentence of the spec I quoted above would allow us to avoid the call to the cleanup function in case of ovs-mirror failure. But in that case there are some other impacts to evaluate.

In case of IPAM, the IPAM CNI returns the IP that should be configured. And then the actual IP assignment is done by the main CNI. That is unlike mirror CNI where the configuration is actually applied by the CNI.

The reference plugins under https://github.com/containernetworking/plugins/tree/main/plugins/meta are all relevant only when chained with some other plugin, like mirror. Those plugins probably suffer the same cleanup issue. Maybe we should raise that limitation as a bug of CNI, so it could be resolved for all?

I'm not totally against the idea of using delegates, but we should have a good reason to break this pattern.

About the latter option we excluded it in our proposal to fulfill the requirements of our project and we don't have enough knowledge about RSPAN. For sure this would be a nice to have improvement for the future.

+1 on keeping it simple for now with the first option.

How could we create a valid UUIDName for mirrors? For example we can create a random string with a fixed length and without special characters. Do you have a better solution?

Generating a random UUID sounds fine. I assume we can then keep the mirror name as other-config, or something similar?

If I missed any other question, please don't hesitate to bring it up.

Ks89 commented 2 years ago

Hi @phoracek following your suggestion here we merged this repo into ovs-cni.

The main changes with respect to the previous commit are:

To discuss:

phoracek commented 2 years ago
* integrate ovs-cni-mirror in your CI/CD, please could you provide more information about this topic?

We should get at least a silver bullet test verifying that the solution works into our e2e test suite, running on Kubernetes. All that should be required is to introduce a new module under tests/, similar to https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/tests/ovs_test.go.

Rest of the test cases may be covered in integration tests, similar to this one: https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/pkg/plugin/plugin_test.go. These can run without Kubernetes and are considerably faster to execute.

* do you think that the documentation `traffic-mirroring.md` is complete enough?

I like the contents of the docs, thanks for pictures and plenty of examples! The only extra thing I'd ask for is a sentence or two about each of the cases. I may have some minor formatting comments once I get to review it in detail.

* what is missing in this PR to be eligible for the merge?

Tests and thorough review. Once you post the test coverage, I would try it out locally and then review the whole PR. I skimmed through the code and it seems that wast majority of the changes only add new code and modules, leaving existing code intact, so I don't anticipate issues there.

phoracek commented 2 years ago

/test pull-e2e-ovs-cni

phoracek commented 2 years ago

/retest

phoracek commented 2 years ago

/retest

phoracek commented 2 years ago

/retest

phoracek commented 2 years ago

@Ks89 tests are green now \o/ let me know when it will be ready for a review

Ks89 commented 2 years ago

@Ks89 tests are green now \o/ let me know when it will be ready for a review

ok, thanks. I need to write some other tests, but I hope to complete everything next week.

Ks89 commented 2 years ago

/retest

kubevirt-bot commented 2 years ago

@Ks89: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to [this](https://github.com/k8snetworkplumbingwg/ovs-cni/pull/227#issuecomment-1210361988): >/retest Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
phoracek commented 2 years ago

/retest

Ks89 commented 2 years ago

Hi @phoracek, we added unit tests and e2e tests as requested and improved a little bit the code.

Some point to notice:

Warning  FailedKillPod  5s (x11 over 2m12s)  kubelet            error killing pod: failed to "KillPodSandbox" for "67e847f2-1a74-43c3-820b-c6a33e1df363" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for pod sandbox k8s_pod-prod-2_test-namespace_67e847f2-1a74-43c3-820b-c6a33e1df363_0(2bc2f7c62a00a1d5f9729d3436be2d9f754c6cb665526993ebb8a51356823402): error removing pod test-namespace_pod-prod-2 from CNI network \"multus-cni-network\": delegateDel: error invoking ConflistDel - \"ovs-net-prod\": conflistDel: error in getting result from DelNetworkList: netplugin failed with no error message"
phoracek commented 2 years ago

Awesome, thanks! I bumped the version of Multus via https://github.com/k8snetworkplumbingwg/ovs-cni/pull/238. I'll go over the PR and the rest of your points during this week.

/retest

phoracek commented 2 years ago

/retest

l-rossetti commented 2 years ago

/retest-required

kubevirt-bot commented 2 years ago

@l-rossetti: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to [this](https://github.com/k8snetworkplumbingwg/ovs-cni/pull/227#issuecomment-1217553902): >/retest-required Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
phoracek commented 2 years ago

/retest

phoracek commented 2 years ago

Running the tests locally, I noticed that removal of pods using the new CNI fail due to nil pointer dereference. That causes them to get stuck terminating.

@l-rossetti @Ks89 could you please look into that? It does not fail the test but since it gets the pod stuck for several minutes, it leads into a timout of the test execution.

   Warning  FailedKillPod   8s (x2 over 22s)  kubelet            (combined from similar events): error killing pod: failed to "KillPodSandbox" for "53299780-60c3-4a98-94bd-481bd3b10f5e" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for pod sandbox k8s_pod-prod-2_test-namespace_53299780-60c3-4a98-94bd-481bd3b10f5e_0(7f744c750ec5aa5778cac31419d701f90c66ee660792d6b96c7d326637f8244b): error removing pod test-namespace_pod-prod-2 from CNI network \"multus-cni-network\": delegateDel: error invoking ConflistDel - \"ovs-net-prod\": conflistDel: error in getting result from DelNetworkList: netplugin failed: \"2022/08/19 12:13:57 CNI DEL was called for container ID: 7f744c750ec5aa5778cac31419d701f90c66ee660792d6b96c7d326637f8244b, network namespace /var/run/netns/0d7b05ac-6229-4e18-8d07-c1a4b2d833b3, interface name net1, configuration: {\\\"bridge\\\":\\\"br-test\\\",\\\"cniVersion\\\":\\\"0.3.0\\\",\\\"mirrors\\\":[{\\\"egress\\\":true,\\\"ingress\\\":true,\\\"name\\\":\\\"mirror-1\\\"}],\\\"name\\\":\\\"ovs-net-prod\\\",\\\"type\\\":\\\"ovs-cni-mirror-producer\\\"}\\npanic: runtime error: invalid memory address or nil pointer dereference\\n[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x64341a]\\n\\ngoroutine 1 [running, locked to thread]:\\npanic({0x66e240, 0x893b40})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/panic.go:941 +0x397 fp=0xc000135b40 sp=0xc000135a80 pc=0x432c97\\nruntime.panicmem(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/panic.go:220\\nruntime.sigpanic()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/signal_unix.go:818 +0x336 fp=0xc000135b90 sp=0xc000135b40 pc=0x4485b6\\ngithub.com/k8snetworkplumbingwg/ovs-cni/pkg/mirror-producer.CmdDel(0xc000158ee0)\\n\\t/home/phoracek/code/ovs-cni/pkg/mirror-producer/producer.go:132 +0x1da fp=0xc000135cd0 sp=0xc000135b90 pc=0x64341a\\ngithub.com/containernetworking/cni/pkg/skel.(*dispatcher).checkVersionAndCall(0xc000135eb8, 0xc000158ee0, {0x725d08, 0xc00015e060}, 0x6da910)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/containernetworking/cni/pkg/skel/skel.go:166 +0x20a fp=0xc000135d60 sp=0xc000135cd0 pc=0x524c8a\\ngithub.com/containernetworking/cni/pkg/skel.(*dispatcher).pluginMain(0xc000135eb8, 0x894f60?, 0xc000135ea0?, 0x44d489?, {0x725d08, 0xc00015e060}, {0xc00001c2a0, 0x25})\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/containernetworking/cni/pkg/skel/skel.go:242 +0x287 fp=0xc000135e58 sp=0xc000135d60 pc=0x5251a7\\ngithub.com/containernetworking/cni/pkg/skel.PluginMainWithError(...)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/containernetworking/cni/pkg/skel/skel.go:275\\ngithub.com/containernetworking/cni/pkg/skel.PluginMain(0x6ade80?, 0x10?, 0xc000135f50?, {0x725d08?, 0xc00015e060?}, {0xc00001c2a0?, 0x405bb9?})\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/containernetworking/cni/pkg/skel/skel.go:290 +0xd1 fp=0xc000135f08 sp=0xc000135e58 pc=0x525791\\nmain.main()\\n\\t/home/phoracek/code/ovs-cni/cmd/mirror/producer/main.go:27 +0x153 fp=0xc000135f80 sp=0xc000135f08 pc=0x643c93\\nruntime.main()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:250 +0x212 fp=0xc000135fe0 sp=0xc000135f80 pc=0x435812\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000135fe8 sp=0xc000135fe0 pc=0x461981\\n\\ngoroutine 2 [force gc (idle)]:\\nruntime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000034fb0 sp=0xc000034f90 pc=0x435bd6\\nruntime.goparkunlock(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:367\\nruntime.forcegchelper()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:301 +0xad fp=0xc000034fe0 sp=0xc000034fb0 pc=0x435a6d\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000034fe8 sp=0xc000034fe0 pc=0x461981\\ncreated by runtime.init.6\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:289 +0x25\\n\\ngoroutine 3 [GC sweep wait]:\\nruntime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000035790 sp=0xc000035770 pc=0x435bd6\\nruntime.goparkunlock(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:367\\nruntime.bgsweep(0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgcsweep.go:278 +0x8e fp=0xc0000357c8 sp=0xc000035790 pc=0x422cee\\nruntime.gcenable.func1()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:177 +0x26 fp=0xc0000357e0 sp=0xc0000357c8 pc=0x4188e6\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000357e8 sp=0xc0000357e0 pc=0x461981\\ncreated by runtime.gcenable\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:177 +0x6b\\n\\ngoroutine 4 [runnable]:\\nruntime.(*pageAlloc).scavenge(0x8b8c28, 0x10000)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgcscavenge.go:441 +0x15c fp=0xc000035f20 sp=0xc000035f18 pc=0x420e3c\\nruntime.bgscavenge(0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgcscavenge.go:336 +0x32e fp=0xc000035fc8 sp=0xc000035f20 pc=0x420bce\\nruntime.gcenable.func2()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:178 +0x26 fp=0xc000035fe0 sp=0xc000035fc8 pc=0x418886\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000035fe8 sp=0xc000035fe0 pc=0x461981\\ncreated by runtime.gcenable\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:178 +0xaa\\n\\ngoroutine 5 [finalizer wait]:\\nruntime.gopark(0x0?, 0xc000034670?, 0x70?, 0x47?, 0x442611?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000034630 sp=0xc000034610 pc=0x435bd6\\nruntime.goparkunlock(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:367\\nruntime.runfinq()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mfinal.go:177 +0xb3 fp=0xc0000347e0 sp=0xc000034630 pc=0x417993\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000347e8 sp=0xc0000347e0 pc=0x461981\\ncreated by runtime.createfing\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mfinal.go:157 +0x45\\n\\ngoroutine 6 [IO wait]:\\nruntime.gopark(0xc000101520?, 0xc00002d900?, 0x20?, 0x3c?, 0x49d342?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000093bb0 sp=0xc000093b90 pc=0x435bd6\\nruntime.netpollblock(0xc000200000?, 0x3e00?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/netpoll.go:522 +0xf7 fp=0xc000093be8 sp=0xc000093bb0 pc=0x42e997\\ninternal/poll.runtime_pollWait(0x7ff0c2782198, 0x72)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/netpoll.go:302 +0x89 fp=0xc000093c08 sp=0xc000093be8 pc=0x45cec9\\ninternal/poll.(*pollDesc).wait(0xc00010e380?, 0xc000200000?, 0x0)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/internal/poll/fd_poll_runtime.go:83 +0x32 fp=0xc000093c30 sp=0xc000093c08 pc=0x4af4b2\\ninternal/poll.(*pollDesc).waitRead(...)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/internal/poll/fd_poll_runtime.go:88\\ninternal/poll.(*FD).Read(0xc00010e380, {0xc000200000, 0x3e00, 0x3e00})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/internal/poll/fd_unix.go:167 +0x25a fp=0xc000093cb0 sp=0xc000093c30 pc=0x4afe1a\\nnet.(*netFD).Read(0xc00010e380, {0xc000200000?, 0xc000706090?, 0xc000093d60?})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/net/fd_posix.go:55 +0x29 fp=0xc000093cf8 sp=0xc000093cb0 pc=0x4ee1c9\\nnet.(*conn).Read(0xc00000e6a8, {0xc000200000?, 0x4d218d?, 0x0?})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/net/net.go:183 +0x45 fp=0xc000093d40 sp=0xc000093cf8 pc=0x4f8f05\\nencoding/json.(*Decoder).refill(0xc00010ac80)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/encoding/json/stream.go:165 +0x17f fp=0xc000093d90 sp=0xc000093d40 pc=0x4d5c1f\\nencoding/json.(*Decoder).readValue(0xc00010ac80)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/encoding/json/stream.go:140 +0xbb fp=0xc000093de0 sp=0xc000093d90 pc=0x4d581b\\nencoding/json.(*Decoder).Decode(0xc00010ac80, {0x657860, 0xc000012700})\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/encoding/json/stream.go:63 +0x78 fp=0xc000093e10 sp=0xc000093de0 pc=0x4d5558\\ngithub.com/cenkalti/rpc2/jsonrpc.(*jsonCodec).ReadHeader(0xc0000126e0, 0xc000180000, 0xc000180018)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/cenkalti/rpc2/jsonrpc/jsonrpc.go:94 +0x85 fp=0xc000093e78 sp=0xc000093e10 pc=0x639ac5\\ngithub.com/cenkalti/rpc2.(*Client).readLoop(0xc000158fc0)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/cenkalti/rpc2/client.go:87 +0x11b fp=0xc000093fb0 sp=0xc000093e78 pc=0x63733b\\ngithub.com/cenkalti/rpc2.(*Client).Run(0x0?)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/cenkalti/rpc2/client.go:62 +0x19 fp=0xc000093fc8 sp=0xc000093fb0 pc=0x6371f9\\ngithub.com/ovn-org/libovsdb/client.(*ovsdbClient).createRPC2Client.func3()\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:238 +0x26 fp=0xc000093fe0 sp=0xc000093fc8 pc=0x63c7c6\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000093fe8 sp=0xc000093fe0 pc=0x461981\\ncreated by github.com/ovn-org/libovsdb/client.(*ovsdbClient).createRPC2Client\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:238 +0x3e5\\n\\ngoroutine 7 [GC worker (idle)]:\\nruntime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000036f58 sp=0xc000036f38 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc000036fe0 sp=0xc000036f58 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000036fe8 sp=0xc000036fe0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 8 [GC worker (idle)]:\\nruntime.gopark(0x6d32f8b6d0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000037758 sp=0xc000037738 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc0000377e0 sp=0xc000037758 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000377e8 sp=0xc0000377e0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 9 [GC worker (idle)]:\\nruntime.gopark(0x8d05e0?, 0x1?, 0x5c?, 0xaa?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000037f58 sp=0xc000037f38 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc000037fe0 sp=0xc000037f58 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000037fe8 sp=0xc000037fe0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 10 [runnable]:\\nruntime.gcMarkDone()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:775 +0x2be fp=0xc000030758 sp=0xc000030750 pc=0x4198de\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1331 +0x2c5 fp=0xc0000307e0 sp=0xc000030758 pc=0x41aba5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000307e8 sp=0xc0000307e0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 11 [GC worker (idle)]:\\nruntime.gopark(0x8d05e0?, 0x2?, 0x1e?, 0xaa?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000030f58 sp=0xc000030f38 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc000030fe0 sp=0xc000030f58 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000030fe8 sp=0xc000030fe0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 12 [GC worker (idle)]:\\nruntime.gopark(0x6d32f8eb8a?, 0x3?, 0xcf?, 0x4b?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000031758 sp=0xc000031738 pc=0x435bd6\\nruntime.gcBgMarkWorker()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1207 +0xe5 fp=0xc0000317e0 sp=0xc000031758 pc=0x41a9c5\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000317e8 sp=0xc0000317e0 pc=0x461981\\ncreated by runtime.gcBgMarkStartWorkers\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/mgc.go:1131 +0x25\\n\\ngoroutine 13 [chan receive]:\\nruntime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000031e40 sp=0xc000031e20 pc=0x435bd6\\nruntime.chanrecv(0xc000062300, 0x0, 0x1)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/chan.go:577 +0x56c fp=0xc000031ed0 sp=0xc000031e40 pc=0x40620c\\nruntime.chanrecv1(0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/chan.go:440 +0x18 fp=0xc000031ef8 sp=0xc000031ed0 pc=0x405c78\\ngithub.com/ovn-org/libovsdb/client.(*ovsdbClient).handleDisconnectNotification(0xc000012630)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:504 +0x57 fp=0xc000031fc8 sp=0xc000031ef8 pc=0x63dfb7\\ngithub.com/ovn-org/libovsdb/client.(*ovsdbClient).connect.func4()\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:220 +0x26 fp=0xc000031fe0 sp=0xc000031fc8 pc=0x63c226\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc000031fe8 sp=0xc000031fe0 pc=0x461981\\ncreated by github.com/ovn-org/libovsdb/client.(*ovsdbClient).connect\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:220 +0x8b3\\n\\ngoroutine 14 [select]:\\nruntime.gopark(0xc0000326f0?, 0x2?, 0x0?, 0x0?, 0xc0000326dc?)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/proc.go:361 +0xd6 fp=0xc000032550 sp=0xc000032530 pc=0x435bd6\\nruntime.selectgo(0xc0000326f0, 0xc0000326d8, 0x0?, 0x0, 0x0?, 0x1)\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/select.go:328 +0x772 fp=0xc000032690 sp=0xc000032550 pc=0x444bf2\\ngithub.com/ovn-org/libovsdb/cache.(*eventProcessor).Run(0xc000241e00, 0xc0000622a0)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/cache/cache.go:573 +0xae fp=0xc0000327a0 sp=0xc000032690 pc=0x555b4e\\ngithub.com/ovn-org/libovsdb/cache.(*TableCache).Run(0x0?, 0x0?)\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/cache/cache.go:480 +0x1d fp=0xc0000327c0 sp=0xc0000327a0 pc=0x5554bd\\ngithub.com/ovn-org/libovsdb/client.(*ovsdbClient).connect.func5()\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:221 +0x2a fp=0xc0000327e0 sp=0xc0000327c0 pc=0x63c1ca\\nruntime.goexit()\\n\\t/home/phoracek/code/ovs-cni/build/_output/bin/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0000327e8 sp=0xc0000327e0 pc=0x461981\\ncreated by github.com/ovn-org/libovsdb/client.(*ovsdbClient).connect\\n\\t/home/phoracek/code/ovs-cni/vendor/github.com/ovn-org/libovsdb/client/client.go:221 +0x933\\n\""

I think it comes from: https://github.com/k8snetworkplumbingwg/ovs-cni/blob/23b80f28c0a5ed076e41056aa2f652831c66fee1/pkg/mirror-producer/producer.go#L132

phoracek commented 2 years ago

/retest

phoracek commented 2 years ago

/retest

phoracek commented 2 years ago

/retest

Ks89 commented 2 years ago

Ok, tests are ok now.

phoracek commented 2 years ago

Awesome! I'll get to review the code now.

Ks89 commented 2 years ago

@phoracek We pushed all changes with fixes and code refactoring. There are some requests above with comments to ask more information.

phoracek commented 2 years ago

/retest

Ks89 commented 2 years ago

Now there are these two errors:

/tmp/src/github.com/k8snetworkplumbingwg/ovs-cni/.gopath/src/github.com/k8snetworkplumbingwg/ovs-cni/pkg/testhelpers/mirror.go:13:2: should not use dot imports
/tmp/src/github.com/k8snetworkplumbingwg/ovs-cni/.gopath/src/github.com/k8snetworkplumbingwg/ovs-cni/pkg/testhelpers/mirror.go:14:2: should not use dot imports

I copied this code

    . "github.com/onsi/ginkgo"
    . "github.com/onsi/gomega"

from plugin_test.go.

Why is it a problem in testhelpers/mirror.go and not in plugin_test.go? Probably it's related to the file name "_test.go". How can we fix this? Should we use testhelpers/mirror_test.go or skip that file in the golint command?


Edit:

Would you mind removing the dot and referencing to those packages explicilty? e.g. gomega.Eventually

Ks89 commented 2 years ago

yes no problem, I can remove the dots

phoracek commented 2 years ago

Oh :D sorry, I edited your message instead of responding to it...

Ks89 commented 2 years ago

Oh :D sorry, I edited your message instead of responding to it...

updated removing the dot imports and fixed a stupid bug in e2e tests after renaming ovs-cni-mirrror- into ovs-mirror-.

Also added a big comment to the discussion above https://github.com/k8snetworkplumbingwg/ovs-cni/pull/227#discussion_r960739841

phoracek commented 2 years ago

/retest

phoracek commented 2 years ago

/retest

phoracek commented 2 years ago

Thanks so much @l-rossetti @Ks89 this is really awesome. I appreciate the extra work you put into this. Kudos!

/lgtm /approve

kubevirt-bot commented 2 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: l-rossetti, phoracek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/OWNERS)~~ [phoracek] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment