Closed: nirs closed this issue 7 months ago
@nirs, so after restarting the gateway pods on both clusters the connection recovered, right?
Right
I think I understand the root cause: it seems that the ip rules and ip route tables 100 and 150 don't exist on cluster dr2.
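One way to check this hypothesis is to dump the policy routing state on each cluster node and compare the healthy cluster with the broken one. This is a minimal sketch: the helper name `show_policy_routing` is hypothetical, and it assumes that the minikube profile names match the cluster names (dr1, dr2) and that `ip` can be run on the node through `minikube ssh`.

```shell
# Sketch only: the helper name, the profile names, and running ip(8) via
# `minikube ssh` are assumptions about this setup, not part of the report.
show_policy_routing() {
    profile=$1
    echo "== ip rule ($profile) =="
    minikube -p "$profile" ssh -- ip rule list
    for table in 100 150; do
        echo "== ip route table $table ($profile) =="
        # A missing table may show up as an error or as empty output,
        # so do not abort on a non-zero exit code here.
        minikube -p "$profile" ssh -- ip route show table "$table" || true
    done
}

# Example (not run here): compare the two clusters
#   show_policy_routing dr1
#   show_policy_routing dr2
```

If the tables or rules are present on dr1 but absent on dr2, that would be consistent with the root cause suggested above.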
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
What happened: Running minikube clusters connected via submariner, using the kvm2 driver (each cluster is a VM). After the host running the minikube VMs is suspended and resumed, the submariner gateway connection is broken, showing:
Same output when running on the other cluster (dr2).
The connection between the clusters does not heal itself.
What you expected to happen: The connection between the clusters should handle errors gracefully and heal itself after errors.
How to reproduce it (as minimally and precisely as possible): Start 3 minikube clusters:
Deploy the broker on the hub:
Connect clusters dr1 and dr2 to the broker:
Wait until all deployments in submariner-operator namespace are rolled out.
Wait until subctl show all returns exit code 0 - all connections are ok.
Test connectivity - I deployed nginx on both clusters, exported the service, accessed it from the other cluster, and deleted the deployment.
Suspend the host running the VMs.
Wait 35 minutes (waiting 1 minute did not reproduce the issue).
Wake up the host.
Run subctl show all or subctl diagnose all again - showing the errors above.
Anything else we need to know?:
@aswinsuryan suggested deleting the gateway pods:
This did not change anything; subctl show all still shows an error:
Delete the gateway pod on the other cluster:
Now subctl shows that the hosts are connected again:
And the connectivity test is working again.
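The workaround described here (delete the gateway pods on both clusters, then check subctl show all) can be sketched as a small script. This is only an illustration under assumptions: the helper names `restart_gateways` and `wait_until` are hypothetical, the label selector `app=submariner-gateway` is assumed and should be verified against the actual pods, and the kubectl context names are assumed to match the cluster names. The namespace is the submariner-operator namespace mentioned in the reproduction steps.

```shell
# Sketch only: the label selector and context names are assumptions.
GATEWAY_SELECTOR="${GATEWAY_SELECTOR:-app=submariner-gateway}"
NAMESPACE="${NAMESPACE:-submariner-operator}"

# Delete the gateway pods in every given kubectl context; the controller
# that owns the pods is expected to recreate them.
restart_gateways() {
    for ctx in "$@"; do
        kubectl --context "$ctx" -n "$NAMESPACE" delete pod -l "$GATEWAY_SELECTOR" || return 1
    done
}

# Retry a command until it succeeds or the timeout (in seconds) expires.
# Returns 0 on success and 1 on timeout.
wait_until() {
    timeout=$1
    shift
    deadline=$(( $(date +%s) + timeout ))
    until "$@"; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep "${POLL_INTERVAL:-5}"
    done
}

# Example (not run here):
#   restart_gateways dr1 dr2 && wait_until 300 subctl show all
```

Restarting both sides in one step reflects what was observed in this report: deleting the gateway pod on only one cluster did not restore the connection.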
Environment:
Diagnose information (use subctl diagnose all): (see above)
Gather information (use subctl gather): submariner-20230621183950.tar.gz
Cloud provider or hardware configuration: Fedora 37
Install tools: subctl v0.15.1