Closed aleks-v-k closed 8 years ago
Wow, thanks for the PR, @robbrockbank would you be the best person to review this? Also, we might have to ask you (@aleks-v-k) to accept a CLA (although I'm not 100% on that since this is our fork of bird rather than a calico component itself).
Thanks @fasaxc, I have never seen the BIRD project before, so may be I have done something not right. In this case, please tell me where I'm wrong, I will try to rework the fix. It is really needed for our project (kuberdock), because of the mentioned issue with reboot.
Some additional info.
Before applying the patch: route table in bird daemon always contains wrong interfaces for subnets which must be routed via tunnel interface (look at this via birdcl
). It is independent are real routes correct or not. Reconfiguration of bird did not repair routes (birdcl configure
). Reloading of kernel protocol repairs broken routes (birdcl reload kernel1
).
After applying the patch: route table in bird daemon always contain correct routes (with tunnel interface for particular subnets). Rebooting host, reconfiguration of bird daemon, reloading of kernel protocol do not break the routes.
@aleks-v-k We've now got our CLA bot attached to this repo. Would you be able to sign that CLA?
@fasaxc Ok, signed. There are points about third-party work in my contributions in CLA. In this PR I included 2 backported patches from upstream BIRD, is this ok?
@aleks-v-k Yes, I think the license of the upstream code is clear.
@neiljerram I think you said you might get a chance to look at this? One thought: should we sync our master with upstream 1.6 anyway and then incorporate the patch commit?
Signing myself up for notifications: I'm really keen that we can get this fixed.
I wonder if it would be better for us to rebase onto 1.6 than backport fixes. Opinions?
I had a quick glance, and there weren't too many conflicts (but enough that it'll need care). I'm happy to do the rebase if people would prefer that route.
I think we should do both. Taking this PR as is seems like the less risky first step. Then second step can be bringing everything up onto 1.6.
On Tue, Sep 20, 2016 at 6:57 PM Matt Dupre notifications@github.com wrote:
I wonder if it would be better for us to rebase onto 1.6 than backport fixes. Opinions?
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/projectcalico/calico-bird/pull/21#issuecomment-248380584, or mute the thread https://github.com/notifications/unsubscribe-auth/AB_hL1EVMq0yPvlIiiI2fddf7E36SdD7ks5qsB54gaJpZM4JmPNU .
@neiljerram As far as I understand, 'new' points to a route in some global static structure, so setting a->iface changes a global route and it is not transient. If we leave it as it was, then in a global routes table there will be wrong interface (not the same that will be send to kernel). So we will have that situation which I'd tried to fix by this patch. Note, please, that this project (BIRD) is pretty new to me, and it is very possible that I misunderstood something. I think you have pretty much understanding of BIRD internals, so if you have some suggestions for another solutions, I'll be very glad to see it.
@neiljerram Finally I've implemented and tested suggested approach. It works, PR was updated. Recheck it, please.
Replaced by https://github.com/projectcalico/calico-bird/pull/25.
In this patchset I tried to fix the issue https://github.com/projectcalico/calico-containers/issues/947 - after kubernetes node rebooting there were broken routes. Firstly, I had backported two bugfixes in kernel protocol from upstream project, those bugfixes did not resolve the issue, so I fixed the issue on top of that backported fixes. I've tested it only with kubernetes - the problem is gone.