sraustein closed 8 years ago
note that this is cache0.sea.rg.net, the supposedly stable system, not cache0.vmini
Trac comment by randy on 2014-01-29T03:38:13Z
So this is a router claiming that it got a withdrawal for a route it didn't have, then blowing the session away.
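To make that failure mode concrete, here is a minimal sketch of the router-side check that would produce such a report. It is not the code of any particular router or of rpki.net; `Roa`, `ProtocolError`, and `apply_prefix_pdu` are hypothetical names, and AS 64496 is a documentation placeholder because the log does not show the origin AS. The only protocol fact relied on is that RFC 6810 assigns error code 6 to "Withdrawal of Unknown Record".

```python
from dataclasses import dataclass

# RFC 6810 error code for "Withdrawal of Unknown Record".
WITHDRAWAL_OF_UNKNOWN_RECORD = 6


@dataclass(frozen=True)
class Roa:
    """Hypothetical record for one prefix PDU payload."""
    prefix: str
    min_len: int
    max_len: int
    asn: int


class ProtocolError(Exception):
    """Hypothetical exception carrying an rpki-rtr error code."""

    def __init__(self, code, text):
        super().__init__(text)
        self.code = code


def apply_prefix_pdu(table, roa, announce):
    """Apply one announce/withdraw to the router's local table (a set)."""
    if announce:
        table.add(roa)
    elif roa in table:
        table.remove(roa)
    else:
        # The router has no such record: report error 6 to the cache.
        raise ProtocolError(
            WITHDRAWAL_OF_UNKNOWN_RECORD,
            f"Prefix {roa.prefix}/{roa.min_len}-{roa.max_len} "
            "withdrawn but not in database",
        )


if __name__ == "__main__":
    table = set()
    roa = Roa("94.229.80.0", 20, 20, 64496)  # ASN is a placeholder
    try:
        apply_prefix_pdu(table, roa, announce=False)
    except ProtocolError as err:
        print(err.code, err)
```

Either side can cause this: the cache may send a withdrawal for something it never announced in that session, or the router may have lost or mis-keyed a record it did receive.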
Which router, pulling from which cache, and do we have any particular reason to believe that this is a bug in the rpki-rtr implementation as opposed to, say, a bug in the router?
If the cache in question is cache0.vmini.rpki.net, it might be due to the restore from four-month-old backup. The software has no way of detecting that, so I doubt it reset the nonce.
If it's some other cache, I don't know what caused this.
Since we don't log every PDU we send from cache to router, there's no obvious way to debug this sort of thing after the fact. If it's repeatable, we can debug it, but if it's a one-time fluke we're stuck with using logic, weak reed though it might be.
Trac comment by sra on 2014-01-29T03:42:43Z
Which router
do not know
pulling from which cache
cache0.sea.rg.net
do we have any particular reason to believe that this is a bug in the rpki-rtr implementation as opposed to, say, a bug in the router?
no
Trac comment by randy on 2014-01-29T03:47:02Z
Which router
do not know
Log message you quoted says 119.2.96.6, which DNS says is ge0-1.br2.thimphu.druknet.bt.
Trac comment by sra on 2014-01-29T03:54:58Z
Which router
do not know
Log message you quoted says 119.2.96.6
so it was a trick question
which DNS says is ge0-1.br2.thimphu.druknet.bt.
a druknet client. i told them they could point routers at that cache until they got one of their own up.
Trac comment by randy on 2014-01-29T11:28:30Z
Which router
do not know
Log message you quoted says 119.2.96.6
so it was a trick question
Not really. Often you know what you did to provoke something. In this case you didn't, so I looked closely at what you'd posted and found useful bread crumbs.
which DNS says is ge0-1.br2.thimphu.druknet.bt.
a druknet client. i told them they could point routers at that cache until they got one of their own up.
Might ask what version of the router code they're running, then check with Keyur whether there are relevant known bugs.
We have not seen this message often, but both (?) times we have, it turned out to be bugs on the router side. Which proves nothing, but does suggest it's worth eliminating known bugs already fixed.
Trac comment by sra on 2014-01-29T14:23:27Z
check with Keyur whether there are relevant known bugs.
useless. keyur never thinks there are bugs
Trac comment by randy on 2014-01-29T21:28:57Z
Closed with resolution bug-in-somebody-elses-code
Jan 28 01:00:15 cache0 rtr-origin/server/tcp/119.2.96.6:64822[98899]: [error_report, error #6: 'Prefix 94.229.80.0/20-20 withdrawn but not in database']
Jan 28 01:00:15 cache0 rtr-origin/server/tcp/119.2.96.6:64822[98899]: [Shutting down due to reported fatal protocol error]
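The peer address that identified the router was sitting in the log tag all along. As a rough sketch, assuming the log format is exactly what is quoted in this ticket (the regular expression and the `parse_error_report` helper are illustrative, not part of rpki.net), the useful fields can be pulled out like this:

```python
import re

# Pattern inferred only from the log lines quoted in this ticket.
LOG_RE = re.compile(
    r"rtr-origin/server/tcp/(?P<peer>[0-9.]+):(?P<port>\d+)\[(?P<pid>\d+)\]: "
    r"\[error_report, error #(?P<code>\d+): '(?P<text>[^']*)'\]"
)


def parse_error_report(line):
    """Return peer, port, error code and text, or None if no error report."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return {
        "peer": m["peer"],
        "port": int(m["port"]),
        "code": int(m["code"]),
        "text": m["text"],
    }


if __name__ == "__main__":
    line = (
        "Jan 28 01:00:15 cache0 rtr-origin/server/tcp/119.2.96.6:64822[98899]: "
        "[error_report, error #6: 'Prefix 94.229.80.0/20-20 withdrawn "
        "but not in database']"
    )
    print(parse_error_report(line))
```

Passing the extracted `peer` to `socket.gethostbyaddr` would then do the reverse DNS lookup; that step is left out of the sketch because its result depends on live DNS.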
Trac ticket #679 component rtrorigin priority major, owner , created by randy on 2014-01-29T03:33:04Z, last modified 2016-08-05T15:46:52Z