Closed by yuhao-he 1 year ago
The interesting lines are:
Mar 24 17:40:52 routinator-west-01 routinator[918]: rsyncing from rsync://repository.lacnic.net/rpki/.
Mar 24 17:41:14 routinator-west-01 routinator[918]: rsync://repository.lacnic.net/rpki: successfully completed.
Mar 24 17:41:32 routinator-west-01 routinator[918]: RRDP hash mismatch in local file rsync://repository.lacnic.net/rpki/lacnic/f4bc5d7e-ea3a-4ab3-a8a5-5903eb4f1726/aa536e2112fb6fbd4fd90b644c6fd66acba797ea.crl.
Mar 24 17:52:19 routinator-west-01 routinator[918]: rsync://repository.lacnic.net/rpki/lacnic/48f083bb-f603-4893-9990-0284c04ceb85/fd25c9bb7e5cac7419fa9193770f64a6edf20c19.cer: file has wrong manifest hash.
Mar 24 17:52:19 routinator-west-01 routinator[918]: CA for rsync://repository.lacnic.net/rpki/lacnic/48f083bb-f603-4893-9990-0284c04ceb85/ rejected, resources marked as unsafe:
Mar 24 17:52:19 routinator-west-01 routinator[918]: 0.0.0.0/0
Mar 24 17:52:19 routinator-west-01 routinator[918]: ::/0
Mar 24 17:52:19 routinator-west-01 routinator[918]: AS0-AS4294967295
My interpretation is that the hash of one of the objects listed on the manifest was wrong, which leads to the manifest being rejected.
I saw the same dip with the RIPE validator, which also uses RRDP. I did not see it with rpki-client (which loads the rsync repository instead of RRDP and does not update as frequently due to its use of rsync).
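As a rough sketch of how the two decisive messages in the excerpt above could be picked out of a log automatically: the marker strings below are copied from these log lines, while the function name `find_rejections` is made up for illustration. This is plain substring matching on log text, not anything Routinator itself provides, and the message wording may differ between versions.

```python
# Message fragments copied from the log excerpt in this thread.
# Assumption: other Routinator versions may word these differently.
CRITICAL_MARKERS = (
    "file has wrong manifest hash",
    "rejected, resources marked as unsafe",
)


def find_rejections(lines):
    """Return the log lines that indicate a rejected file or CA."""
    hits = []
    for line in lines:
        if any(marker in line for marker in CRITICAL_MARKERS):
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py < mar-24.log
    for hit in find_rejections(sys.stdin):
        print(hit)
```

Note that the 'RRDP hash mismatch in local file' line is deliberately not among the markers.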
Hmmm, but isn't ‘RRDP hash mismatch in local file’ showing up in pretty much every round of validation? In other words, is there any way to know this hash mismatch is abnormal, while other hash mismatches are acceptable?
That’s not the line causing the issue; it just means Routinator will use a fresh snapshot instead. ‘File has wrong manifest hash’ means that the published file differs from the one referenced on the manifest, which causes the entire CA to be rejected.
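To illustrate the kind of check behind ‘file has wrong manifest hash’: an RPKI manifest lists each published file together with its SHA-256 hash, and a validator compares the hash of the file it fetched against that entry. A minimal sketch, assuming the manifest hash is available as a hex string; the function name `matches_manifest_hash` is hypothetical and this is not Routinator's actual code:

```python
import hashlib


def matches_manifest_hash(content: bytes, manifest_hash_hex: str) -> bool:
    """Check fetched file bytes against the SHA-256 hash listed on the manifest."""
    actual = hashlib.sha256(content).hexdigest()
    # Compare case-insensitively, since hex digests may be written in either case.
    return actual == manifest_hash_hex.lower()
```

If this comparison fails for any file on the manifest, the file published is not the one the CA signed off on, which is why the validator stops trusting that CA's objects.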
The upcoming 0.9 release of Routinator will fall back to using the last known good state of the CA if it is still valid and be more robust in this case (at the price of using significantly more disk space). It looks like it also fixes the ‘mismatch in local file’ issue.
Hi there, we've set up a few Routinator instances on our VMs, and they observed a drop of LACNIC VRPs (to zero) at around 17:00 UTC on Mar 24. Sample status and logs:
mar24-status.txt mar-24.log
I couldn't find anything special in the logs except a huge diff (‘Diff with 9 announced and 12033 withdrawn’). This is not the first time we have observed such drops. Is there any recommended approach to detect and handle such abnormal behavior? Or some indicators that the current cache is not complete?
Thanks!