Dragon Research Labs rpki.net RPKI toolkit

Omnibus list of proposed timer values #376

Open sraustein opened 11 years ago

sraustein commented 11 years ago

There are a bunch of timers (certificate lifetimes, CRL/manifest times, cron poll times) scattered through a bunch of tickets. This is an attempt to pull all of this together, so that we can come up with a coherent theory of how all these timers fit together (and link all the related tickets to this one...).

Certificate lifetimes are notAfter - notBefore.

CRL and manifest lifetimes are nextUpdate - thisUpdate.
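As a minimal sketch of the two definitions above (the function names and sample dates are illustrative, not taken from the toolkit), the lifetimes are plain differences between two timestamps:

```python
from datetime import datetime, timedelta


def certificate_lifetime(not_before: datetime, not_after: datetime) -> timedelta:
    """Certificate lifetime: notAfter - notBefore."""
    return not_after - not_before


def crl_or_manifest_lifetime(this_update: datetime, next_update: datetime) -> timedelta:
    """CRL or manifest lifetime: nextUpdate - thisUpdate."""
    return next_update - this_update


# Made-up example values, purely for illustration.
cert = certificate_lifetime(datetime(2012, 12, 3), datetime(2013, 12, 3))
crl = crl_or_manifest_lifetime(datetime(2012, 12, 3, 0, 0), datetime(2012, 12, 3, 6, 0))
print(cert, crl)
```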

RPKI::

BPKI::

Other::

Unrelated(?)::

Proposed values will follow, but first, what timers have I missed?

Trac ticket #361 component rpkid priority major, owner sra, created by sra on 2012-12-03T22:34:31Z, last modified 2013-07-18T15:33:06Z

sraustein commented 11 years ago

On Mon, Dec 03, 2012 at 10:34:31PM -0000, Trac Ticket System wrote:

> Proposed values will follow, but first, what timers have I missed?

Do you expect to need to do periodic key rollover?

Trac comment by melkins on 2012-12-03T22:42:30Z

sraustein commented 11 years ago

> Do you expect to need to do periodic key rollover?

Fair question. I think the answer is no, we don't anticipate doing clockwork key rolls, but I could be wrong, and even if not, we might want a placeholder (if we knew what it looked like) for a one-off key rollover planned for date X.

Trac comment by sra on 2012-12-03T22:48:26Z

sraustein commented 11 years ago

Not sure if this is applicable, but here are the timer values for the web portal:

Trac comment by melkins on 2012-12-03T22:51:24Z

sraustein commented 11 years ago

So, nibbling away at a few corners of this:

Trac comment by sra on 2012-12-04T12:30:29Z

sraustein commented 11 years ago

CRL lifetimes of an hour will cause pulls from the CA to be pretty frequent.

Trac comment by randy on 2012-12-07T03:38:35Z

sraustein commented 11 years ago

(In #390) I see no stale CRLs for ca0 at the moment.

Routine appearance of stale CRLs and manifests is probably a symptom of a mismatch between settings of various rpkid knobs we've never really bothered to tune, in particular the CRL regeneration interval vs the internal cron cycle: if the CRL lifetime is, e.g., half the cron cycle, then on average the CRLs will be stale half the time. There's another ticket (#361) about timer settings, but so far we haven't gotten much further than listing all the known knobs.
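The "stale half the time" arithmetic can be sketched as follows. This is a simplified model, not rpkid's actual scheduling: it assumes a CRL is regenerated only on cron ticks and is valid for exactly its lifetime from the tick that issued it. The function name is hypothetical.

```python
def stale_fraction(crl_lifetime: float, cron_period: float) -> float:
    """Fraction of wall-clock time during which the published CRL is stale.

    Assumed model: a fresh CRL is issued at every cron tick and expires
    crl_lifetime seconds later. If the lifetime covers the whole cron
    period, the CRL is never stale; otherwise it is stale for the
    remainder of each period.
    """
    if cron_period <= 0:
        raise ValueError("cron_period must be positive")
    if crl_lifetime < 0:
        raise ValueError("crl_lifetime must not be negative")
    return max(0.0, 1.0 - crl_lifetime / cron_period)


print(stale_fraction(1800, 3600))  # lifetime is half the cron cycle: 0.5
print(stale_fraction(3600, 1800))  # lifetime is twice the cron cycle: 0.0
```

Under this model the CRL lifetime has to be at least as long as the cron cycle for stale CRLs to disappear entirely.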

Trac comment by sra on 2013-03-19T17:24:58Z

sraustein commented 11 years ago

Some time has gone by, and we now have some new experience with rcynic's experimental "rsync-early" option.

So I think it may be time to try setting CRL/manifest update cycle to an hour and rpkid cron cycle to 30 minutes.

Except that this will play hell with workshops, where people want instant gratification, so we may need to figure out how we're going to address that before messing with anything. Workshops have a few other weird properties: a limited set of TALs and faster timers for everything. Maybe we want an extra package, at least for Ubuntu, which includes workshop tweaks to the normal environment, so that workshop machines can just install the extra package and everything will work?

Trac comment by sra on 2013-06-18T17:11:41Z

sraustein commented 11 years ago

Whence come some of the timer values:

RPKI CA certificate lifetime:: rpki.irdb.Child.valid_until

RPKI ROA EE certificate lifetime:: rpki.rpkid.ca_detail_obj.latest_ca_cert.getNotAfter()

RPKI manifest EE certificate lifetime:: rpki.rpkid.ca_detail_obj.latest_ca_cert.getNotAfter()

RPKI manifest lifetime:: now + rpki.left_right.self_elt.crl_interval

RPKI CRL lifetime:: now + rpki.left_right.self_elt.crl_interval

BPKI CA certificate lifetime:: now + rpki.irdb.models.ca_certificate_lifetime

BPKI EE certificate lifetime:: now + rpki.irdb.models.ee_certificate_lifetime

BPKI CRL lifetime:: now + rpki.irdb.models.crl_interval

rpkid cron cycle period:: rpki.rpkid.rpkid.cron_period (rpki.conf rpkid::cron-period)

Trac comment by sra on 2013-06-19T19:34:55Z

sraustein commented 11 years ago

rpki.left_right.self_elt.crl_interval in turn comes from rpki.irdb.zookeeper.Zookeeper.synchronize_rpkid_one_ca_core(), which supplies the default value (currently six hours) if myrpki::self_crl_interval isn't set in rpki.conf.
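The fallback behaviour described above can be illustrated with a minimal sketch using Python's standard configparser. This is not the toolkit's own code: the helper name is hypothetical, and expressing the interval in seconds is an assumption. Only the section and option names (myrpki, self_crl_interval) and the six-hour default come from the description above.

```python
from configparser import ConfigParser

# Six hours; storing the value in seconds is an assumption of this sketch.
DEFAULT_CRL_INTERVAL = 6 * 60 * 60


def get_crl_interval(cfg: ConfigParser) -> int:
    """Return myrpki::self_crl_interval, or the default when it is not set."""
    return cfg.getint("myrpki", "self_crl_interval", fallback=DEFAULT_CRL_INTERVAL)


# Option present: the configured value wins.
configured = ConfigParser()
configured.read_string("[myrpki]\nself_crl_interval = 3600\n")
print(get_crl_interval(configured))

# Option absent: the six-hour default applies.
unset = ConfigParser()
unset.read_string("[myrpki]\n")
print(get_crl_interval(unset))
```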

rpki.irdb.models.Child.valid_until comes from IRDB data.

The rpki.irdb.models.*_certificate_lifetime values cannot currently be set from rpki.conf, but fixing that would be trivial.

Most of the RPKI certificate expiration is keyed to the resource expiration date in the //parent's// IRDB: that is, the CA certificate is issued by the parent, and the EE certificates just copy the CA certificate's notAfter value. Absent some (as yet undiscovered) reason to want EE certificates to expire early, this makes sense: child wants certificates to last as long as parent will allow. This does mean that child has to reissue when parent reissues, but that's automatic, or is supposed to be. I suppose there might be pathological cases where the parent uses a ridiculously near term expiration date and the child wants to issue certificates with a sane expiration date further in the future, so that if and when the parent renews at the last possible second, everything will still work rather than having some stuff break while the child finds out that the parent finally renewed. But this seems unnecessarily complex if we assume a sane parent. Perhaps a bad assumption, but let's not go looking for trouble.
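A rough sketch of the "EE certificates copy the CA certificate's notAfter" rule, and of the reissue that follows when the parent reissues. The class and function names are hypothetical and are not rpkid's data model:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Validity:
    not_before: datetime
    not_after: datetime


def ee_validity(now: datetime, ca: Validity) -> Validity:
    """EE certificate starts now and copies the CA certificate's notAfter."""
    return Validity(not_before=now, not_after=ca.not_after)


def needs_reissue(ee: Validity, ca: Validity) -> bool:
    """True when the CA certificate now carries a different notAfter,
    for example because the parent has reissued it."""
    return ee.not_after != ca.not_after


# Made-up dates, purely for illustration.
ca = Validity(datetime(2013, 1, 1), datetime(2014, 1, 1))
ee = ee_validity(datetime(2013, 6, 19), ca)
renewed_ca = Validity(datetime(2013, 12, 1), datetime(2015, 1, 1))
print(needs_reissue(ee, ca), needs_reissue(ee, renewed_ca))
```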

Trac comment by sra on 2013-06-19T21:05:41Z

sraustein commented 11 years ago

have you thought through the workshop issue?

Trac comment by randy on 2013-06-23T00:48:34Z

sraustein commented 11 years ago

> have you thought through the workshop issue?

Somewhat, not entirely.

Assuming for purposes of discussion that we can restrict workshops to the Ubuntu package case: add third Ubuntu package that requires the other two, then whacks rpki.conf settings to force a faster cycle. Might also whack rcynic cycle time.

Simplifying assumption: if we go this route, we will not be hand-modifying rpki.conf and rcynic.conf on workshop VMs (that's kind of the whole point, after all), so it's OK for the workshop package to whack those files without worrying about existing content.

Keep in mind that we have a config generator tool, so whacking config files automatically is relatively straightforward.

Does this make sense?

Trac comment by sra on 2013-06-23T01:04:33Z

sraustein commented 11 years ago

> add third Ubuntu package that requires the other two

uh, in the LIR/ISP case, only rpki-rp is installed, no -ca

> Simplifying assumption: if we go this route, we will not be hand-modifying rpki.conf and rcynic.conf on workshop VMs (that's kind of the whole point, after all), so it's OK for the workshop package to whack those files without worrying about existing content.

in current workshops, i explain that we're playing time dilation, and they hack cron, though not foo.conf

Trac comment by randy on 2013-06-23T01:10:09Z

sraustein commented 11 years ago

If we're just talking RP, it's almost not worth the trouble of a workshop package, although we could of course do it anyway if it'd help. rcynic poll frequency is the only one of these ten zillion timers that's really under RP control.

Trac comment by sra on 2013-06-23T01:13:20Z

sraustein commented 11 years ago

well, how quickly does ca push to pub point?

Trac comment by randy on 2013-06-23T01:14:34Z

sraustein commented 11 years ago

> well, how quickly does ca push to pub point?

As soon as you change anything, assuming everything is working correctly today.

Trac comment by sra on 2013-06-23T01:19:13Z

sraustein commented 11 years ago

so rcynic timing is the only knob workshops need. and that's in cron and tuning it is already in the lab slide deck. and it is a parm of which the user should be aware. though maybe not, if we don't want them to think they're important and crank it up in the real world.

Trac comment by randy on 2013-06-23T01:21:25Z

sraustein commented 11 years ago

> so rcynic timing is the only knob workshops need. and that's in cron and tuning it is already in the lab slide deck. and it is a parm of which the user should be aware. though maybe not, if we don't want them to think they're important and crank it up in the real world.

See, I told you that you would find a reason to like the only-rsync-on-stale-manifest option :)

Trac comment by sra on 2013-06-23T01:25:53Z

sraustein commented 11 years ago

:)

Trac comment by randy on 2013-06-23T01:33:58Z

sraustein commented 11 years ago

(In #347) Removing dependency on #361 as it's really a separate issue.

This ticket has drifted considerably from its original topic, and original report was a bit confused (report asked about BPKI certificates while email included in it was talking about RPKI certificates), but original question remains unanswered:

Should not the expiration checker give some advice about what the user can //do// about impending problems?

Trac comment by sra on 2013-07-18T15:26:07Z