example42 / puppi

Puppet module to manage applications deployments and servers local management
http://www.example42.com

use milliseconds in last_run #126

Closed wkalt closed 9 years ago

wkalt commented 9 years ago

The last_run fact has only second resolution, which can cause GC issues for large-scale users of PuppetDB.

One drawback of the deduplication PuppetDB does to conserve storage is that an orphan row is created every time two or more nodes that exclusively share a fact value update that value at the same time. These orphans are deleted by the periodic GC (its period is defined by gc-interval). Under normal circumstances, orphans are rare and the GC handles them fine. When last_run has second granularity, though, this is guaranteed to happen at least once every time more than 60 nodes check in per minute, because a minute contains only 60 distinct second-resolution values. When well over 60 nodes are checking in per minute, it can become problematic. Changing last_run to include milliseconds should eliminate the issue, since two nodes would then almost never share the same last_run value.
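To make the counting argument above concrete, here is a rough simulation sketch in Python. It is not PuppetDB code: the function name `shared_timestamps`, the uniform random check-in model, and the one-minute window are all assumptions made for illustration. It only counts how many timestamp values are reported by two or more nodes at a given resolution, a simplified stand-in for the condition under which orphan rows get created.

```python
import random
from collections import Counter


def shared_timestamps(num_nodes, resolution_ms, window_ms=60_000, seed=0):
    """Count distinct timestamp values reported by two or more nodes.

    Hypothetical model: every node checks in once, at a uniformly random
    millisecond inside a window of window_ms milliseconds.
    """
    rng = random.Random(seed)
    checkins = [rng.randrange(window_ms) for _ in range(num_nodes)]
    # Truncate each check-in time to the requested resolution
    # (1000 -> whole seconds, 1 -> milliseconds).
    truncated = Counter(t // resolution_ms for t in checkins)
    return sum(1 for count in truncated.values() if count > 1)


if __name__ == "__main__":
    for nodes in (30, 61, 600):
        per_second = shared_timestamps(nodes, resolution_ms=1000)
        per_ms = shared_timestamps(nodes, resolution_ms=1)
        print(f"{nodes} nodes: {per_second} shared second-resolution values, "
              f"{per_ms} shared millisecond-resolution values")
```

With 61 or more nodes in one minute, at least one second-resolution value must be shared, since there are only 60 possible values. At millisecond resolution there are 60,000 possible values, so sharing becomes much less likely, though not strictly impossible.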

See https://tickets.puppetlabs.com/browse/PDB-1124 for more discussion.

I have no idea whose code this may break, so caution is obviously warranted.

wkalt commented 9 years ago

It may also be worth noting that the same information is provided on the /nodes endpoint under the key "facts_timestamp".

alvagante commented 9 years ago

Thanks for the notice. This is an old fact, which is now useless, as you outlined. It is going to be kept for backwards compatibility.