puppetlabs/puppetdb

Centralized Puppet Storage
http://docs.puppetlabs.com/puppetdb
Apache License 2.0

(maint) Preserve random-hosts generated for initial-hosts-channel #3917

Closed · jpartlow closed 9 months ago

jpartlow commented 9 months ago

When benchmark starts, it generates a list of initial host-maps based on random selections from the catalog/factset/report sample data. This means that every time benchmark is run, a new ordering of catalogs and facts is pushed to PuppetDB, which can cause a great deal of initial load as catalogs and facts are replaced wholesale until PuppetDB has caught up to processing one entire simulated node interval.
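A minimal sketch of that initial random selection, assuming the sample data arrives as vectors; the function and key names here are illustrative, not benchmark's actual code:

```clojure
;; Build an initial host-map for host-n from random samples.
;; Hypothetical names: benchmark's real host-maps carry more state.
(defn random-host [n catalogs factsets reports]
  {:host    (str "host-" n)
   :catalog (rand-nth catalogs)
   :factset (rand-nth factsets)
   :report  (rand-nth reports)})
```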

This makes it impractical to stop and restart benchmark when working with large simulated runs.

Benchmark continues to massage these host-maps on each simulation loop, and uses a TempFileBuffer to communicate host-maps between the read and write channels. By default these are written to a generated tmp directory, and the transitional files exist only for as long as it takes the channels to pull them.
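Roughly, that handoff amounts to serializing each host-map to a file and passing a reference across the channel. A sketch under the assumption of nippy-style freeze/thaw (the real TempFileBuffer API may differ):

```clojure
(require '[clojure.java.io :as io]
         '[taoensso.nippy :as nippy])

;; Writer side: freeze a host-map to a temp file and hand back the path.
(defn buffer-host-map [dir host-map]
  (let [f (io/file dir (str (:host host-map) "-" (System/nanoTime)))]
    (nippy/freeze-to-file f host-map)
    f))

;; Reader side: thaw the host-map and remove the transitional file.
(defn take-host-map [^java.io.File f]
  (let [host-map (nippy/thaw-from-file f)]
    (io/delete-file f)
    host-map))
```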

By setting the new --simulation-dir option, benchmark will instead write every host-map to the given directory without removing any files.

At startup, when initializing hosts, any host-# present as a preserved host-map file will be read from the --simulation-dir rather than initialized randomly. Any host-# not found as a preserved host-map file will be initialized randomly as usual.
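Building on the sketches above, the startup decision might look like the following; `init-host` and the `:preserved-file` key are hypothetical stand-ins for the real code:

```clojure
;; Reuse a preserved host-map file when one exists in --simulation-dir,
;; otherwise fall back to random initialization. Note the preserved map
;; is not loaded here; only a reference to its file is kept.
(defn init-host [simulation-dir n catalogs factsets reports]
  (let [f (when simulation-dir
            (io/file simulation-dir (str "host-" n)))]
    (if (and f (.exists f))
      {:host (str "host-" n) :preserved-file f}
      (random-host n catalogs factsets reports))))
```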

The final host-map files have been renamed to be just host-# to allow for this indexing.

Preserved state is loaded in start-simulation-loop's async/pipeline-blocking transformation function, to avoid retaining a reference to the data from the random-hosts sequence.
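A sketch of that arrangement with core.async; `load-preserved` is an assumed helper, but the shape is the point: because the thaw happens inside the pipeline's transducer, the loaded data is never reachable from the head of the random-hosts sequence:

```clojure
(require '[clojure.core.async :as async])

;; Thaw the preserved file, if any, at the moment the host-map flows
;; through the pipeline, rather than during initialization.
(defn load-preserved [host]
  (if-let [f (:preserved-file host)]
    (merge (nippy/thaw-from-file f) (dissoc host :preserved-file))
    host))

;; n-threads host-maps are loaded concurrently; results land on out-ch
;; for the write channel to consume.
(defn start-simulation-loop* [n-threads out-ch in-ch]
  (async/pipeline-blocking n-threads out-ch (map load-preserved) in-ch))
```

This is the usual fix for head retention: holding a file path in the sequence is cheap, while holding the thawed host-map there would keep every loaded host alive for the life of the sequence.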

We then fill out the preserved maps where allowed by --facts, --reports, --catalogs, in case those weren't present during previous run(s).

And we remove content from the preserved maps to match only the requested --facts, --reports, --catalogs or --archive content, since the preserved state may include data not requested by the current run.
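A sketch of that reconciliation, assuming an `options` map keyed by the CLI flags; all names are illustrative:

```clojure
;; Trim a preserved host-map down to the data the current run requested,
;; then fill in anything a previous run never generated.
(defn reconcile-host-map [host-map options catalogs factsets reports]
  (let [wanted  (cond-> #{:host}
                  (:catalogs options) (conj :catalog)
                  (:facts options)    (conj :factset)
                  (:reports options)  (conj :report))
        trimmed (select-keys host-map wanted)]
    (cond-> trimmed
      (and (:catalogs options) (nil? (:catalog trimmed)))
      (assoc :catalog (rand-nth catalogs))

      (and (:facts options) (nil? (:factset trimmed)))
      (assoc :factset (rand-nth factsets))

      (and (:reports options) (nil? (:report trimmed)))
      (assoc :report (rand-nth reports)))))
```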

jpartlow commented 9 months ago

Doing some manual testing now.

jpartlow commented 9 months ago

My initial implementation loaded all the preserved host-map files during initialization, for simplicity in sorting out changes to things like --nummsgs and which --catalogs, --facts, --reports flags were set on re-run. But this blows up the heap for large runs (10,000 hosts of generate/realistic is about 4.6G of frozen host-map files, for example).

Rewrote this to defer loading of the preserved host-map to the simulation loop. This did get through 10,000 simulated hosts, but I think there's still a memory leak in it. Will look at it some more tomorrow.

Also need to fix up the tests again.

jpartlow commented 9 months ago

The latest version defers loading of the preserved host-map file to the simulation loop, outside of the sequence generated by random-hosts, and appears to resolve the OOM issue.

I was able to run 20,000 nodes on my standard 8/16G test host with this. There might still be a timing or load issue with benchmark, though: while I've got 20,000 preserved files after that run, I'm only seeing 19,981 catalogs in PuppetDB. I don't know whether this is a pre-existing problem.

jpartlow commented 9 months ago

This patch worked to restart twin benchmarks pushing 50,000 hosts each.