puppetlabs / puppetdb

Centralized Puppet Storage
http://docs.puppetlabs.com/puppetdb
Apache License 2.0

Facts end point slower with PuppetDB 8 (>4) #4001

Closed traylenator closed 1 month ago

traylenator commented 1 month ago

Versions:

We have been stuck on PuppetDB 4 for a long time due to https://puppet.atlassian.net/browse/PDB-4830

To recap, take one of the more complicated queries we have in production:

# query.pp
$_hostgroups = ['alpha', 'beta', 'gamma', 'delta', 'epsilon', 'zeta', 'eta' , 'theta', 'iota', 'kappa']
$_hostgroup_query = $_hostgroups.map | $_hg | { "hostgroup_0=${_hg}" }.join(' or ')
$_facts      = ['hostgroup', 'os', 'service_name', 'networking', 'ec2_metadata', 'agent_specified_environment']

$_foo = query_facts($_hostgroup_query,$_facts)

notify{"FOO ${_foo}":}

It was previously suggested to use the inventory endpoint instead, so here is the above query rewritten against it:

# inventory.pp
$_facts      = ['hostgroup', 'os', 'service_name', 'networking', 'ec2_metadata', 'agent_specified_environment']
$_factpaths = $_facts.map | $_f | { "facts.${_f}" }

$_hostgroups = ['alpha', 'beta', 'gamma', 'delta', 'epsilon', 'zeta', 'eta' , 'theta', 'iota', 'kappa']
$_query = $_hostgroups.map | $_hg | { "facts.hostgroup_0 = \"${_hg}\"" }.join(' or ')

$_result = puppetdb_query("inventory${_factpaths}{${_query}}")

notify{"Result ${_result}":}
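
For reference, a minimal sketch of the PQL string that inventory.pp is meant to produce, built with an explicit join instead of relying on how an array is rendered inside a string. The names `$_fields` and `$_pql` are introduced here purely for illustration, and the lists are shortened to two entries each.

```puppet
# pql_sketch.pp (illustrative only; $_fields and $_pql are hypothetical names)
$_facts     = ['hostgroup', 'os']
$_factpaths = $_facts.map | $_f | { "facts.${_f}" }

# Join the projection fields explicitly so the result does not depend on
# array-to-string conversion.
$_fields = $_factpaths.join(', ')

$_hostgroups = ['alpha', 'beta']
$_query = $_hostgroups.map | $_hg | { "facts.hostgroup_0 = \"${_hg}\"" }.join(' or ')

$_pql = "inventory[${_fields}] { ${_query} }"
# $_pql is now:
#   inventory[facts.hostgroup, facts.os] { facts.hostgroup_0 = "alpha" or facts.hostgroup_0 = "beta" }

notify{"PQL ${_pql}":}
```

Printing the string first makes it easy to paste the same query into another PuppetDB client before passing it to puppetdb_query.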

So to the questions:

The pain of migrating is major: we have more than 100 calls to query_facts in our manifests, plus a load of external infrastructure that calls the facts endpoint, for instance to populate help desk information. Given that PDB4 does not have an inventory endpoint with the required functionality, any migration is very hard.

For info:

bastelfreak commented 1 month ago

Some thoughts:

traylenator commented 1 month ago

Just closing.

So the facts endpoint is slower to query with PuppetDB 8 than with 4, but it is faster to write to. It will remain like that.

The plan is to migrate to the inventory endpoint for all queries and avoid the query_facts and query_nodes functions.
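
As a rough sketch of what one such call-site change might look like, assuming a query_nodes call that only needs the matching certnames (the query itself is a made-up example, and `$_rows` and `$_nodes` are hypothetical names):

```puppet
# migrate_sketch.pp (illustrative only)
# Before, via the query_nodes function:
#   $_nodes = query_nodes('hostgroup_0=alpha')

# After, via the inventory endpoint. puppetdb_query returns an array of
# hashes, one per matching node, so the certnames are extracted with map.
$_rows  = puppetdb_query('inventory[certname] { facts.hostgroup_0 = "alpha" }')
$_nodes = $_rows.map | $_row | { $_row['certname'] }

notify{"Nodes ${_nodes}":}
```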

To facilitate the migration we will run PuppetDB 8 but redirect the /nodes and /facts endpoints to a PDB4 instance, which is kept up to date via double publication. Once all our calls to /nodes and /facts have been removed, we will drop the PDB4 instance.