Open vjeffrey opened 4 years ago
Currently, the chef-client UUID can be set to the habitat ring member ID, but this is opt-in only.
The client.rb template is here: https://github.com/chef/chef/blob/cd444f5dfd39e4494e0a8495b20b39259dab923a/habitat/config/client.rb#L17-L19 and the default config is https://github.com/chef/chef/blob/master/habitat/default.toml#L36
I am not sure why this setting is disabled by default, but if we can change it, we would be able to correlate services from the applications tab database with infra nodes.
It seems that effortless uses its own client.rb template, which uses the hab UUID if automate is enabled at all: https://github.com/chef/effortless/blob/e181a3dd3473fd9233f34d6b36ed2dfd06cc3d3c/scaffolding-chef-infra/lib/linux/client-chunk.rb#L13-L17
For inspec runs, we used to use the chef_guid file, but I think that was pulled out, and I don't know whether we have added it back in yet. So the inspec-infra UUID link exists when the audit cookbook is used, but not for inspec runs.
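To illustrate the correlation this would enable, here is a minimal sketch, assuming the chef-client UUID and the habitat ring member ID are the same value. The `Service` and `InfraNode` types and their field names are hypothetical and are not the actual applications or infra schemas.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Service:
    # Hypothetical field: the habitat ring member ID reported for the service.
    member_id: str
    name: str


@dataclass(frozen=True)
class InfraNode:
    # Hypothetical field: the chef-client UUID reported for the node.
    client_uuid: str
    node_name: str


def correlate(
    services: Iterable[Service], nodes: Iterable[InfraNode]
) -> List[Tuple[Service, InfraNode]]:
    """Pair each service with the infra node that shares its ID, if any."""
    nodes_by_uuid = {node.client_uuid: node for node in nodes}
    return [
        (svc, nodes_by_uuid[svc.member_id])
        for svc in services
        if svc.member_id in nodes_by_uuid
    ]


pairs = correlate(
    [Service("abc-123", "web"), Service("zzz-999", "db")],
    [InfraNode("abc-123", "node-1")],
)
```

Only the `web` service pairs up here; `db` has no node with a matching ID, which is the situation we are in whenever the opt-in setting is left off.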
For the infra schema, we currently have the source_id, region_id, and account_id or tenant_id buried in the ohai data. Does it make sense to pull these up to a top-level field (or object)? That would make querying the infra data set easier in the future.
The ingest service already stores the instance id and region on the node object, so once we have the account id on there, that should take care of things (that's done over here).
~Or are you talking pulling it into something else/at a diff level?~
I see what you mean now, b/c we have ohai data for the azure nodes too -- YES!
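A rough sketch of what pulling those identifiers up to a top-level object could look like. The ohai key names used below (`ec2`, `instance_id`, `region`, `account_id`, `azure`, `vm_id`, `location`, `tenant_id`) are assumptions for illustration and may not match the real ohai payload, and `cloud_identity` is a hypothetical helper, not something in the ingest service.

```python
from typing import Any, Dict, Optional


def cloud_identity(ohai: Dict[str, Any]) -> Optional[Dict[str, Optional[str]]]:
    """Extract a flat cloud identity object from nested ohai data.

    Returns None when the node does not look like an AWS or Azure node.
    All key names are assumed, not taken from the real ohai schema.
    """
    ec2 = ohai.get("ec2")
    if ec2:
        return {
            "source": "aws",
            "source_id": ec2.get("instance_id"),
            "region_id": ec2.get("region"),
            "account_id": ec2.get("account_id"),
        }
    azure = ohai.get("azure")
    if azure:
        return {
            "source": "azure",
            "source_id": azure.get("vm_id"),
            "region_id": azure.get("location"),
            # For azure the tenant id plays the role of the account id.
            "account_id": azure.get("tenant_id"),
        }
    return None


aws_identity = cloud_identity(
    {"ec2": {"instance_id": "i-0abc", "region": "us-east-1", "account_id": "123456789012"}}
)
```

With a flat object like this stored next to the node, a query can filter on `account_id` directly instead of digging through the ohai blob.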
follow up issues to create:
User Story
There are a few different ways a node can get added to the nodes postgres table (this is the data that serves the api/v0/nodes/search API), such as going to compliance/scan-jobs/nodes/add and entering information about some nodes, or going through settings/node-integrations/add.
A user recently reported that when they created an aws-ec2 integration, they ended up with duplicate node records under the nodes/search API. Since chef-client was running on those nodes, we already had a record of them, but we failed to recognize those nodes as the same object when ingesting information.

The nodes table has two different unique constraints: source_id, source_region, and account_id. Those fields, when referencing a node in AWS, correlate to the instance-id of the node, the region in which the node exists (e.g. us-east-1), and the id of the AWS account in which it exists. For Azure, this is the node id, region, and tenant id.

This specific user problem can be traced to a missing data field on the message sent from the client run ingestion path to the nodemanager. We are only sending the instance id and region, with no account id. The work to add the account id (and ensure we're storing all the correct information when ingesting the node data) is in progress.
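The failure mode can be sketched with a toy in-memory table keyed on (source_id, source_region, account_id). This is only an illustration under that assumption; `NodeKey` and `NodeTable` are hypothetical names and do not reflect the actual nodemanager code or postgres schema.

```python
from typing import Dict, NamedTuple, Optional


class NodeKey(NamedTuple):
    source_id: str              # e.g. the aws instance-id
    source_region: str          # e.g. us-east-1
    account_id: Optional[str]   # aws account id (tenant id for azure)


class NodeTable:
    """Toy stand-in for a table with a uniqueness rule on NodeKey."""

    def __init__(self) -> None:
        self.rows: Dict[NodeKey, str] = {}

    def upsert(self, key: NodeKey, origin: str) -> bool:
        """Insert or update a row; return True if a new row was created."""
        created = key not in self.rows
        self.rows[key] = origin
        return created


# Today: the client run path sends no account id, so the keys differ.
broken = NodeTable()
broken.upsert(NodeKey("i-0abc", "us-east-1", None), "client-run")
duplicate_created = broken.upsert(
    NodeKey("i-0abc", "us-east-1", "123456789012"), "aws-ec2 integration"
)

# With the account id sent on both paths, the second write matches the first.
fixed = NodeTable()
fixed.upsert(NodeKey("i-0abc", "us-east-1", "123456789012"), "client-run")
second_created = fixed.upsert(
    NodeKey("i-0abc", "us-east-1", "123456789012"), "aws-ec2 integration"
)
```

In the first case the table ends up with two rows for the same instance; in the second it ends up with one, which is the behavior the in-progress account id work is aiming for.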
But there are other cross-points where the duplicate node object problem still exists. @kmacgugan wrote a thing about this
It's important that we address these duplicate node situations as much as possible for the one node view epic. As part of that epic, we'll be exposing all of those node records in the UI, which will make any duplicate node object issues more apparent.
Definition of Done