elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Cloud Security] initializeCspIndices error handling #153348

Open opauloh opened 1 year ago

opauloh commented 1 year ago

After the recent changes to the cloud_security_posture transform plugin, the initializeCspIndices function looks like this:

export const initializeCspIndices = async (esClient: ElasticsearchClient, logger: Logger) => {
  await Promise.allSettled([
    createPipelineIfNotExists(esClient, scorePipelineIngestConfig, logger),
    createPipelineIfNotExists(esClient, latestFindingsPipelineIngestConfig, logger),
  ]);

  const [
    createFindingsLatestIndexPromise,
    createVulnerabilitiesLatestIndexPromise,
    createBenchmarkScoreIndexPromise,
  ] = await Promise.allSettled([
    createLatestIndex(esClient, logger, latestIndexConfigs.findings),
    createLatestIndex(esClient, logger, latestIndexConfigs.vulnerabilities),
    createBenchmarkScoreIndex(esClient, logger),
  ]);

  if (createFindingsLatestIndexPromise.status === 'rejected') {
    logger.error(createFindingsLatestIndexPromise.reason);
  }
  if (createVulnerabilitiesLatestIndexPromise.status === 'rejected') {
    logger.error(createVulnerabilitiesLatestIndexPromise.reason);
  }
  if (createBenchmarkScoreIndexPromise.status === 'rejected') {
    logger.error(createBenchmarkScoreIndexPromise.reason);
  }
};

Now that we use Promise.allSettled, we can handle each error independently. Should we also handle the rejected state of the createPipelineIfNotExists calls, whose results are currently discarded? Are the subsequent functions affected by the creation of the pipelines in some way?
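To make the first question concrete, here is a minimal sketch of one way the rejected results of both Promise.allSettled calls could be logged through a single helper. The names MinimalLogger, logRejected, and initializeWithLogging are hypothetical and not part of the plugin, and the sketch assumes createPipelineIfNotExists can reject at all, which the snippet above does not show.

```typescript
// Minimal stand-in for the Kibana Logger; only the one method used here (assumption).
interface MinimalLogger {
  error(message: string): void;
}

// Hypothetical helper: logs every rejected result with its label
// and returns the number of rejections found.
function logRejected(
  results: PromiseSettledResult<unknown>[],
  labels: string[],
  logger: MinimalLogger
): number {
  let rejected = 0;
  results.forEach((result, i) => {
    if (result.status === 'rejected') {
      rejected += 1;
      const reason =
        result.reason instanceof Error ? result.reason.message : String(result.reason);
      logger.error(`${labels[i]} failed: ${reason}`);
    }
  });
  return rejected;
}

// Hypothetical usage: the task arrays stand in for the real
// createPipelineIfNotExists / createLatestIndex calls.
async function initializeWithLogging(
  pipelineTasks: Promise<unknown>[],
  indexTasks: Promise<unknown>[],
  logger: MinimalLogger
): Promise<number> {
  const pipelineResults = await Promise.allSettled(pipelineTasks);
  const pipelineFailures = logRejected(
    pipelineResults,
    pipelineTasks.map((_, i) => `pipeline #${i}`),
    logger
  );

  const indexResults = await Promise.allSettled(indexTasks);
  const indexFailures = logRejected(
    indexResults,
    indexTasks.map((_, i) => `index #${i}`),
    logger
  );

  return pipelineFailures + indexFailures;
}
```

With a helper like this, the three repeated `status === 'rejected'` checks collapse into one call per step, and pipeline failures are no longer silently dropped. Whether index creation should be skipped when a pipeline fails is a separate decision that this sketch does not make.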

Also, should we start collecting telemetry data on these errors, to learn more about the circumstances that cause them and how we can improve error handling in the future?

elasticmachine commented 1 year ago

Pinging @elastic/kibana-cloud-security-posture (Team:Cloud Security)

CohenIdo commented 1 year ago

Hey @opauloh, regarding having telemetry data on those errors: all our logs are automatically sent to a centralized environment; here you can find a dashboard that contains our logs.

Regarding the error handling, I agree: if one index fails, it shouldn't impact the remaining indices.

opauloh commented 1 year ago

Hey @opauloh, regarding having telemetry data on those errors: all our logs are automatically sent to a centralized environment; here you can find a dashboard that contains our logs.

That's great @CohenIdo, thanks