chef / automate

Chef Automate provides a full suite of enterprise capabilities for maintaining continuous visibility into application, infrastructure, and security automation.
https://automate.chef.io/
Apache License 2.0
227 stars 113 forks source link

ServiceNow inspec notifications are missing profile names and have incorrect impact values #3994

Closed satellite15 closed 4 years ago

satellite15 commented 4 years ago

Describe the bug

ServiceNow inspec scan notifications are broken and no longer showing in the ServiceNow incident app. The cause of this appears to be changes in the notification received by the notification service. I have noticed at least 2 issues.

  1. The profile "name" is an empty string. This should be the name of the profile e.g. linux-baseline
  2. The control "impact" is 0 for each control. This should be a value between 0.1 and 1
  3. Totals are incorrect for critical and critical_failed

The impact of each is

  1. The service now app uses the profile name in the process of creating inspec records. The name needs to be populated correctly. Records can not be created without it. Customers with the app installed will not receive inspec scan records in the incident app until this is resolved.

  2. The impact set to 0 in each control means they will be dropped form the notification to be sent to service now by the notification formatter. Anything with an impact < 0.1 is pruned from the notification. The detail of failed inspec scans will never be sent to ServiceNow.

  3. Incorrect totals.

notification.log This log file shows the notification before it is formatted for service now. Note the profile name is "", also impact for each control is 0. Some of these controls are critical controls and should not have a 0 impact.

It looks like the issue is in the data coming into the notification-service and is not an issue with the notifications-service.

To Reproduce

Steps to reproduce the behavior:

Use requestbin or similar as an endpoint for notifications

The observed notification will be similar to

{
  "type": "compliance_failure",
  "total_number_of_tests": 54,
  "total_number_of_skipped_tests": 1,
  "total_number_of_passed_tests": 26,
  "total_number_of_failed_tests": 27,
  "timestamp_utc": "2020-06-26T11:49:42.000000Z",
  "number_of_failed_critical_tests": 0,
  "number_of_critical_tests": 0,
  "node_uuid": "6e55ae35-908e-4841-9b47-c525408d2c66",
  "node_name": "ubuntu-172.18.2.120",
  "inspec_version": "4.18.114",
  "failure_snippet": "InSpec found a control failure on [ubuntu-172.18.2.120](https://ip-172-18-2-200.eu-west-1.compute.internal/compliance/reporting/nodes/6e55ae35-908e-4841-9b47-c525408d2c66)",
  "failed_critical_profiles": [],
  "end_time_utc": "2020-06-26T11:49:42.000000Z",
  "automate_fqdn": "ip-172-18-2-200.eu-west-1.compute.internal",
  "automate_failure_url": "https://ip-172-18-2-200.eu-west-1.compute.internal/compliance/reporting/nodes/6e55ae35-908e-4841-9b47-c525408d2c66"
}

Note that there are no failed critical profiles. In the case of service now this should be populated with any profiles that have failed, minor, major and critical.

Expected behavior

Expect the any failed profiles to be in the failed_critical_profiles[] array Expect the name of each failed profile to be populated correctly Expect the impact of each control to be populated correct to indicate minor, major or critical control

Screenshots

If applicable, add screenshots to help explain your problem.

Versions (please complete the following information):

Additional context

Add any other context about the problem here.

event.log The event.log shows the an incoming event to the notifications service, profile name = "" and impact = 0.0 for all controls

Aha! Link: https://chef.aha.io/features/SH-2416

vjeffrey commented 4 years ago

im trying to look through the code and debug/fix this.

i noticed one thing right away: you're referring to a failed_critical_profiles field here, but i don't see that field in the notifications code. i only see failed_profiles: https://github.com/chef/automate/search?q=failed_critical_profiles&unscoped_q=failed_critical_profiles https://github.com/chef/automate/search?q=failed_profiles&unscoped_q=failed_profiles

vjeffrey commented 4 years ago

i'm wondering if the intent at some point was to copy the data in failed_profiles to failed_critical_profiles, or copy some subset of that data there, and it was never done? the failed profiles array does indeed have all the data

satellite15 commented 4 years ago

im trying to look through the code and debug/fix this.

i noticed one thing right away: you're referring to a failed_critical_profiles field here, but i don't see that field in the notifications code. i only see failed_profiles: https://github.com/chef/automate/search?q=failed_critical_profiles&unscoped_q=failed_critical_profiles https://github.com/chef/automate/search?q=failed_profiles&unscoped_q=failed_profiles

It is failed_profiles, that is the field from the raw notification that gets processed by the servicenow.compliance.ex formatter and becomes failed_critical_profiles in the notification generated by the notification-service

failed_critical_profiles: Utils.to_map(notification.failed_profiles)

Note that the name = "" and impact = 0 before the servicenow formatting occurs. I've already verified that. Formatting for service now excludes all the failed profiles because the impact = 0

vjeffrey commented 4 years ago

ah ok, 👍

vjeffrey commented 4 years ago

so, we've found the issue! we recently introduced some logic that allowed for report size to be reduced by stripping out profile info. unfortunately we were sending the report to notifications before we ever glue that data back together.

vjeffrey commented 4 years ago

https://github.com/chef/automate/compare/vj/fix-notif when you get some time, if you could test with the above branch, that would be awesome! according to my tests this fixes it, but i'm not testing with the full servicenow setup ill get a pr up soon. trying to think of tests...