usnistgov / OSCAL

Open Security Controls Assessment Language (OSCAL)
https://pages.nist.gov/OSCAL/

Exposing OSCAL data with OpenTelemetry #2039

Open gyliu513 opened 2 weeks ago

gyliu513 commented 2 weeks ago

User Story

As an OSCAL user, I want to expose all of the OSCAL data in OTEL format and view it in OTEL backends such as Grafana.

Goals

Enable OSCAL to embrace the OTLP protocol and expose its data to different platforms.

Dependencies

No response

Acceptance Criteria

(For reviewers: The wiki has guidance on code review and overall issue review for completeness.)

Revisions

No response

gyliu513 commented 2 weeks ago

Does anyone know if OSCAL has any plan to integrate with OpenTelemetry? Thanks!

iMichaela commented 2 weeks ago

Does anyone know if OSCAL has any plan to integrate with OpenTelemetry? Thanks!

@gyliu513 - At this time, NIST does not have a plan to support OTel, but if the community is interested in researching this topic, we can support it.

OpenTelemetry provides a common framework for collecting telemetry data and exporting it to an observability back end of your choice. It uses a set of standardized, vendor-agnostic APIs, SDKs, and tools for ingesting, transforming, and transporting data. Since the telemetry data consists of the logs, metrics and traces collected from a distributed system, I am assuming you are proposing OTel for the assessment results collection and insertion into the OSCAL Assessment Plans and/or POA&Ms?

gyliu513 commented 2 weeks ago

I am assuming you are proposing OTel for the assessment results collection and insertion into the OSCAL Assessment Plans and/or POA&Ms?

@iMichaela Yes, this is what I was hoping we could integrate. Any suggestions for this? Thanks

gyliu513 commented 2 weeks ago

@iMichaela do you know if there are any tools which can be used to collect OSCAL data automatically? Thanks

aj-stein-gsa commented 2 weeks ago

User Story

As an OSCAL user, I want to expose all of the OSCAL data in OTEL format and view it in OTEL backends such as Grafana.

This came up in a FedRAMP implementers meeting and sounds interesting (at least personally to me, I am one of those OTEL people in personal lab environments from time to time with Prometheus and Grafana). Do you have an idea of what kind of security information you would want to see and how it relates to security controls for a notional system?

@iMichaela do you know if there are any tools which can be used to collect OSCAL data automatically? Thanks

As someone who reviews a lot of tools and integrations, I have not seen any yet, but that is why I asked the previous question.

iMichaela commented 2 weeks ago

This came up in a FedRAMP implementers meeting and sounds interesting (at least personally to me, I am one of those OTEL people in personal lab environments from time to time with Prometheus and Grafana). Do you have an idea of what kind of security information you would want to see and how it relates to security controls for a notional system?

My assumption - per communication above - was that the intention is to collect logs, metrics, and traces/evidence required. It should match the information planned to be collected for control assessments, to satisfy the regulatory framework requirements.

@aj-stein-gsa - if you recall the ATARC pilot, I envision the need for providing inputs to guide the outputs. Personally I need to do more reading, but I am also very interested in researching it. It would be a great OSCAL research topic. I am going to raise it with CNCF OSCAL WGs as well.

gyliu513 commented 2 weeks ago

Thanks @aj-stein-gsa and @iMichaela for the discussion here, really helpful. Let me share a use case here:

Suppose I have a VM and use the otel collector to collect some metrics for it, such as VM name, CPU, and memory. I also want to get OSCAL assessment results for this VM, and then correlate the two data sets to show the customer an overview of this VM entity.

If we can also use otel to collect the OSCAL data, then we can probably define a unified data collection layer and a data correlation layer to handle this request.

Below is an example of otel metrics data and OSCAL data for a VM. Hope this helps.

An example OSCAL System Security Plan:

{
  "system-security-plan": {
    "metadata": {
      "title": "Virtual Machine System Security Plan",
      "last-modified": "2024-09-04T00:00:00Z",
      "version": "1.0",
      "oscal-version": "1.0.0"
    },
    "system-characteristics": {
      "system-name": "Example Virtual Machine",  // VM name here
      "system-description": "This is a virtual machine running critical applications.",
      "system-information": {
        "system-type": "Virtual Machine",
        "system-host": "VMware ESXi",
        "operating-system": "Ubuntu 22.04 LTS"
      }
    },
    "control-implementation": {
      "implemented-controls": [
        {
          "control-id": "AC-2",
          "description": "Implement access control for the VM.",
          "responsible-roles": ["VM Administrator"]
        },
        {
          "control-id": "SI-7",
          "description": "Ensure the integrity of VM's software and updates.",
          "responsible-roles": ["Security Officer"]
        }
      ]
    }
  }
}

Then I get the assessment results for my VM in OSCAL, as below:

{
  "assessment-results": {
    "metadata": {
      "title": "Virtual Machine Assessment Results",
      "last-modified": "2024-09-04T00:00:00Z",
      "version": "1.0",
      "oscal-version": "1.0.0"
    },
    "results": [
      {
        "control-id": "AC-2",
        "status": "satisfied",
        "findings": "User access control measures are in place and effective."
      },
      {
        "control-id": "SI-7",
        "status": "partially satisfied",
        "findings": "Software integrity checks are in place, but one outdated package was found."
      }
    ]
  }
}

And the OSCAL authorization decision (AD) is as follows:

{
  "authorization-decision": {
    "metadata": {
      "title": "Virtual Machine Authorization Decision",
      "last-modified": "2024-09-04T00:00:00Z",
      "version": "1.0",
      "oscal-version": "1.0.0"
    },
    "authorization-result": {
      "decision": "authorized with conditions",
      "description": "The VM is authorized for use, but the outdated package must be updated within 30 days.",
      "justification": "No critical vulnerabilities were identified, but some remediation is required."
    }
  }
}

Here is the data for the VM that I get from otel:

{
  "resourceMetrics": [
    {
      "resource": {
        "attributes": [
          {"key": "vm.name", "value": "Example Virtual Machine"},  // VM Name here
          {"key": "host.name", "value": "vm-host-01"},
          {"key": "os.type", "value": "linux"}
        ]
      },
      "scopeMetrics": [
        {
          "metrics": [
            {
              "name": "vm.cpu.usage",
              "description": "CPU usage of the VM",
              "unit": "percentage",
              "dataPoints": [
                {"timestamp": 1693804800, "value": 55.3}
              ]
            }
          ]
        }
      ]
    }
  ]
}

After correlation, the VM data will be as follows:

{
  "vm.name": "Example Virtual Machine",
  "oscal-controls": {
    "AC-2": {
      "status": "satisfied",
      "description": "User access control measures are correctly implemented."
    },
    "SI-7": {
      "status": "partially satisfied",
      "description": "Software integrity checks are in place, but one outdated package was found."
    }
  },
  "otel-metrics": {
    "cpu.usage": "55.3%",
    "memory.usage": "2GB",
    "network.throughput": "150Mbps"
  },
  "otel-traces": [
    {
      "trace-id": "1234567890abcdef",
      "span-id": "abcdef1234567890",
      "operation": "vm-login",
      "status": "ok",
      "start-time": "2024-09-04T10:00:00Z",
      "end-time": "2024-09-04T10:00:05Z"
    }
  ],
  "otel-logs": [
    {
      "timestamp": "2024-09-04T10:00:00Z",
      "log-level": "info",
      "message": "User admin logged into VM."
    }
  ]
}
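
The correlation step above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: it reuses the simplified (non-schema-valid) example shapes from this comment, joins on the VM name, and the function names `attributes_to_dict` and `correlate` are made up for this sketch.

```python
def attributes_to_dict(attributes):
    """Flatten a list of {"key": ..., "value": ...} pairs into a plain dict."""
    return {item["key"]: item["value"] for item in attributes}


def correlate(ssp, assessment, otel_export):
    """Join OSCAL results and otel metrics that describe the same VM name."""
    vm_name = ssp["system-security-plan"]["system-characteristics"]["system-name"]
    entity = {"vm.name": vm_name, "oscal-controls": {}, "otel-metrics": {}}

    for result in assessment["assessment-results"]["results"]:
        entity["oscal-controls"][result["control-id"]] = {
            "status": result["status"],
            "description": result["findings"],
        }

    for resource_metrics in otel_export["resourceMetrics"]:
        resource = attributes_to_dict(resource_metrics["resource"]["attributes"])
        if resource.get("vm.name") != vm_name:
            continue  # telemetry that belongs to a different entity
        for scope in resource_metrics["scopeMetrics"]:
            for metric in scope["metrics"]:
                # Keep only the most recent data point of each metric.
                entity["otel-metrics"][metric["name"]] = metric["dataPoints"][-1]["value"]
    return entity


# Trimmed copies of the example documents above (simplified, illustrative shapes).
ssp = {"system-security-plan": {"system-characteristics": {
    "system-name": "Example Virtual Machine"}}}
assessment = {"assessment-results": {"results": [
    {"control-id": "AC-2", "status": "satisfied",
     "findings": "User access control measures are in place and effective."},
    {"control-id": "SI-7", "status": "partially satisfied",
     "findings": "Software integrity checks are in place, but one outdated package was found."},
]}}
otel_export = {"resourceMetrics": [{
    "resource": {"attributes": [
        {"key": "vm.name", "value": "Example Virtual Machine"},
        {"key": "host.name", "value": "vm-host-01"},
    ]},
    "scopeMetrics": [{"metrics": [{
        "name": "vm.cpu.usage",
        "dataPoints": [{"timestamp": 1693804800, "value": 55.3}],
    }]}],
}]}

entity = correlate(ssp, assessment, otel_export)
print(entity)
```

Joining on a display name is fragile; a real correlation layer would more likely key on a stable identifier that both data sets share.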

gyliu513 commented 2 weeks ago

@iMichaela do you have some meeting notes or github links for CNCF OSCAL WGs? Thanks

aj-stein-gsa commented 2 weeks ago

Thanks @aj-stein-gsa and @iMichaela for the discussion here, really helpful. Let me share a use case here:

Suppose I have a VM and use the otel collector to collect some metrics for it, such as VM name, CPU, and memory. I also want to get OSCAL assessment results for this VM, and then correlate the two data sets to show the customer an overview of this VM entity.

If we can also use otel to collect the OSCAL data, then we can probably define a unified data collection layer and a data correlation layer to handle this request.

Neat, so you essentially want custom metrics to consume with an OTEL collector with perhaps a custom receiver?

gyliu513 commented 2 weeks ago

Neat, so you essentially want custom metrics to consume with an OTEL collector with perhaps a custom receiver?

@aj-stein-gsa Yes, but maybe not only a receiver but also a processor, as there may be some semantic conventions required in the processor. We probably need an oscalreceiver?
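
To make the receiver idea a bit more concrete, here is a rough Python sketch of the translation such a component would have to perform: turning per-control assessment results into gauge data points shaped loosely like OTLP JSON. Everything named here is an assumption for illustration: the metric name `oscal.control.satisfied`, the attribute keys `oscal.control.id` and `oscal.control.status`, the scope name, and the simplified input shape from the example earlier in this thread. A real receiver would be a Collector component, not a standalone script.

```python
import time


def string_attr(key, value):
    """Build one attribute in the OTLP JSON key/value style."""
    return {"key": key, "value": {"stringValue": value}}


def oscal_results_to_metrics(assessment, vm_name, now_ns=None):
    """Map simplified OSCAL-style results to one gauge with a point per control."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    data_points = []
    for result in assessment["assessment-results"]["results"]:
        data_points.append({
            "timeUnixNano": str(now_ns),
            # 1 only when the control is fully satisfied; the raw status is
            # kept as an attribute so no information is lost.
            "asInt": "1" if result["status"] == "satisfied" else "0",
            "attributes": [
                string_attr("oscal.control.id", result["control-id"]),          # hypothetical key
                string_attr("oscal.control.status", result["status"]),          # hypothetical key
            ],
        })
    return {"resourceMetrics": [{
        "resource": {"attributes": [string_attr("vm.name", vm_name)]},
        "scopeMetrics": [{
            "scope": {"name": "oscalreceiver-sketch"},  # hypothetical scope name
            "metrics": [{
                "name": "oscal.control.satisfied",      # hypothetical metric name
                "description": "1 if the control was assessed as satisfied, else 0",
                "unit": "1",
                "gauge": {"dataPoints": data_points},
            }],
        }],
    }]}


assessment = {"assessment-results": {"results": [
    {"control-id": "AC-2", "status": "satisfied"},
    {"control-id": "SI-7", "status": "partially satisfied"},
]}}
export = oscal_results_to_metrics(assessment, "Example Virtual Machine", now_ns=1693804800000000000)
points = export["resourceMetrics"][0]["scopeMetrics"][0]["metrics"][0]["gauge"]["dataPoints"]
print(points)
```

Putting `vm.name` on the resource is what would let a backend line these points up with the CPU and memory metrics collected for the same VM.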

iMichaela commented 2 weeks ago

@iMichaela do you have some meeting notes or github links for CNCF OSCAL WGs? Thanks

@jflowers leads the cncf/tag-security OSCAL Norms project. I raised the issue today and I believe there is a lot of interest.

iMichaela commented 2 weeks ago

Thanks @aj-stein-gsa and @iMichaela for the discussion here, really helpful. Let me share a use case here: Suppose I have a VM and use the otel collector to collect some metrics for it, such as VM name, CPU, and memory. I also want to get OSCAL assessment results for this VM, and then correlate the two data sets to show the customer an overview of this VM entity. If we can also use otel to collect the OSCAL data, then we can probably define a unified data collection layer and a data correlation layer to handle this request.

Neat, so you essentially want custom metrics to consume with an OTEL collector with perhaps a custom receiver?

I have a few comments on the data sample you provided, which might be fundamental to the problem. I'll put aside for the moment the incorrect OSCAL structure; I am only looking at the data you are trying to convey.

The system-security-plan/control-implementation/implemented-requirements/by-component/description here reads "Implement access control for the VM." This description MUST document how the control/requirement is implemented. HOW AC-2 is implemented is what needs to be assessed, in a way that adheres to the assessor's assessment plan and to the evidence required by the regulatory framework. So this information is needed, and the otel oscalreceiver and/or the processor will need to use it to know what to check, what to collect as evidence, and what to pass as input for the adjudication.

The assessment-results/results/assessment-log will collect logs, the assessment-results/results/observation will gather the relevant-evidence, date, collected, method, etc., the assessment-results/results/findings will link those to the relevant-observations and implementation-statement-uuid, etc. Finally, the assessment-results/results/attestation would need to capture the outcome of the assessment (AD data).
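
As a rough illustration of how collected telemetry could be wrapped into those structures, the Python sketch below builds one observation and one finding that points back to it. The key names are an assumption based on a reading of the OSCAL assessment-results model (note that the finding key is written as `related-observations` here) and should be validated against the official schema. The helper names, the evidence URL, and the statement id `ac-2_smt` are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def build_observation(description, evidence_href, collected):
    """Wrap one piece of collected evidence as an observation-like dict."""
    return {
        "uuid": str(uuid.uuid4()),
        "description": description,
        "methods": ["TEST"],  # evidence gathered by an automated check
        "collected": collected.isoformat(),
        "relevant-evidence": [{
            "href": evidence_href,
            "description": "Telemetry exported by the collector.",
        }],
    }


def build_finding(statement_id, satisfied, observations, implementation_statement_uuid):
    """Link observations to the implementation statement they assess."""
    return {
        "uuid": str(uuid.uuid4()),
        "title": f"Assessment of {statement_id}",
        "description": "Finding derived from collected telemetry.",
        "target": {
            "type": "statement-id",
            "target-id": statement_id,
            "status": {"state": "satisfied" if satisfied else "not-satisfied"},
        },
        "implementation-statement-uuid": implementation_statement_uuid,
        "related-observations": [
            {"observation-uuid": obs["uuid"]} for obs in observations
        ],
    }


observation = build_observation(
    description="Login events for the VM were exported as logs.",
    evidence_href="https://example.com/evidence/vm-login-logs",  # hypothetical URL
    collected=datetime(2024, 9, 4, 10, 0, tzinfo=timezone.utc),
)
finding = build_finding(
    statement_id="ac-2_smt",  # hypothetical statement id
    satisfied=True,
    observations=[observation],
    implementation_statement_uuid=str(uuid.uuid4()),  # would come from the SSP
)
print(finding["related-observations"])
```

The point of the sketch is the linkage: the finding carries no evidence itself, only references to the observations and to the implementation statement being assessed.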

In the example, you are providing otel metrics, but similar metrics might be already defined by different authorities (FedRAMP, CSA STAR, etc) and might need to be mapped or used as inputs...

I hope this is all doable, which is the reason for calling on CNCF experts. I personally like the idea very much, but it is not straightforward.

gyliu513 commented 2 weeks ago

The assessment-results/results/assessment-log will collect logs, the assessment-results/results/observation will gather the relevant-evidence, date, collected, method, etc., the assessment-results/results/findings will link those to the relevant-observations and implementation-statement-uuid, etc. Finally, the assessment-results/results/attestation would need to capture the outcome of the assessment (AD data).

Thanks @iMichaela. What I provided is just an example to clarify my use case; there may be some errors, but please ignore them. :)

In the example, you are providing otel metrics, but similar metrics might be already defined by different authorities (FedRAMP, CSA STAR, etc) and might need to be mapped or used as inputs...

This is a good point. Yes, we can get the same data from different sources; I think that is why we need semantic conventions and data correlation to mitigate those issues.
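
A minimal Python sketch of what a semantic convention buys here: once every source's field names are mapped onto one agreed set of names, records from different sources can be compared directly. The alias table below is purely hypothetical and only covers the names used in the examples in this thread.

```python
# Hypothetical alias table: source-specific names -> one agreed name.
ALIASES = {
    "system-name": "vm.name",   # name used in the OSCAL example
    "vm.name": "vm.name",       # name used in the otel example
    "hostname": "host.name",
    "host.name": "host.name",
}


def normalize(record):
    """Rename known keys to the agreed names; unknown keys pass through."""
    return {ALIASES.get(key, key): value for key, value in record.items()}


def same_entity(a, b, key="vm.name"):
    """Two records describe the same entity if the agreed key matches."""
    left, right = normalize(a), normalize(b)
    return key in left and left[key] == right.get(key)


oscal_record = {"system-name": "Example Virtual Machine", "control-id": "AC-2"}
otel_record = {"vm.name": "Example Virtual Machine", "hostname": "vm-host-01"}

print(normalize(otel_record))
print(same_entity(oscal_record, otel_record))
```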

iMichaela commented 2 weeks ago

This is a good point. Yes, we can get the same data from different sources; I think that is why we need semantic conventions and data correlation to mitigate those issues.

And most likely, enforced control metrics will need native support in OSCAL or the use of a registry of extensions; otherwise tools might not know how to extract the information and use it (pass it as input, use it for the final AD, etc.) automatically. Just to keep the records together, here is the CSA/cloud-audit-metrics project (their own JSON schema to align with the

ogijaoh commented 1 day ago

What is the proposed change to OSCAL? I am not clearly seeing the need to change OSCAL. This seems like an effort to develop a tool that will make use of assessment result data in OSCAL formats. Do these efforts belong in this repository, or should a separate repository be started to accomplish this?

OSCAL is a set of data structures. OpenTelemetry is a set of tools for measuring the performance and behavior of software. The way I am understanding the discussion, it seems the way forward is to work telemetry outputs into the software applications that consume/create OSCAL-structured data, not modify the OSCAL structures themselves.

edited for grammar

gyliu513 commented 23 hours ago

@ogijaoh Thanks for the comments, totally agree with you.

I can see you are working on https://github.com/defenseunicorns/lula, and it can generate machine-readable OSCAL artifacts. It seems this can be used as a source for generating OSCAL data, and we need to build an oscalreceiver to get that data. Comments? Thanks