oasis-tcs / csaf

OASIS CSAF TC: Supporting version control for Work Product artifacts developed by members of TC, including prose specifications and secondary artifacts like meeting minutes and productivity code
https://github.com/oasis-tcs/csaf

CSAF 2.0 Specification: Unclear scope & method for security sanitizing of CSAF documents #174

Closed bentolor closed 3 years ago

bentolor commented 3 years ago

Chapter 4 of the current prose draft for CSAF 2.0 mentions several security precautions for CSAF producer/consumers.

Overview

I see two major issues with them:

  1. Most of them refer exclusively to Markdown processing and to the concept of "formatted messages", which is not used anywhere else in the specification
  2. It's unclear how to handle HTML (or Markdown) content inside the JSON format

Markdown-only relevance

In my perception most of the requirements exclusively refer to embedded Markdown processing.

I did not find any mentions/references to Markdown processing, except in the definition for "formatted message". I also did not find any reference to "formatted", except those definitions.

Handling of HTML-relevant characters for CSAF JSON Producer

JSON only offers methods to quote JSON-relevant characters (like " or \) but not for the much larger set of HTML relevant characters. So there is no canonical approach to escape/quote HTML content inside a JSON file.
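To illustrate the point, here is a minimal Python sketch using the standard `json` module. The key `summary` and the sample text are illustrative assumptions, not taken from the CSAF schema.

```python
import json

# Hypothetical text value; "summary" is an illustrative key, not a CSAF field.
doc = {"summary": 'He said "hi" <script>alert(1)</script> & left'}

encoded = json.dumps(doc)

# JSON escaping only touches JSON-relevant characters such as " and \.
# HTML-relevant characters like <, > and & pass through unchanged.
print(encoded)

# Reading the document back yields exactly the original text.
decoded = json.loads(encoded)
```

So a perfectly valid JSON document can carry markup verbatim; nothing in the JSON layer neutralizes it.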

I'm a little puzzled about this whole requirement, as it implies assumptions about the platforms used by CSAF Consumers/Producers. My perception is that it overshoots the intended security target.

In order to cover all ways which could lead to malicious HTML content, one would have to cover all of the following characters:

    " ' & < > / ` =

As you can see, this covers some fairly common characters.

If this requirement is to be kept, I only see the following options to fulfill it. I find none of them acceptable.

  1. CSAF Producers silently and stoically filter out all the characters mentioned above from any JSON output they generate

  2. CSAF Producers replace those characters with their corresponding HTML entities when saving JSON:

    & → &amp;
    < → &lt;
    > → &gt;
    " → &quot;
    ' → &#39;
    / → &#x2F;
    ` → &#x60;
    = → &#x3D;
  3. A final option would be that CSAF Producers treat any occurrence of these characters as an "invalid document". Then the user could not use any of those characters anywhere. I don't think this variant is desirable either.
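The three options could be sketched roughly as follows in Python. The function names are hypothetical and only serve to make the behaviour of each option concrete; none of this is prescribed by the specification.

```python
# The eight characters listed in this issue and their HTML entity equivalents.
HTML_ENTITIES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
    "/": "&#x2F;",
    "`": "&#x60;",
    "=": "&#x3D;",
}


def strip_chars(text: str) -> str:
    """Option 1: silently drop the characters."""
    return "".join(ch for ch in text if ch not in HTML_ENTITIES)


def replace_chars(text: str) -> str:
    """Option 2: replace each character with its HTML entity."""
    # One pass over the input, so the "&" of a freshly inserted
    # entity is not escaped a second time.
    return "".join(HTML_ENTITIES.get(ch, ch) for ch in text)


def is_valid(text: str) -> bool:
    """Option 3: treat any occurrence as an invalid document."""
    return not any(ch in HTML_ENTITIES for ch in text)


# Even an ordinary sentence with a URL is affected by all three options.
sample = "Fixed in https://example.com/a?b=1 & later"
print(strip_chars(sample))
print(replace_chars(sample))
print(is_valid(sample))
```

Note how option 1 destroys the URL, option 2 makes the raw JSON hard to read, and option 3 rejects the sentence outright.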

All approaches in my opinion imply unwanted side effects:

More complexity: How would the HTML quoting approach handle existing &amp; occurrences? And what should happen when consuming CSAF documents? Should the JSON reader unquote HTML entities?
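The ambiguity can be shown with a small sketch. It assumes Python's standard `html.escape` / `html.unescape` as stand-ins for whatever quoting a producer or consumer might apply; the sample text is made up.

```python
import html

# Literal text the author intended, e.g. a note explaining HTML entities.
text = "Write &amp; to show an ampersand"

# A producer that escapes on save turns the existing "&amp;" into "&amp;amp;".
escaped = html.escape(text, quote=True)
print(escaped)

# A consumer that unescapes once recovers the text, but only if the
# producer really did escape.
print(html.unescape(escaped))

# If the producer did NOT escape, the same consumer silently alters the text.
altered = html.unescape(text)
print(altered)
```

Without an out-of-band signal, the consumer cannot tell which of the two cases it is looking at.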

Proposal

Due to the ambiguity and unclear handling I would propose to drop any HTML/Markdown sanitizing requirements for JSON producers.

Nota bene

Chapter 4 also seems to have some template leftovers, which I assume can be removed:

Remove this note before submitting for publication.

cc @tschmidtb51

sthagen commented 3 years ago

Thank you for providing the feedback @bentolor. This issue relates to #139

A quick response to the remark on leftovers: they are left in because publication targets submission to OASIS Technical Committee Administration within the TC process.

Given the broad spectrum of consumers, I doubt we can simply allow HTML as payloads in CSAF documents without any specific enforceable constraints. But I am not a producer of SAs myself, so maybe the other members can provide information on the producer requirements.

I participated in the discussions we had within the SARIF TC on a similar topic (SARIF is also a JSON format to convey complex information, with parts targeting human viewers). The strictness on the content derives directly from the expected broad distribution and the indirect consumer/producer relationships, and it tries to reduce the risk / warranty for the advisory consumers / producers as well as for the writers of (this) specification.

Example for similar security measures

... per the SARIF OASIS Standard describing the message object

https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html#_Toc34317459

Example for responsibility of a standardized specification

The first OASIS standard I helped bring to life as single editor, starting approx. 15 years ago, finally won its own CVE. So, IMO any such risk to the standards-producing body is a realistic topic:

https://nvd.nist.gov/vuln/detail/CVE-2020-13101 (cpe:2.3:a:oasis-open:oasis_digital_signature_services:1.0:*:*:*:*:*:*:*)

Vulnerability reports on standards are best avoided, because:

Example (citation from the public DSS-X TC Page):

Security Notice: CVE-2020-13101 - The DSS core 1.0 became OASIS standard in 2007. It defines an interface for signature creation and validation for different signature formats and supports multiple variants to transport the documents to be signed or verified. The combination of InlineXML-option (XML-payload within the DSS transport document) and a specially crafted XMLDSig allows an attacker to circumvent the non-repudiation property of the signature. The details regarding this problem are explained in detail in a short (presentation). The recommended mitigation is to move to DSS-X core 2.0. Alternatively, deny the use of the InlineXML option.

tschmidtb51 commented 3 years ago

As we don't yet have an idea how to reliably detect HTML and prevent a CSAF producer from emitting it, I suggest changing the requirement so that people can implement tools which can claim conformance against one of the targets in the standard. (Otherwise, any tool which can't 100% ensure that no HTML is emitted can't satisfy the conformance profile CSAF producer.)

The same problem basically also applies to SQL (you might want to save that in a database), Python (that's probably your parser), and any other programming language. Therefore, I suggest requiring the issuing party to use Markdown's code options.

This results in more requirements on the CSAF consumer side. They MUST NOT interpret any value as code and SHALL treat all data as untrustworthy user input.
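As a rough sketch of what "treat all data as untrustworthy user input" could mean on the consumer side, assuming a Python consumer that renders HTML and stores text in SQLite. The function names, the key `text` and the table `notes` are hypothetical, not part of CSAF.

```python
import html
import json
import sqlite3


def render_note(raw_json: str) -> str:
    """Escape at output time, in the context where the value is used."""
    doc = json.loads(raw_json)
    text = str(doc.get("text", ""))
    # The value is never interpreted as markup; it is escaped for HTML.
    return "<p>" + html.escape(text, quote=True) + "</p>"


def store_note(conn: sqlite3.Connection, text: str) -> None:
    """Use a parameterized query so the value is never parsed as SQL."""
    conn.execute("INSERT INTO notes (text) VALUES (?)", (text,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (text TEXT)")

hostile = "x'); DROP TABLE notes;--"
store_note(conn, hostile)

print(render_note('{"text": "<img src=x onerror=alert(1)>"}'))
print(conn.execute("SELECT text FROM notes").fetchall())
```

The design choice here is that the document itself stays unmodified and each sink (HTML view, database) applies its own context-specific protection.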

Please review #278 for the changes in detail.

tschmidtb51 commented 3 years ago

We probably need to add an optional test in #195 for that. Any ideas?