Azure / azure-sdk-tools

Tools repository leveraged by the Azure SDK team.

[Breaking Change] Add a mechanism for permanent suppressions (baselining) #7259

konrad-jamrozik opened this issue 10 months ago (status: Open)

konrad-jamrozik commented 10 months ago

Currently, there is no way to permanently suppress breaking change violations, as can be seen at: aka.ms/azsdk/pr-suppressions

This is a major gap that causes a lot of rework, and we want to close it by providing such a mechanism. One idea is to compare the tool's error output against a baseline text file checked into the repository: if every output error line has a match in the baseline file, the tool reports success.
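A minimal sketch of that baseline idea, assuming one error per output line and an exact-match comparison (the file name and helper are hypothetical, not the actual tooling):

```typescript
import * as fs from "fs";

// Hypothetical helper: treat the breaking-change tool output as one error per
// line and report success only if every error line appears in the checked-in
// baseline file.
function passesBaseline(errorLines: string[], baselinePath: string): boolean {
  const baseline = new Set(
    fs
      .readFileSync(baselinePath, "utf8")
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
  );
  // Any error line not present in the baseline is a new, unsuppressed violation.
  return errorLines.every((line) => baseline.has(line.trim()));
}
```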

Preliminary design considerations

There are several design decisions to make before we proceed with implementation. Some of the aspects we have identified so far:

Related work, docs and context

@weshaggard @mikeharder @heaths FYI

heaths commented 10 months ago

I don't know much about SARIF, but from what little I do know, it seems like a reasonable approach to consolidate error/warning output into a common format. That would make a lot of things easier, including posting to GitHub (probably converting to markdown tables or something, or even pinning to specific PR lines like some bots do), comparing against a suppressions file, etc.

Hashing the surrounding context is a good idea and would probably work for most cases. But what about using a JSONPath when the content is JSON? We do that for AutoRest in many cases, for example, and it rarely changes. I also imagine most suppressions would work fine with a JSONPath, e.g.,

suppressions:
  - from: foo.json
    where: $.definitions.Bar.prop_name
    code: 1234 # or maybe friendly code name e.g., UseCamelCase
    reason: need to match external schema
konrad-jamrozik commented 10 months ago

@heaths while it is by no means a rejected approach, the main issue with the AutoRest way is that so far it has been a pain to get right, causing a lot of rework. You can see our troubleshooting section for it:

and the most recent addition, which should show up in a few minutes:

weshaggard commented 10 months ago

I would also be careful with hashing line text, as that makes things more fragile in a number of cases and harder for people to get the suppressions correct. Instead of a hash, we should try to ensure the error message itself has enough context to suppress the correct instance.

michaelcfanning commented 9 months ago

Just jumping in here at Konrad's encouragement on some SARIF things. For sure, your output looks very SARIF-friendly: you have conceptual quality concerns, locations where those standards aren't met, user-facing reports, descriptive text for the standards enforced, etc. Using SARIF will unlock the ecosystem of consumers, which includes Visual Studio and VS Code, there are multiple React controls for rendering, and GitHub speaks SARIF natively for its GHAS, as does Microsoft's equivalent offering.

In re: JSON paths, SARIF has a construct called a logical location (as opposed to a physical location, such as a file on disk) that can be used to express this information. There are multiple SARIF-exporting tools at MS that use this data to record JSON paths for scan tools.
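As a rough illustration (written here as a TypeScript object literal, with invented values), a result carrying a JSON path as a logical location could be shaped roughly like this; only the overall shape follows the SARIF 2.1.0 result/location structure, and the kind value is just one plausible choice:

```typescript
// Sketch only: values are illustrative, but the shape follows SARIF 2.1.0
// (result -> locations -> physicalLocation / logicalLocations).
const exampleResult = {
  ruleId: "RemovedXmsEnum",
  level: "error",
  message: { text: "The new version is missing a 'x-ms-enum' found in the old version." },
  locations: [
    {
      physicalLocation: {
        artifactLocation: {
          uri: "specification/alertsmanagement/resource-manager/Microsoft.AlertsManagement/stable/2019-03-01/AlertsManagement.json",
        },
        region: { startLine: 34 },
      },
      logicalLocations: [
        {
          // The JSON path, expressed as a logical (rather than physical) location.
          fullyQualifiedName:
            "$.paths['/providers/Microsoft.AlertsManagement/operations'].get.parameters",
          kind: "property", // one of the JSON-oriented kinds SARIF suggests
        },
      ],
    },
  ],
};
```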

In re: suppressions, I think your current approach, to produce a JSON path that's specific to a finding, is a reasonable approach that's reasonably resilient. We have had some other JSON scenarios I'm aware of where a tool will elide some path information in an attempt to make the suppression less fragile. For example, a precise index into an array (assuming that this data isn't required to uniquely identify a problem).

The SARIF partialFingerprint property is intended to support arbitrary things that might contribute to suppressions. And SARIF itself is designed to allow for a sort of suppression 'sieve' where results might be matched using things like code snippets, falling back to line locations and/or other data if a more precise match fails. This result matching occurs in more of a baselining context, where you may have a baseline of well-known SARIF, generate a new run, and then diff the two. The SARIF SDK in NuGet has a fair amount of code to help with this.
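A much-simplified sketch of that baselining idea, assuming results are matched purely on partialFingerprints (the real result matching in the SARIF SDK is considerably more elaborate, with fallbacks to snippets and locations):

```typescript
// Hypothetical, simplified matching: a result is "already known" if any of its
// partialFingerprints values also appears somewhere in the baseline run.
interface SarifResult {
  ruleId: string;
  partialFingerprints?: Record<string, string>;
}

function newResults(baseline: SarifResult[], current: SarifResult[]): SarifResult[] {
  const known = new Set(
    baseline.flatMap((r) => Object.values(r.partialFingerprints ?? {}))
  );
  return current.filter(
    (r) => !Object.values(r.partialFingerprints ?? {}).some((fp) => known.has(fp))
  );
}
```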

Finally, it sounds like you're already concerned about the fragility of a per-line hash. The SARIF SDK has a rolling hash algorithm built into it that's intended to provide more stability across code churn, but this hash will be even more fragile than a single-line hash, so it's probably not of interest. It also has the undesirable characteristic of being opaque to the user, which @weshaggard warned against.

I have historically authored some tools that emitted a working suppression in the user-facing output, and this could be an option for you. A typical approach is to emit your finding to the user as a well-formed sentence, and then add another sentence describing the precise suppression data that can be cut-and-pasted elsewhere to suppress it. Per SARIF, consumers can always choose to show only your first sentence of output (when UX real estate is limited) and provide an expander or something else to get the additional content. The VS SARIF viewer uses this approach a lot. In console output, of course, the entire finding will be sent to output.

konrad-jamrozik commented 9 months ago

Note: there are additional "suppressions for SDK" proposals provided by @raych1 in the Word file titled SDK Breaking Change Review Workflow. Link here.

konrad-jamrozik commented 8 months ago

Bigger picture discussion about suppressions here:

konrad-jamrozik commented 8 months ago

I had a chat with @weshaggard about how to approach the first prototype of this work.

Consider this example breaking change log

{
    "type": "Result",
    "level": "Error",
    "message": "The new version is missing a 'x-ms-enum' found in the old version.",
    "code": "RemovedXmsEnum",
    "id": "1049",
    "docUrl": "[https://github.com/Azure/openapi-diff/tree/master/docs/rules/1049.md",](https://github.com/Azure/openapi-diff/tree/master/docs/rules/1049.md%22,)
    "time": "2024-01-01T11:48:40.019Z",
    "groupName": "stable",
    "extra": {
        "mode": "Removal"
    },
    "paths": [
        {
            "tag": "Old",
            "path": "[https://github.com/Azure/azure-rest-api-specs/blob/main/specification/alertsmanagement/resource-manager/Microsoft.AlertsManagement/stable/2019-03-01/AlertsManagement.json#L34:9"](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/alertsmanagement/resource-manager/Microsoft.AlertsManagement/stable/2019-03-01/AlertsManagement.json#L34:9%22)
        }
    ]
},

mentioning this spec element.

  "paths": {
    "/providers/Microsoft.AlertsManagement/operations": {
      "get": {
        "operationId": "Operations_List",
        "description": "List all operations available through Azure Alerts Management Resource Provider.",
        "parameters": [
          {
            "$ref": "#/parameters/api-version"
          }
        ],

For full context of this example data, see this comment.

For the first prototype, I will do the following:

  1. Emit, in the ADO build log of the breaking changes tasks, a new property for each message in the ---- Full list of messages ---- section. It will contain the suppression line, ready to be copy-pasted into the suppression file. For the provided example it could be, e.g.:

"snippetToApplySuppression" : "$.paths.['/providers/Microsoft.AlertsManagement/operations'].get.parameters"

The value will be a JSON path derived from the available information; we could probably pull it out of openapi-diff somehow. You can verify the JSON path at https://jsonpath.com/.

The ADO build log will also point to aka.ms link explaining how to apply these snippets, as described below.

  2. Add support for taking this snippetToApplySuppression and pasting it into the newly added suppressions file. That file will live beside the service README.md; in the provided example it would live in specification/alertsmanagement/resource-manager. It could be called suppressions.md. Its content will be YAML, with appropriate comments in it explaining its purpose.

  3. Add support for reading suppressions.md in the breaking change ADO tasks and excluding any new breaking change violation from triggering if it matches a JSON path listed in suppressions.md. An appropriate build log message should be added denoting that a breaking change alert was triggered but suppressed (see the sketch after this list).
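A rough sketch of what step 3 could look like, assuming the suppressions file is simply a YAML list of JSON paths (matching the snippetToApplySuppression values) parsed with js-yaml; the file format, library choice, and helper names are illustrative, not decided:

```typescript
import * as fs from "fs";
import * as yaml from "js-yaml";

// Hypothetical sketch of step 3: a violation is excluded if the JSON path the
// tool derived for it (the snippetToApplySuppression value) is listed in the
// suppressions file.
function filterSuppressed(
  violations: { code: string; jsonPath: string }[],
  suppressionsFile: string
): { code: string; jsonPath: string }[] {
  const suppressedPaths = new Set(
    yaml.load(fs.readFileSync(suppressionsFile, "utf8")) as string[]
  );
  return violations.filter((violation) => {
    if (suppressedPaths.has(violation.jsonPath)) {
      // Still surface the fact that the alert fired, but do not fail the check.
      console.log(`Suppressed: '${violation.code}' at ${violation.jsonPath}`);
      return false;
    }
    return true;
  });
}
```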

Design considerations

  1. The AutoRest README.md also has suppressions. They are interpreted by AutoRest and used to suppress LintDiff issues. As a result, we will have suppression logic in multiple places.
  2. We could make adding a suppression as simple as "append this line to suppressions.yml", but that means we won't have any structure in that file, maybe only comments. The more structure and metadata we add (e.g. a suppressions: key), the greater the risk that users will have issues adding suppressions.
  3. A fancier option would be to instead give users instructions on which comment to add. Our automation would read that comment and push an update to the PR adding the relevant suppression, or open a PR against the user's PR with the suppression.
  4. We also need to consider scenarios where users want the suppressions to be more general than the one specific element pointed to by the JSON path. This is especially relevant because an OpenAPI spec element that has the issue is often referenced in many places, and each such reference triggers the breaking change violation. I think we may currently run into an issue of this kind: a) the tooling reports the problem on the first reference, b) the user suppresses the issue, c) the tooling now reports the issue on the second reference, d) the user suppresses that, e) the report now appears on the third reference, and so on. This may happen (not sure yet) because the tooling has deduplication logic. See #7242.
  5. There is at least one more suppressions mechanism, for suppressing issues in the generated SDK languages like C#. See this comment.
  6. I should also investigate using SARIF, see this comment.
weshaggard commented 8 months ago

Let's not abuse .md files for these suppressions; let's make it a proper .yml file, so suppressions.yml. As for a suppression entry, you will likely also need the file path and the error code type along with the JSON expression. I do think it is OK if they have to copy multiple lines, as long as it is as simple as copy/paste into a file.

konrad-jamrozik commented 7 months ago

Per our chat with @mikeharder and @weshaggard, here is an example of the currently proposed format for suppressions.yml:

- tool: Breaking Change
  filePattern: ./specs/arm/KeyVault/**/
  jsonPath: ...
  code: ParamRemoval
  reason: matching actual impl.
- tool: TypeSpec requirement
  filePattern: ./specs/arm/KeyVault/**/
  jsonPath: ""
  code: ""
  reason: brownfield

The suppressions file to read is determined as follows: for a given API spec being evaluated, walk the directory tree until you find the first suppressions.yml file, and ignore the others. There is no merging logic across multiple files and no configuration cascade.
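A minimal sketch of that lookup rule, assuming the walk goes upward from the spec's directory toward the repository root (the helper and its signature are hypothetical):

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper: starting from the directory of the spec under
// evaluation, walk up the tree and return the first suppressions.yml found.
// Any suppressions.yml higher up is ignored; there is no merging.
function findSuppressionsFile(specPath: string, repoRoot: string): string | undefined {
  let dir = path.dirname(path.resolve(specPath));
  const root = path.resolve(repoRoot);
  while (dir.startsWith(root)) {
    const candidate = path.join(dir, "suppressions.yml");
    if (fs.existsSync(candidate)) {
      return candidate; // first match wins
    }
    const parent = path.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return undefined; // no suppressions file for this spec
}
```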

heaths commented 7 months ago

Just a thought: should we consider JMESPath instead? This is what az uses, so skills may be more transferable. Or do we also make use of JSONPath in other places in our ecosystem? They are both close and I don't particularly care either way; I was just pondering that service teams using az might be more familiar with JMESPath. There are libraries for using either in code.
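For comparison, the same selection from the AlertsManagement example written in both syntaxes, assuming the jsonpath-plus and jmespath npm packages (neither of which is a settled choice here):

```typescript
import { JSONPath } from "jsonpath-plus";
import { search } from "jmespath";

// The same element from the AlertsManagement example, selected in each syntax.
const jsonPathExpr =
  "$.paths['/providers/Microsoft.AlertsManagement/operations'].get.parameters";
const jmesPathExpr =
  'paths."/providers/Microsoft.AlertsManagement/operations".get.parameters';

declare const spec: any; // the parsed OpenAPI document

const viaJsonPath = JSONPath({ path: jsonPathExpr, json: spec }); // array of matches
const viaJmesPath = search(spec, jmesPathExpr); // matched value, or null if absent
```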

konrad-jamrozik commented 7 months ago

> Just a thought: should we consider JMESPath instead? This is what az uses, so skills may be more transferable. Or do we also make use of JSONPath in other places in our ecosystem? They are both close and I don't particularly care either way; I was just pondering that service teams using az might be more familiar with JMESPath. There are libraries for using either in code.

@heaths We use JSONPath for AutoRest suppressions, which power a few of our checks, and it is the primary way folks have been suppressing things so far:

https://eng.ms/docs/products/azure-developer-experience/design/specs-pr-guides/pr-suppressions#verify-your-where-path-with-jsonpath-and-ensure-proper-escaping

Still, we could consider supporting JMESPath in addition to JSONPath.

heaths commented 7 months ago

> Still, we could consider supporting JMESPath in addition to JSONPath.

Eh, I wouldn't bother. :smile: Thanks for noting you already use JSONPath. Certainly in this case that is more consistent. The two are close, and I think supporting both would just lead to confusion. I didn't realize you already used JSONPath, so I only suggested considering it since az did. Never mind. :smile:

konrad-jamrozik commented 4 months ago

This work will build on: