sarif-standard / sarif-spec-v1

The specification document for the Static Analysis Results Interchange Format (SARIF)
Other
56 stars 10 forks source link

Should we add a "priority" property to result? #243

Open ghost opened 8 years ago

ghost commented 8 years ago

We've certainly discussed this before; I don't remember why we didn't add it. It doesn't sound like something a tool would be able to populate; that's what severity is for. (The tool can't know the business priorities that would lead a team to decide "Let's fix these 5 bugs first".) But it's something that a downstream triage process might populate. So far we haven't added anything like that to the format. I suggest that we consider the whole question of "properties populated after the scan is over" for a future revision of SARIF.

@tikiwan @michaelcfanning

TikiWan commented 8 years ago

Here is my 2 cents:

I believe the level/default level is what SARIF used to capture priority/severity. If I am not mistaken, error = level 1, warning = level 2, note = level 3.

For code analysis tool, everything is "warning". It should be the compiler job to check "error".

Similar to compiler, not all warning are the same, that is why compiler warning are put into different level. Policy maker can ignore them and decide which warning people should address because they know their environment. However, I would say no people knows the check better than the person who wrote the check.

Nevertheless, tool owner can use the properties field in SARIF to achieve this.

ghost commented 8 years ago

TL;DR: I still think it's worth discussing whether SARIF should represent "priority" as a first-class property.

With regard to your second paragraph (priority vs severity): a single field can't capture both priority and severity because they are two different concepts. Severity means "how bad is the mistake", and priority means "how important is it for me to fix it". Even a very bad mistake might have a low priority if (for example) it is in dead code, or if it is in a rarely used feature, or if it is in a module that's going to be rewritten from scratch.

With regard to your third paragraph (warning vs error): if you are talking about general-purpose source code analyzers, you're probably right: analyzers should generally emit only warnings. But a domain-specific analyzer might very well include rules whose severity is "error". For instance, the Static Driver Verifier tool (SDV) can detect cases where a device driver misbehaves (for instance, fails to release a lock). Even though the code is syntactically valid C, a driver with that problem won't get the Windows logo, and it's appropriate for SDV to report it as an error.

With regard to your third paragraph (warning levels): What you're describing is a division of SARIF's "warning" level into "sub-levels" such as "Level 2 warning", "Level 3 warning", etc. You're right, SARIF doesn't directly support that concept, and you could use the result's property bag to hold it.

michaelcfanning commented 8 years ago

Besides the notion of severity and priority, there is also the notion of 'certainty', which is a measure of the degree of precision around a problem that's been detected. to date, 'priority' is arguably a result mgmt. concept and not one that should be hard-coded into tools.

Historically (particularly internally at Microsoft), tools have attempted to capture information about specific checks that relate to broader group context: e.g., what rules are most important to a group. Our experience suggests that this is a mistake, as updating this information requires servicing the tool itself (which is extremely burdensome when tooling is integrated in a key business process, such as a release build). Furthermore, there is typically more than a single view for a set of results (e.g. a team standard that's tracking against an immediate deadline, vs. a broader group goal to get clean on some quality standard).

If we agree that providing for some flexibility that allows stability of results production + flexibility in assessing the state/plan for those results at any moment in time, then a question arises around what is the core categorization that can/should be emitted by a tool. After canvassing numerous tools for the SARIF project, it became clear that nearly every tool affored some basic sorting around 'error', 'level' and a lower-priority/verbose level.

All of the above is to say that I believe we have the key concept ('level') in the core SARIF format. This is an appropriate and sufficient designation to be burned into a tool (aside from the possibility of adding something around precision of analysis, since the tool understands when it falls back to heuristics or less-than-highest confidence analysis). From here, the tool doesn't have sufficient context to bucket/plan for specific results. A tool doesn't know when it's analyzing code that doesn't ship (lowering its priority). A tool doesn't know that the team needs to release next week, and therefore the min-bar needs tightening. A tool isn't aware that a product is shipping internationally in its next release for this first time, raising the priority of globalization/localization/other culture-specific issues. Etc.

ghost commented 8 years ago

I agree that a tool has no business emitting "result.priority".

I'd like to discuss whether the format should support "result.priority", even if we discourage tools from emitting it. Then a post-processing tool could consult information such as a policy, or the amount of time remaining until the next release, or whether this is shipping code, or even information manually entered by a program manager, and create an output file that enhances the result object with a priority property.

That's what I meant at the top of this thread when I wrote that this is just one example of "properties populated after the scan is over", and I suggested discussing this for a future version of SARIF.

TikiWan commented 8 years ago

I agree with Michael. I don't think it is SA tool responsibility to output priority or severity, it is more or less responsibility of issue management system. I don't think we would like to make those properties first class citizen in SA log.

I will argue that if we perform post-processing, we should put those post-processing data into properties bag under some namespace, or nested SARIF.

Otherwise, if post-processing tool one say it is priority 1, and post-processing tool two say it is priority 2. The priority of the issue will be determined by which post-processing tool get run first.

If the SA tool itself doing the post-process, just alter the level.

BTW, I still think that our current property of "level" and "default level" is confusing. Instead of Error, Warning, Note, N/A, can we simply do level 1, 2, 3 and 4?

ghost commented 8 years ago

We all agree that the tool isn't responsible for "priority". But I don't think Michael said -- and I don't agree with the statement -- that the tool isn't responsible for "severity". That's what "level" is, and we chose the values "error" and "warning" to indicate a severity level. IIRC, the only reason we called the property "level" instead of "severity" was that we didn't like the idea of (for example) "not applicable" as a "severity".

In retrospect, this was probably a bad decision, because apparently "level" suggests to you that it should be a numeric value, which is understandable. If I had it to do over, I'd probably have called the property "severity".

In summary: Changing either the name "level" to severity, or the values "error", "warning"... to 1, 2, ..., would be a breaking change, in a central portion of the spec. I wouldn't be willing to take that change now.

That leaves us with the question of "priority". Your paragraph 4 suggests that you are thinking of "level" as a "priority" value, but it's not. "level" measures "how bad is the bug?". "priority", if it existed, would mean "how important is it to fix the bug?". As I said earlier, they are two different concepts, and an analysis tool doesn't have the context to answer the "priority" question.

Let's continue to ponder whether "priority", or any other property that would come from post-processing, deserves to be a first-class citizen. I'm not sure I find the "order of post-processing tools" argument a compelling argument against it.