Open michaelcfanning opened 4 years ago
Biggest problem is that our descriptor is written in HTML like this so we cannot provide it in TEXT and MD.
It's also nice if we can generate necessary classes/interfaces from JSON schema. But it's not mandatory; we can code classes manually.
Biggest problem is that our descriptor is written in HTML like this so we cannot provide it in TEXT and MD.
I remember that markdown can contain HTML elements, I'll directly put our descriptors into text.markdown
property as an experiment.
You can find the .sarif
report generated by the latest SpotBugs (not released yet) at here:
https://github.com/spotbugs/spotbugs/pull/1184#issuecomment-652959161
When you find problems, please come https://github.com/spotbugs/spotbugs/pull/1185 to comment, or mention me at here. Thanks!
I remember that markdown can contain HTML elements, I'll directly put our descriptors into text.markdown property as an experiment.
I want to confirm this really work or not. However, even when I put fullDescription.markdown
property, the SARIF viewer plugin for VS Code doesn't display it. How I can verify that the current JSON file is valid or not?
Here is viewers' screenshot from my local:
Here is the JSON data in tool.driver.rules
:
{"messageStrings":{"default":{"text":"{0} is Serializable; consider declaring a serialVersionUID"}},"id":"SE_NO_SERIALVERSIONID","shortDescription":{"text":"Class is Serializable, but doesn't define serialVersionUID"},"fullDescription":{"markdown":"\n\n <p> This class implements the <code>Serializable<\/code> interface, but does\n not define a <code>serialVersionUID<\/code> field. \n A change as simple as adding a reference to a .class object\n will add synthetic fields to the class,\n which will unfortunately change the implicit\n serialVersionUID (e.g., adding a reference to <code>String.class<\/code>\n will generate a static field <code>class$java$lang$String<\/code>).\n Also, different source code to bytecode compilers may use different\n naming conventions for synthetic variables generated for\n references to class objects or inner classes.\n To ensure interoperability of Serializable across versions,\n consider adding an explicit serialVersionUID.<\/p>\n\n "},"helpUri":"https://spotbugs.readthedocs.io/en/latest/bugDescriptions.html#SE_NO_SERIALVERSIONID","properties":{"tags":["BAD_PRACTICE"]}}
Or you may check the while main.sarif.txt. Thanks in advance!
What is the difference between tool.driver.notifications
and invocations.toolConfigurationNotifications
?
I'm using driver.notifications
to put errors and exceptions like below, is it OK?
{
"version":"2.1.0",
"$schema":"https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0-rtm.4.json",
"runs":[
{
"tool":{
"driver":{
"name":"SpotBugs",
"version":"Development",
"language":"en",
"notifications":[
{
"level":"error",
"descriptor":{
"id":"spotbugs-missing-classes"
},
"message":{
"text":"Classes needed for analysis were missing: [com.github.spotbugs.MissingClass]"
}
}
]
}
}
}
]
}
tool.driver.notifications
are a set of reporting descriptors that provide metadata on the notifications a component might raise.
invocations.toolConfigurationNotifications
and invocations.toolExecutionNotifications
are instances of those notifications that actually occurred at runtime. We call the analysis equivalents rules
and results
. It's confusing here because we use the same term for notification metadata as well as the notifications.
toolConfigurationNotifications
are errors that resolve to some sort of configuration problem. That is, they reflect a problem that can be resolved by the user properly configuring the tool (in your example, by ensuring that all classes relevant to analysis are present. toolExecutionNotifications
are errors in the engine itself (i.e., bugs that need to be addressed by the tool developer). You'd use this property to store an unhandled exception or other engine correctness problem.
OK thanks. I'm still not sure when I need to use tool.driver.notifications
, I know that toolConfigurationNotifications
is better place to put information about missing classes.
When you want to report an actual notification that occurred to a user, such as when classes are missing, use toolConfigurationNotifications
to send those messages. tool.driver.notifications
is a place to add metadata about each class of notification, for example, this is where you would store a full description of the notification.
@lgolding, you are an effective explainer of the standard, want to jump in here? :)
@KengoTODA, @michaelcfanning is correct (except that, looking at your file, I see that you have toolExecutionNotifications
, not toolConfigurationNotifications
.)
The SARIF Tutorials has a section on Notifications that might help. Here is an example (which I will add to the tutorial):
{
"version": "2.1.0",
"runs": [
{ // A run object.
"tool": {
...
},
"results": [
...
],
"invocations": [
{ // An invocation object
"toolExecutionNotifications": [
{ // A notification object.
"level": "error",
"descriptor": {
"id": "spotbugs-missing-classes"
},
"message": {
"text": "Classes needed for analysis were missing: [apply, iterator, ...]"
}
},
]
"toolConfigurationNotifications": [
{ // A notification object.
...
},
]
}
]
}
]
}
Now, the SARIF toolComponent
object does have a notification
property, but it doesn't hold notifications: it holds descriptions of (metadata about) notifications. So if, for example, you wanted to provide a full description of what the notification with id NOT10001
meant, you could put it in tool.driver.notifications
.
The property tool.driver.notifications
used to be called ...notificationDescriptors
. Towards the end of the standardization process, we went through an exercise of shortening many of the property names. In this case, it seems that the shortened name caused confusion.
@KengoTODA I added a Notifications sample to the SARIF Tutorials. I hope you find it helpful.
I'm going through your sample SARIF output in detail and will provide more feedback later today. Thank you for doing this!
Thanks. This is the latest example spotbugs.sarif, that was made for this PR. At least it passed the SARIF validator.
I will check your comments carefully, to try to understand which property I should use for this case.
Kengo, thank you for the latest spotbugs.sarif. It saved me a lot of work :-)
This file is almost valid according to the schema. The only error is that the location.logicalLocation
property is actually named logicalLocations
(plural). Note that its value is an array.
The spec (§3.28.4) explains the reason for this:
NOTE:
logicalLocations
is an array because some logical locations can be expressed in more than one way. For example, the logical location of an element in an HTML document might be expressed by an XML Path expression such as/html/body/img[1]
or by a CSS selector such as#logo
.
After I fixed that, the latest version of the SARIF validator tool (which we will publish very soon) finds only two problems:
The $schema
property should have the value https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0-rtm.5.json
, which is the final official version. The only difference between -rtm.4
and -rtm.5
was to correct a typo in the name of one property that you are not using.
Rule messages should be expressed in complete sentences, so they should end with a period (for example, "{0} may expose internal representation by returning {1}."
I will now look by hand to see if I can make any more recommendations.
Hi @KengoTODA, here are some more comments on the SpotBugs SARIF output.
NOTE: Many of the things that I mention here, we will enhance our validator to report automatically.
Do you plan to include codeFlows
in your output? spotbugs/discuss#69 mentioned potential improvements to SpotBugs output related to code flows, and SARIF's code flow feature would make it easier to do.
In your previous sample output, you included tool notifications, including exception
objects. That was very helpful. Do you plan to put them back?
Your logicalLocations.fullyQualifiedName
(for example, edu/umd/cs/findbugs/workflow/RejarClassesForAnalysis.java:[line 220]
) is actually a physical location: a file name plus a line number. You already have that information in physicalLocation
. The spec defines "logical location" like this:
logical location
location specified by reference to a programmatic construct, without specifying the artifact within which that construct occurs
Example: A class name, a method name, a namespace.
The idea of logicalLocation.fullyQualifiedName
is to have (for example) a namespace + a class + a method name, like System.Collections.Generic.Dictionary.Keys
. In that case, you might use Keys
as the name
property, and the whole string as the fullyQualifiedName
property.
The spec describes a rule id as a "stable", "opaque" identifier (see §3.49.3):
In the case of a rule,
id
SHALL contain a stable identifier for the rule and SHOULD be opaque.
By "opaque" the spec means a purely symbolic name, like SPOT1001
. We suggest you assign ids such as these to your rules, and place your existing human-readable rule names in the name
property:
"tool": {
"driver": {
...
"rules": [
{
"id": "SPOT1001",
"name": "EI_EXPOSE_REP",
...
}
]
}
}
The property names in originalUriBaseIds
are numbers. The spec (§3.3.4) does say that this is legal:
The
uriBaseId
property can be any string; it does not need to have any particular syntax or follow any particular naming convention.
However, it is helpful to use a symbolic name that is meaningful to the reader. In examples, we frequently use names like SOURCE_ROOT
, BIN_ROOT
, or REPO_ROOT
.
Since tool.driver.notifications
is an empty array, you can omit it.
In a region
object, if endLine
is the same as startLine
, you don't have to mention endLine
(it defaults to startLine
).
We recommend enclosing "dynamic content" (the placeholders like {0}
in your message strings) in single quotes. The quotes make the variable content easier to spot, and we've found single quotes give a nice clean appearance. To be clear, the spec does not require this, but we did add a rule to our SARIF validator tool to recommend it.
@KengoTODA, I want to say a little more about SARIF "opaque" rule ids. When the spec says that a rule id must be "opaque", it does not mean that it must be incomprehensible to human readers, like "SPOT1001". It just means that a generic SARIF consumer -- that is, a consumer with no special knowledge of the tool -- is not allowed to assume anything about the internal structure of the identifier. For example, a generic SARIF consumer is not allowed to assume that because the rule id starts with "EI_", it has to do with "exposing internal representation" (as I see from the documentation of EI_EXPOSE_REP, EI_EXPOSE_REP2, etc.)
SARIF opaque rule identifiers should be short (to take up less UI space) and search-engine-friendly (to make it easy for users to learn more about the rule). Your strings like EI_EXPOSE_REP are fine in those regards. I should not have encouraged you to invent additional "purely symbolic" identifiers like "SPOT1001".
If your rules have additional, human-readable names, you could put them in the rule's name
property, for example:
"rules": [
{
"id": "EI_EXPOSE_REP",
"name": "ExposesInternalRepresentation",
@michaelcfanning FYI
@KengoTODA, we have now published the latest SARIF validator, which you can use to assess your SARIF output. To use the validator:
Install the .NET Core SDK 3.1: https://dotnet.microsoft.com/download/dotnet-core/3.1
This adds the .NET Core command line tool dotnet
to your path.
In a new terminal window, run the command:
dotnet tool install --global Sarif.Multitool
This adds the SARIF Multitool command line tool sarif
to your path.
In a new terminal window, run the validate
command on your SpotBugs SARIF output:
sarif validate SpotBugs.sarif
This produces a list of results on the console. You can save those results as a SARIF log file:
sarif validate SpotBugs.sarif --output SpotBugs-validation.sarif
The SARIF validator checks for:
We continue to add new rules to the validator and to refine existing rules. To get the latest version:
dotnet tool update --global Sarif.Multitool
Hope this helps!
Thanks, I've tried it. It seems that it needs not SDK 3.1 but SDK 2.1, so I installed it from https://dotnet.microsoft.com/download/dotnet-core/2.1
Here is the log which was reported when I run sarif
tool with SDK 3.1:
$ sarif validate Desktop/spotbugs.sarif
It was not possible to find any compatible framework version
The framework 'Microsoft.NETCore.App', version '2.1.0' was not found.
- The following frameworks were found:
3.1.6 at [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]
You can resolve the problem by installing the specified framework and/or SDK.
The specified framework can be found at:
- https://aka.ms/dotnet-core-applaunch?framework=Microsoft.NETCore.App&framework_version=2.1.0&arch=x64&rid=osx.10.15-x64
Sorry for my lazy reply, I'll check your comments one by one.
Do you plan to include codeFlows in your output? spotbugs/discuss#69 mentioned potential improvements to SpotBugs output related to code flows, and SARIF's code flow feature would make it easier to do.
Issued as https://github.com/spotbugs/spotbugs/issues/1241
In your previous sample output, you included tool notifications, including exception objects. That was very helpful. Do you plan to put them back?
It should exist even in the current version (SpotBugs 4.1.1). I need to improve the sample output to cover features in the format.
Trying to generate POJO from JSON schema. It marks fields in MessageStrings
object 'additional properties' so the generated JSON file keeps nothing. So we cannot keep the message template with placeholder explained as 3.49.11. Here is generated Java code:
https://gist.github.com/KengoTODA/357e483984c8ed5442da7ed4b7331907
Still not sure where this problem comes from.
@michaelcfanning , i think we can close this, since we were able to solve the issues in the sarif version from spotbugs. What do you think?
@eddynaka in my case, in which project I should make an issue? is it https://github.com/oasis-tcs/sarif-spec/ ?
@KengoTODA I thought this was the issue that we closed in spotbugs
e.g., can we help provide a schema-driven Java OM?
https://github.com/spotbugs/discuss/issues/95