spotbugs / discuss

SpotBugs mailing list

Produce SpotBugs results in SARIF v2.1.0 format #95

Closed ghost closed 4 years ago

ghost commented 4 years ago

Now that SARIF v2.1.0 is an OASIS Standard, there's an internal Microsoft initiative to integrate SpotBugs into our SARIF-driven static analysis ecosystem. We'd like to do it in a way that benefits the entire community.

@michaelcfanning and I (@lgolding) are the co-editors of the SARIF specification, and we're happy to help in whatever way you think is most appropriate. We've integrated SARIF into open source tools such as Cake and ESLint, as well as Microsoft tools such as the C++, C#, and VB compilers.

There are various approaches to adding SARIF support:

In addition to enabling SpotBugs to participate in SARIF-driven ecosystems, SARIF output could alleviate certain issues in the SpotBugs XML output format:

KengoTODA commented 4 years ago

At the moment, the report generation process isn't pluggable. A workaround is to generate your report from the XML report, which is parser-friendly.

To solve known issues, it's technically possible to make this process pluggable, but it's stateful now so it needs some time.
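
A rough sketch of that XML workaround, using only the JDK's DOM parser. The element and attribute names (`BugInstance`, `SourceLine`, `type`, `category`, `start`, `sourcepath`) are assumed from typical SpotBugs XML output, and the `Finding` class and `SpotBugsXmlReader` name are hypothetical helpers for illustration, not SpotBugs or SARIF types.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class SpotBugsXmlReader {

    /** Hypothetical flat view of one finding; not a SpotBugs or SARIF type. */
    static final class Finding {
        final String type;
        final String category;
        final String sourcePath;
        final int startLine; // -1 when the report carries no line number

        Finding(String type, String category, String sourcePath, int startLine) {
            this.type = type;
            this.category = category;
            this.sourcePath = sourcePath;
            this.startLine = startLine;
        }
    }

    /** Reads every BugInstance element from the XML text (attribute names are assumptions). */
    static List<Finding> read(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations so untrusted reports cannot pull in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable XML report", e);
        }
        List<Finding> findings = new ArrayList<>();
        NodeList bugs = doc.getElementsByTagName("BugInstance");
        for (int i = 0; i < bugs.getLength(); i++) {
            Element bug = (Element) bugs.item(i);
            // Only look at SourceLine elements that are direct children of the BugInstance.
            Element line = firstChild(bug, "SourceLine");
            String path = line == null ? "" : line.getAttribute("sourcepath");
            String start = line == null ? "" : line.getAttribute("start");
            int startLine = start.isEmpty() ? -1 : Integer.parseInt(start);
            findings.add(new Finding(bug.getAttribute("type"),
                    bug.getAttribute("category"), path, startLine));
        }
        return findings;
    }

    /** Returns the first direct child element with the given tag name, or null. */
    static Element firstChild(Element parent, String tag) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && tag.equals(((Element) n).getTagName())) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<BugCollection>"
                + "<BugInstance type=\"NP_NULL_ON_SOME_PATH\" category=\"CORRECTNESS\">"
                + "<Class classname=\"com.example.Foo\"/>"
                + "<SourceLine start=\"42\" end=\"42\" sourcepath=\"com/example/Foo.java\"/>"
                + "</BugInstance></BugCollection>";
        for (Finding f : read(xml)) {
            System.out.println(f.type + " " + f.sourcePath + ":" + f.startLine);
        }
    }
}
```

A converter built this way needs no SpotBugs dependency, but it only sees what the XML report exposes.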

ghost commented 4 years ago

What do you think about the direct export option? We can continue to explore the XML->SARIF option but it has some complications (described above). Also, by the way, GitHub's automatic code scanning will accept direct export from tools that produce SARIF 2.1.0, so there would be a tangible benefit to SpotBugs in supporting it.

KengoTODA commented 4 years ago

It sounds awesome. And it should be technically possible to implement the direct export option.

I'm not familiar with this format and probably won't have time to handle it, so I hope other contributors will give it a try.

uhafner commented 4 years ago

It would be really helpful to have a different format, since parsing the current format is a complicated task. (And using the internal parser is not very elegant either, since the whole SpotBugs library is required as a dependency.)

On the other hand, is there a Java parser for SARIF already available? In my role as author of the Jenkins warnings plugin I am a consumer of the current SpotBugs format. So exporting the bugs to a new format is only half the way; we then also need the way back from the XML file to the object model of the bugs. Is there anything planned here on the SARIF side?

ghost commented 4 years ago

We currently have .NET and TypeScript language bindings for the SARIF object model. We don't yet have a Java binding. The .NET binding is actually generated programmatically from the SARIF JSON schema. If there is a JSON-schema-to-Java-OM utility around, you would get the bindings for free. Do you know of one? I didn't find one in a quick search just now.

ghost commented 4 years ago

@uhafner, by the way, if your Jenkins warning plugin consumed SARIF, you would automatically have support for any tool that produces that standard format. This sounds like another great opportunity. We can talk about it over on your repo if you'd like.

@michaelcfanning FYI.

michaelcfanning commented 4 years ago

I believe that @lcartey of CodeQL has produced a Java OM from the SARIF schema.

uhafner commented 4 years ago

I see. The schema looks quite complex. The tools that normally show up in the warnings plugin typically produce a list of warnings with a couple of properties only. Are there a lot of tools using SARIF already?

michaelcfanning commented 4 years ago

It's a complex format, intended to cover the range of static analysis tools out there. You might find the SARIF Tutorial a little friendlier for ramping up. We have a well-developed C# SARIF-SDK to facilitate read/write and other scenarios, but that isn't helpful for Java, of course.

Internally at Microsoft, every tool that's run as part of security/other policy has SARIF support. Every tool owned internally or for which a Microsoft engineer serves as open source coordinator (such as BinSkim) exports the format directly. Both of Microsoft's publicly shipping analysis platforms (PREfast and Roslyn) have direct support.

Externally, there's support for ESLint and the Clang analyzer (built-in). GrammaTech supports the format, as does Semmle/CodeQL, and MicroFocus Fortify is working on it (I believe). We have open source converters for Fortify and Contrast Security.

The discussion was actually prompted by Microsoft's use of SpotBugs, as Larry mentioned. All our engineering systems for producing results, filing work items, etc., are driven by the format, so we're looking for some sort of SARIF solution so that we can get SpotBugs plugged in.

@lgolding and I are both happy to help advise on SARIF support. If it's helpful, we could attempt a contribution for this. If direct support in SpotBugs looks too intimidating to take on, I think Microsoft will fund authoring an open source converter from SpotBugs XML.

KengoTODA commented 4 years ago

Note: the following files are key to generating the SpotBugs report. I'm not sure whether they provide enough features to generate the SARIF format.

KengoTODA commented 4 years ago

I'll check sarif-tutorials later. Thanks for sharing!

KengoTODA commented 4 years ago

I'm working on this issue. I cannot find a good way to generate a Java binding from the JSON schema, so I'm writing it on top of the org.json:json library. https://github.com/spotbugs/spotbugs/compare/sarif-report

I'm still not sure whether we can solve the problems in the XML format.
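
For orientation, a minimal sketch of what a direct export has to emit: a SARIF 2.1.0 log needs little more than `version`, one entry in `runs` with `tool.driver.name`, and `results` that each carry a `message`. To stay dependency-free, this sketch concatenates strings instead of using org.json:json, so it is not the code on the branch above; the `MinimalSarifWriter` class and its method names are hypothetical.

```java
class MinimalSarifWriter {

    /** Wraps a string in quotes and escapes the characters JSON requires to be escaped. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as \ uXXXX escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** One SARIF result object: rule id, message text, and a single physical location. */
    static String result(String ruleId, String message, String uri, int startLine) {
        return "{\"ruleId\":" + quote(ruleId)
                + ",\"message\":{\"text\":" + quote(message) + "}"
                + ",\"locations\":[{\"physicalLocation\":{"
                + "\"artifactLocation\":{\"uri\":" + quote(uri) + "}"
                + ",\"region\":{\"startLine\":" + startLine + "}}}]}";
    }

    /** A SARIF log with a single run; each element of results is an already serialized object. */
    static String log(String toolName, String... results) {
        return "{\"version\":\"2.1.0\""
                + ",\"runs\":[{\"tool\":{\"driver\":{\"name\":" + quote(toolName) + "}}"
                + ",\"results\":[" + String.join(",", results) + "]}]}";
    }

    public static void main(String[] args) {
        // The bug type, message, and path below are illustrative values only.
        System.out.println(log("SpotBugs",
                result("NP_NULL_ON_SOME_PATH", "Possible null pointer dereference",
                        "com/example/Foo.java", 42)));
    }
}
```

A real exporter would also want rule metadata under `tool.driver.rules` and severity levels, which this sketch leaves out.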

ghost commented 4 years ago

I'm glad you're working on this! I'm happy to video conference with you to discuss the details of how to map your internal bug representation to SARIF. We can start simple, and then there are many ways to produce SARIF output that's effective for end users. In fact I'm writing a document about that now, which I'll share with you very soon.

KengoTODA commented 4 years ago

Thank you. My current idea is:

And here are the known issues:

I will list more of my doubts and questions later.

lcartey commented 4 years ago

@KengoTODA Sorry for not responding sooner, but I have had some success with http://www.jsonschema2pojo.org/ for generating a Java object model for SARIF. They also provide Maven/Gradle/Ant plugins for automating the process.

KengoTODA commented 4 years ago

SpotBugs 4.1.0 provides experimental support for SARIF 2.1.0. The generated report passes the latest SARIF validator.

It likely has known and unknown issues (e.g. https://github.com/spotbugs/spotbugs/pull/1221); please feel free to file issues at https://github.com/spotbugs/spotbugs/issues

yongyan-gh commented 3 years ago

Hi @lcartey, have you ever successfully generated POJO classes from the SARIF JSON schema? KengoTODA faced some issues generating AdditionalProperties. jsonschema2pojo may not fully cover the JSON Schema specs.