Static code analysis - Githubissues

jminor commented 2 years ago

We aim to meet the OpenSSF Best Practices passing or higher badge level. One of the requirements is to run static code analysis on the project's source code.

See the "Analysis" section here: https://bestpractices.coreinfrastructure.org/en/projects/2288

Is there anyone on this project with expertise in this area?

The ASWF makes SonarQube available to us, and cppcheck (C, C++), clang static analyzer (C, C++) seem relevant. Is there a well known Python static analysis tool we could use to satisfy this?

Details from OpenSSF Best Practices:

At least one static code analysis tool (beyond compiler warnings and "safe" language modes) MUST be applied to any proposed major production release of the software before its release, if there is at least one FLOSS tool that implements this criterion in the selected language. [static_analysis] Hide details A static code analysis tool examines the software code (as source code, intermediate code, or executable) without executing it with specific inputs. For purposes of this criterion, compiler warnings and "safe" language modes do not count as static code analysis tools (these typically avoid deep analysis because speed is vital). Some static analysis tools focus on detecting generic defects, others focus on finding specific kinds of defects (such as vulnerabilities), and some do a combination. Examples of such static code analysis tools include cppcheck (C, C++), clang static analyzer (C, C++), SpotBugs (Java), FindBugs (Java) (including FindSecurityBugs), PMD (Java), Brakeman (Ruby on Rails), lintr (R), goodpractice (R), Coverity Quality Analyzer, SonarQube, Codacy, and HP Enterprise Fortify Static Code Analyzer. Larger lists of tools can be found in places such as the Wikipedia list of tools for static code analysis, OWASP information on static code analysis, NIST list of source code security analyzers, and Wheeler's list of static analysis tools. The SWAMP is a no-cost platform for assessing vulnerabilities in software using a variety of tools. If there are no FLOSS static analysis tools available for the implementation language(s) used, select 'N/A'.

JeanChristopheMorinPerso commented 2 years ago

SonarCloud does provide static analysis indeed. But it's not the only tool. There is also GitHub CodeQL which works with both Python and C++. CodeQL is focused on finding security issues. See https://codeql.github.com/docs/codeql-overview/about-codeql and https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/about-code-scanning.

For Python, we already run flake8 in CI, though flake8 is far from a complete static analysis tool. SonarCloud covers some stuff for Python too but don't know how much. But I'm also not a big fan of SonarCloud or cloud hosted services that are kind of a black box.

darbyjohnston commented 2 years ago

The OpenEXR SonarCloud report is interesting, even on a mature code base written by experienced developers it thinks it has found about 70 bugs, 40 security hotspots, and 9000 "code smells": https://sonarcloud.io/summary/overall?id=AcademySoftwareFoundation_openexr

Browsing through the bugs they definitely seem worthwhile, for example:

The expression is an uninitialized value. The computed value will also be garbage
Out of bound memory access (access exceeds upper limit of memory block)
1 uninitialized field at the end of the constructor call

The default settings for "code smells" is a bit strict though, many are useful but some can be a real chore to satisfy. Probably useful "code smells":

Value stored to 'h' during its initialization is never read
implicit conversion from 'int' to 'float' may lose precision

Maybe not so useful "code smells" that can cause a lot of refactoring work:

Extract this nested code block into a separate function
Refactor this code to not nest more than 3 if|for|do|while|switch statements
Refactor this function to reduce its Cognitive Complexity from 65 to the 25 allowed

meshula commented 2 years ago

On behalf of OpenEXR ;) I don't think we've reviewed these issues recently, but when things were first set up, we looked at the issues and errors, and as I recall a great many were false positives. It behooves us to look at it again, and perhaps make an effort close false positives, and address the rest.

I don't think the "smells" are particularly useful. 9000 smells when there are on the order of that many lines of code ;) Sonar Cloud really doesn't like the project! There might be some gems in there, but it's a bit much of a muchness, as Pooh would say.

darbyjohnston commented 2 years ago

I think the "smells" can be useful but there are way too many of them enabled by default. It would be nice if SonarCloud had a slider that let you adjust the "smell" level from "I like clean code" to "my code controls autonomous vehicles".

meshula commented 2 years ago

A lot of them seem to be someone's preferences, rather than established good practices. Like, "prefer auto" is an opinion, not a scientific fact, and one's coding standards may have nuanced rules that should supersede a robot's.

jminor commented 2 years ago

How tricky is the setup for these? If they are relatively easy to enable, maybe we can try them out and compare the results? We could use one of the smaller repos in the OpenTimelineIO org, or just one of our forks of OTIO to give them a test run.

jminor commented 2 years ago

We also have this LFX Security system available to us: https://security.lfx.linuxfoundation.org/#/a09410000182dD2AAI/foundation-details

@jmertic says:

Many of the ASFW projects are not doing full scans due to issue of not having the language or package manager files setup correctly. Refer to https://docs.snyk.io/products/snyk-open-source/language-and-package-manager-support to correct those issues so your project can benefit from the tool.

meshula commented 2 years ago

I requested access to the dashboard in order to add OpenTimelineIO.

JeanChristopheMorinPerso commented 2 years ago

C++ seems to be supported through the snyk CLI (https://docs.snyk.io/products/snyk-open-source/language-and-package-manager-support/snyk-for-c-c++).

I'm guessing that Snyk + SonarCloud + CodeQL would cover us pretty well security wise (for static things)?

jminor commented 2 years ago

Oh, I see OpenTimelineIO in the LFX Security tool already, but maybe it is restricted? It says something about "verified maintainer" when I log in. If so, we should get a few more maintainers access.

From the report inside there, we're either doing really well or it isn't scanning much. It only shows 1 "code secret" issue which is a false positive, and 1 "non-inclusive language" item, but no code-level issues.

We can look at this together next TSC meeting, or in a separate meeting if need be.

@JeanChristopheMorinPerso - yes, I think some combination of those would be great. Do you have a suggestion of which one we could start with?

jmertic commented 2 years ago

Yes - the details are available to maintainers only. You should be able to request access through the link that comes up if you don't have access and it will be granted.

JeanChristopheMorinPerso commented 2 years ago

@jminor CodeQL is kind of simple to setup. I actually already tried it a month ago, see https://github.com/JeanChristopheMorinPerso/OpenTimelineIO/actions/runs/3212525909/workflow. It didn't detect any issues though. SonarCloud would also be somewhat simple to integrate.

jmertic commented 1 year ago

Hey - any luck with LFX Security? We'd like to get projects standardized on that tool to make visibility more consistent across the ASWF as a whole.

JeanChristopheMorinPerso commented 1 year ago

I can look into integrating Snyk (with LFX Security) next week. And while being there I could add CodeQL.

AcademySoftwareFoundation / OpenTimelineIO

Static code analysis #1405