Dynamic code analysis - Githubissues

jminor commented 2 years ago

We aim to earn the OpenSSF Best Practices badge at the passing level or higher. One of the requirements is to run dynamic code analysis on the project's source code.

See the "Analysis" section here: https://bestpractices.coreinfrastructure.org/en/projects/2288

Is there anyone on this project with expertise in this area?

The ASWF makes SonarQube available to us, and cppcheck (C, C++) and the Clang Static Analyzer (C, C++) seem relevant. Is there a well-known Python static analysis tool we could use to satisfy this?

Details from OpenSSF Best Practices:

It is SUGGESTED that at least one dynamic analysis tool be applied to any proposed major production release of the software before its release. [dynamic_analysis] A dynamic analysis tool examines the software by executing it with specific inputs. For example, the project MAY use a fuzzing tool (e.g., American Fuzzy Lop) or a web application scanner (e.g., OWASP ZAP or w3af). In some cases the OSS-Fuzz project may be willing to apply fuzz testing to your project. For purposes of this criterion the dynamic analysis tool needs to vary the inputs in some way to look for various kinds of problems or be an automated test suite with at least 80% branch coverage. The Wikipedia page on dynamic analysis and the OWASP page on fuzzing identify some dynamic analysis tools. The analysis tool(s) MAY be focused on looking for security vulnerabilities, but this is not required.
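
As a rough illustration of what "vary the inputs in some way" can mean, here is a minimal mutation-fuzzing sketch in Python using only the standard library. The target (`json.loads`), the seed document, and the helper names `mutate` and `fuzz` are assumptions picked for illustration; none of them come from OpenTimelineIO or from the OpenSSF criteria, and a real setup would more likely rely on one of the tools named above.

```python
import json
import random


def mutate(text: str, rng: random.Random) -> str:
    """Return a copy of `text` with one character replaced, inserted, or deleted."""
    chars = list(text)
    pos = rng.randrange(len(chars) + 1)
    op = rng.choice(("replace", "insert", "delete"))
    new_char = chr(rng.randrange(32, 127))  # printable ASCII only
    if op == "insert" or not chars:
        chars.insert(pos, new_char)
    elif op == "replace":
        chars[min(pos, len(chars) - 1)] = new_char
    else:  # delete
        del chars[min(pos, len(chars) - 1)]
    return "".join(chars)


def fuzz(seed: str, iterations: int = 1000, rng_seed: int = 0) -> list:
    """Feed mutated copies of `seed` to json.loads and collect unexpected exceptions."""
    rng = random.Random(rng_seed)  # fixed seed so a failure can be reproduced
    failures = []
    for _ in range(iterations):
        candidate = seed
        for _ in range(rng.randint(1, 4)):
            candidate = mutate(candidate, rng)
        try:
            json.loads(candidate)
        except ValueError:
            pass  # malformed input rejected cleanly (JSONDecodeError is a ValueError)
        except Exception as exc:  # anything else is worth a closer look
            failures.append((candidate, exc))
    return failures


if __name__ == "__main__":
    # Hypothetical seed document, not a real OpenTimelineIO file.
    unexpected = fuzz('{"name": "clip", "duration": [24, 24.0]}')
    print(f"{len(unexpected)} unexpected failures")
```

The point of the sketch is the shape of the loop (seed, mutate, run, classify the outcome), not the choice of target. Coverage-guided fuzzers do much more than this, so treat it as a way to reason about the criterion rather than a way to satisfy it.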

JeanChristopheMorinPerso commented 2 years ago

For python, there is https://hypothesis.readthedocs.io/en/latest/ for fuzzing.

Also, see https://github.com/AcademySoftwareFoundation/OpenTimelineIO/issues/1407#issuecomment-1254239812. OSS-Fuzz could potentially be used, but I don't know if we would qualify. Even if we don't, they still have https://google.github.io/clusterfuzzlite/, which can be run in CI (and supports both Hypothesis and libFuzzer).

darbyjohnston commented 2 years ago

It might be nice to run the C++ tests with valgrind: https://valgrind.org/

It's great at finding memory issues, though it can also be quite slow and could increase CI build times.

JeanChristopheMorinPerso commented 2 years ago

Yeah, Valgrind would be nice! It can even be used with Python natively since Python 3.6 (by setting the PYTHONMALLOC=malloc environment variable). See https://github.com/pybind/pybind11/pull/2746 for how it's done in pybind11.
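
To make the PYTHONMALLOC idea concrete, here is a small sketch of a wrapper that launches the current Python interpreter under Valgrind with that variable set. The helper names (`valgrind_env`, `valgrind_command`, `run_under_valgrind`) are hypothetical, and this is not taken from the pybind11 PR. The only Valgrind option used is `--error-exitcode`, which makes Valgrind return the given exit code when it reports errors.

```python
import os
import subprocess
import sys


def valgrind_env(base_env=None):
    """Copy the environment and tell CPython to use the system malloc."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONMALLOC"] = "malloc"  # bypass pymalloc so Valgrind sees each allocation
    return env


def valgrind_command(python_args):
    """Build a command line that runs this Python interpreter under Valgrind."""
    return [
        "valgrind",
        "--error-exitcode=1",  # exit with 1 if Valgrind reports errors
        sys.executable,
        *python_args,
    ]


def run_under_valgrind(python_args):
    """Run the command and return the process exit code (requires valgrind on PATH)."""
    result = subprocess.run(valgrind_command(python_args), env=valgrind_env())
    return result.returncode
```

A call such as `run_under_valgrind(["-m", "unittest", "discover", "tests"])` would be one way to use it; the `tests` directory and the `unittest` runner there are assumptions, so substitute whatever test entry point the project actually uses.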

darbyjohnston commented 2 years ago

Nice, I didn't know about the Python support.

I tried a quick valgrind test on test_serializableCollection.cpp and the results look good, no memory leaks or bad accesses:

$ valgrind tests/test_serializableCollection 
==7933== Memcheck, a memory error detector
==7933== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7933== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==7933== Command: tests/test_serializableCollection
==7933== 
Running test test_children_if
Running test test_children_if_search_range
Running test test_children_if_shallow_search
==7933== 
==7933== HEAP SUMMARY:
==7933==     in use at exit: 0 bytes in 0 blocks
==7933==   total heap usage: 79 allocs, 79 frees, 84,518 bytes allocated
==7933== 
==7933== All heap blocks were freed -- no leaks are possible
==7933== 
==7933== For lists of detected and suppressed errors, rerun with: -s
==7933== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
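
If output like this were collected in CI, a script could check the summary lines instead of a person reading them. The following is a sketch only: the function names `summarize` and `is_clean` are hypothetical, the regular expressions assume the Memcheck summary format shown in the log above, and requiring zero bytes in use at exit is a deliberately strict policy chosen for illustration.

```python
import re

# Patterns assume the Memcheck summary lines shown in the log above.
ERROR_RE = re.compile(
    r"^==\d+== ERROR SUMMARY: ([\d,]+) errors from ([\d,]+) contexts", re.MULTILINE
)
IN_USE_RE = re.compile(
    r"^==\d+==\s+in use at exit: ([\d,]+) bytes in ([\d,]+) blocks", re.MULTILINE
)


def _to_int(text):
    """Convert a count such as '84,518' to an integer."""
    return int(text.replace(",", ""))


def summarize(log):
    """Extract the error count and bytes in use at exit; missing fields are None."""
    errors = ERROR_RE.search(log)
    in_use = IN_USE_RE.search(log)
    return {
        "errors": _to_int(errors.group(1)) if errors else None,
        "bytes_in_use_at_exit": _to_int(in_use.group(1)) if in_use else None,
    }


def is_clean(log):
    """Strict policy (an assumption): no errors and nothing left allocated at exit."""
    summary = summarize(log)
    return summary["errors"] == 0 and summary["bytes_in_use_at_exit"] == 0


# Lines copied from the run above.
SAMPLE = """\
==7933== HEAP SUMMARY:
==7933==     in use at exit: 0 bytes in 0 blocks
==7933==   total heap usage: 79 allocs, 79 frees, 84,518 bytes allocated
==7933== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
"""

print(summarize(SAMPLE), is_clean(SAMPLE))
```

Valgrind writes its report to stderr by default, so a wrapper would need to capture that stream (or use a log file) before passing the text to `summarize`.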

meshula commented 1 year ago

There is an interesting PR on OpenEXR today. OpenEXR already has OSS-Fuzz running; this PR adds a GitHub Action to run the fuzzing. https://github.com/AcademySoftwareFoundation/openexr/pull/1317

As submitted, it runs the fuzz test for 5 minutes. I'm trying to understand whether that's enough to get the full value of catching fuzz issues early. OpenEXR might need to create a lightweight fuzzer that exercises all the categories of fuzzing briefly as a smoke test, rather than running the time-consuming long tests. (As it stands, the long-running tests do run automatically, with reporting via a dashboard and email notifications.)
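
One way to picture such a smoke test is a fixed time budget split evenly across fuzz targets. The sketch below is an assumption-heavy illustration and not how the OpenEXR PR or OSS-Fuzz works: `smoke_fuzz` and the two toy targets are hypothetical, inputs are plain random bytes with no coverage guidance, and `ValueError` is treated as the expected way to reject bad input.

```python
import random
import time


def smoke_fuzz(targets, total_seconds, rng_seed=0):
    """Split a time budget evenly across fuzz targets and report per-target results."""
    rng = random.Random(rng_seed)
    budget = total_seconds / len(targets)
    report = {}
    for name, target in targets.items():
        deadline = time.monotonic() + budget
        runs, failures = 0, []
        while time.monotonic() < deadline:
            # Random byte string of length 0..63; no coverage feedback is used.
            data = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
            try:
                target(data)
            except ValueError:
                pass  # expected rejection of malformed input
            except Exception as exc:
                failures.append((data, repr(exc)))
            runs += 1
        report[name] = {"runs": runs, "failures": failures}
    return report


# Toy targets standing in for real fuzz categories (hypothetical).
TOY_TARGETS = {
    "utf8_decode": lambda data: data.decode("utf-8"),
    "int_parse": lambda data: int(data),
}

if __name__ == "__main__":
    for name, stats in smoke_fuzz(TOY_TARGETS, total_seconds=1.0).items():
        print(name, stats["runs"], "runs,", len(stats["failures"]), "failures")
```

The trade-off is the one raised above: a short budget gives every category a little exposure at PR time, while the long-running jobs remain the place where deep issues are more likely to surface.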

I do see the attraction of getting an early indication that there are fuzz issues at PR time, rather than at the multi-day cadence of OSS-Fuzz.

JeanChristopheMorinPerso commented 1 year ago

Nice! I think (?) it's the same thing as what I mentioned in https://github.com/AcademySoftwareFoundation/OpenTimelineIO/issues/1406#issuecomment-1254249664.

meshula commented 1 year ago

It's related: it uses https://github.com/google/oss-fuzz/. OSS-Fuzz supports ClusterFuzzLite as one of the fuzzing technologies it can wrangle.

AcademySoftwareFoundation / OpenTimelineIO

Dynamic code analysis #1406