terrastruct / d2

D2 is a modern diagram scripting language that turns text to diagrams.
https://d2lang.com
Mozilla Public License 2.0

Integrating with google/oss-fuzz for continuous fuzz-testing #1664

Open silvergasp opened 10 months ago

silvergasp commented 10 months ago

Hey Terrastruct Team,

I hope this message finds you well. I've been closely watching the development of d2 and I'm genuinely impressed by its capabilities and potential. I'm reaching out with an idea to further enhance the project's reliability. Given the importance of d2 in diagramming and structuring, I'd like to suggest setting up some basic fuzz-testing and combining it with google/oss-fuzz for continuous fuzzing. Recognizing that maintaining a project like d2 is demanding, I want to ensure I'm not adding undue overhead. Is this an inconvenient moment to broach the subject of potential security/reliability enhancements?

If you're unfamiliar with fuzzing or oss-fuzz, here are some brief insights:

Benefits of Fuzz-Testing

Google/oss-fuzz for Continuous Fuzzing

For a deeper dive into security and best practices, OpenSSF is a fantastic resource.

I'm eager to drive the initiative to integrate fuzz testing with d2, providing support wherever necessary.

As a starting point, I've put together a basic fuzz harness for a key component of the d2 project in #1663

Looking forward to hearing your thoughts!

alixander commented 10 months ago

Hi @silvergasp thank you for the kind words and taking the initiative on this.

Although more tests are welcome, there are a couple of things that make the current PR not quite the right shape.

  1. Fuzzing only works on fast targets (< 1 second): https://go.dev/security/fuzz/#suggestions . Parts of D2 are fast enough, like the compiler, but other parts require more work, like measuring text spacing for renders. The end-to-end execution is unsuitable for fuzzing.
  2. We have something akin to fuzzing, which is our chaos tests: https://github.com/terrastruct/d2/tree/master/d2chaos . These run daily on CI and generate randomized diagrams. It's not exactly fuzzing, but this more domain-specific testing exercises what we care about most: are there DSL scripts that will crash D2?

I'm not sure how much value there is in fuzzing command line arguments. However, improvements to or expansions of the existing d2chaos testing framework are most welcome.

Regarding Google/oss-fuzz, I hadn't heard of it but it looks potentially useful. Will make a note to revisit during some dedicated learning time. Thanks again @silvergasp

silvergasp commented 10 months ago

> Fuzzing only works on fast targets (< 1 second): https://go.dev/security/fuzz/#suggestions . Parts of D2 are fast enough, like the compiler, but other parts require more work, like measuring text spacing for renders. The end-to-end execution is unsuitable for fuzzing.

It seems like you're quite familiar with fuzzing and the common pitfalls. I will push back on this a little, though. While, as you say, the upper bound for an e2e fuzz test like this is >1s per execution, I'm getting around 40 executions per second on average (a rate, so the inverse of the per-execution figure: roughly 25 ms per execution). That's less than ideal (typically >1000 executions/second would be considered good), but in my opinion the fuzzer as it currently stands is just fast enough to be useful. It's a valid point that this should be tightened up significantly, though, and I'm happy to drop the render stage and keep only the compilation stage if you think that would be best.

> 2. We have something akin to fuzzing, which is our chaos tests: https://github.com/terrastruct/d2/tree/master/d2chaos . These are run daily on CI, and are randomized generations of diagrams. Not exactly fuzzing, but the more domain-specific testing exercises things we care about more: are there DSL scripts which will crash D2.
>
> I'm not sure how much value there is in fuzzing command line arguments. However, improvements or expansions of existing d2chaos testing framework are most welcome.

Cool, I'll have a look at that in a bit more detail. At first glance this effectively looks like open-loop fuzzing, which differs slightly from the default fuzzer that ships with Go, a closed-loop, coverage-driven fuzzer. Both methods start off roughly the same: both fuzzers produce a set of random inputs to pass to the software under test. Where a closed-loop fuzzer differs is that it collects the code coverage for each individual input in the set. The inputs that produce the highest code coverage are then mutated to create new inputs. The result is that the fuzzer "learns" and stochastically chooses better inputs to maximize code coverage.

It looks like d2chaos might be a better starting point, in that I could port the existing d2chaos system over to Go's native coverage-driven fuzzer rather than the "rand" module. This way I could leverage all the existing fuzzing capabilities as well as the coverage-driven feedback from the native Go fuzzer. How does this sound?
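Roughly, the port could look like the sketch below. This is not the d2chaos code: `genScript` is a made-up generator that derives a tiny D2 script from the fuzzer's bytes instead of from "rand", and `compile` is a hypothetical stand-in for whatever D2 entry point the harness would actually call. Only the `testing.F` calls (`f.Add`, `f.Fuzz`) are real Go API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"testing"
)

// genScript deterministically turns fuzzer-provided bytes into a small D2
// script. Each pair of bytes picks the two endpoints of one edge, so the
// fuzzing engine (not math/rand) decides the shape of the diagram.
// Hypothetical generator, for illustration only.
func genScript(data []byte) string {
	names := []string{"a", "b", "c", "d"}
	var sb strings.Builder
	for i := 0; i+1 < len(data); i += 2 {
		src := names[int(data[i])%len(names)]
		dst := names[int(data[i+1])%len(names)]
		fmt.Fprintf(&sb, "%s -> %s\n", src, dst)
	}
	return sb.String()
}

// compile is a hypothetical stand-in for the real compile step; a real
// harness would call into D2 here instead.
func compile(script string) error {
	if script == "" {
		return errors.New("empty script")
	}
	return nil
}

// FuzzGenerated would normally live in a _test.go file and be run with
// `go test -fuzz=FuzzGenerated`.
func FuzzGenerated(f *testing.F) {
	f.Add([]byte{0, 1, 1, 2}) // seed corpus entry: a -> b, b -> c
	f.Fuzz(func(t *testing.T, data []byte) {
		// Returned errors are fine; a panic in compile is what the
		// fuzzing engine would report as a finding.
		_ = compile(genScript(data))
	})
}

func main() {
	fmt.Print(genScript([]byte{0, 1, 1, 2}))
}
```

Because the generator is a pure function of the input bytes, any crashing input the engine finds can be replayed exactly, and the coverage feedback steers which byte patterns (and therefore which diagram shapes) get explored next.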

> Regarding Google/oss-fuzz, I hadn't heard of it but it looks potentially useful. Will make a note to revisit during some dedicated learning time.

No worries. I've done a few integrations with oss-fuzz, so I'm quite familiar with the process; in fact, I used a local version of oss-fuzz to test this library. To integrate with oss-fuzz, all I'd need to do is push some code changes and create a PR. Happy to answer any questions about this if you're curious :)