intel / cve-bin-tool

The CVE Binary Tool helps you determine if your system includes known vulnerabilities. You can scan binaries for over 200 common, vulnerable components (openssl, libpng, libxml2, expat and others), or if you know the components used, you can get a list of known vulnerabilities associated with an SBOM or a list of components and versions.
https://cve-bin-tool.readthedocs.io/en/latest/
GNU General Public License v3.0

feat: Create fuzzer for Python package parser #3331

Closed terriko closed 7 months ago

terriko commented 12 months ago

Description

cve-bin-tool has an existing fuzz testing setup based on Google Atheris. One of the areas it doesn't yet cover is the files used by the language list parsers. These are typically lists of 3rd party components/requirements written in a format specific to a packaging tool for a particular programming language. These lists may be written by a human, or they could be generated by a tool.

This particular request is to fuzz the PythonParser which handles PKG-INFO: and METADATA: inside python packages, but I'll be filing requests for the other parsers as well. You can see which ones are listed under the security tag.

Note that I'm filing this separately from the PythonRequirementsParser just so two different people could work on them, or one person could get two hacktoberfest commits.

Why?

Regular fuzz testing can help us find bugs and potential security issues in parsing. While we hope users aren't going to be regularly scanning malicious python METADATA and PKG-INFO files, we'd still like to handle things correctly if a file is badly malformed.

How should I do this?

  1. Set up your own environment for fuzzing cve-bin-tool using Atheris. We recommend you use a container or vm for this for safety (a misconfigured fuzzer can potentially make a big mess).
  2. Be aware that Atheris and its requirements can be a bit finicky to set up and last time we ran a big fuzzing campaign only some versions of Python in some environments actually worked easily. If you find any issues with following the setup docs, or manage to find good workarounds for an environment we haven't mentioned, please file issues or make a PR to add them to our docs.
  3. Create a new proto file (or files) to generate fuzzed METADATA and PKG-INFO and add them to our proto files directory: https://github.com/intel/cve-bin-tool/tree/main/fuzz/proto_files. It's ok to have tests against files that are completely garbage, but the most interesting bugs will probably come from files that mostly look correct, and the proto setup will help you do that. If you're not sure how any of this works, you may find it useful to read this primer on structure-aware fuzzing.
  4. Make a python file to call your fuzzer. Here's what the cyclonedx fuzzer looks like, as an example. Yours may be considerably different -- feel free to search for other examples and read the Atheris/libfuzzer/protobuf-mutator docs to help you figure out what you need.
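
To make steps 3 and 4 a bit more concrete, here is a minimal sketch of what a harness could look like. It is not the project's actual fuzzer: it uses Atheris's `FuzzedDataProvider` instead of the proto/protobuf-mutator setup described above, and `build_metadata` and `parse_name_version` are hypothetical helpers (the latter is a stand-in built on the standard library's `email` parser, not cve-bin-tool's PythonParser). The idea it illustrates is the same, though: generate files that mostly look like real METADATA/PKG-INFO and feed them to a parser.

```python
import sys
from email.parser import BytesParser

try:
    import atheris  # Google Atheris; optional so the helpers can be tried without it
except ImportError:
    atheris = None


def build_metadata(name: str, version: str, extra_lines: list[str]) -> bytes:
    """Render mostly well-formed METADATA/PKG-INFO text from (fuzzed) fields.

    Hypothetical helper: keeps the "Key: value" layout intact so the
    parser gets past the first line before it meets anything strange.
    """
    lines = ["Metadata-Version: 2.1", f"Name: {name}", f"Version: {version}"]
    lines.extend(extra_lines)
    return "\n".join(lines).encode("utf-8", errors="replace")


def parse_name_version(data: bytes):
    """Stand-in for the parser under test (assumption, NOT PythonParser).

    Swap this for a call into the real parser when writing the actual fuzzer.
    """
    msg = BytesParser().parsebytes(data, headersonly=True)
    name, version = msg.get("Name"), msg.get("Version")
    return (
        None if name is None else str(name).strip(),
        None if version is None else str(version).strip(),
    )


def TestOneInput(data: bytes) -> None:
    """Fuzz entry point: carve the raw bytes into fields, then parse."""
    fdp = atheris.FuzzedDataProvider(data)
    name = fdp.ConsumeUnicodeNoSurrogates(32)
    version = fdp.ConsumeUnicodeNoSurrogates(16)
    extra = [
        fdp.ConsumeUnicodeNoSurrogates(64)
        for _ in range(fdp.ConsumeIntInRange(0, 4))
    ]
    # Any uncaught exception raised here is reported by Atheris as a finding.
    parse_name_version(build_metadata(name, version, extra))


def main() -> None:
    if atheris is None:
        sys.exit("Atheris is not installed; see the fuzzing setup docs.")
    atheris.instrument_all()
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

Calling `main()` (for example from an `if __name__ == "__main__":` guard) starts the fuzzing loop, so it's best run inside the container or vm from step 1. The two helpers can be exercised on their own first to check that well-formed input round-trips before pointing the fuzzer at the real parser.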

Hacktoberfest

I'm filing this with the intention of it being a bug for hacktoberfest 2023. If you're intending to do it as part of that contest, make sure you follow their rules. I believe we have to accept/merge your PR between Oct 1-31 for it to count, and you'll need to register after September 28 but probably before we merge anything. You may be able to open a draft PR earlier. Do let me know if you need something to count for hacktoberfest.

New Contributor Tips

Since this is marked as a hacktoberfest issue, there's a good chance whoever does it will be new to cve-bin-tool, so here are the tips we usually put on new contributor friendly bugs.

Short tips for new contributors:

Claiming issues:

prady0t commented 11 months ago

Hey, I'm really interested in working on this. Should I make a draft PR now or wait for the Hacktoberfest event to begin?

terriko commented 11 months ago

@prady0t I couldn't really tell if the Hacktoberfest folk had opinions about drafts starting before the contest does so you might want to check with them. My guess is their script only checks when the code is approved or merged, so it probably doesn't matter if you have a draft open early.

Parvezkhan0 commented 11 months ago

I want to work on this issue. Please assign it to me

terriko commented 11 months ago

@Parvezkhan0 I think @prady0t is already working on this one, but there are a number of similar bugs open that don't have takers yet (e.g. https://github.com/intel/cve-bin-tool/issues/3334 or https://github.com/intel/cve-bin-tool/issues/3332), so maybe try one of those?

prady0t commented 11 months ago

Hey, sorry for not updating. I was having some trouble setting up the project, but I'm still working on it. @Parvezkhan0 I will let you know if I decide to move on to a different issue so you can pick this up. Thanks for your patience. 😄

inosmeet commented 8 months ago

Hey @terriko!, can I try it as it seems pretty inactive?

joydeep049 commented 7 months ago

Hello @prady0t, are you still working on this?

prady0t commented 7 months ago

No, go ahead.