aboutcode-org / scancode-toolkit

ScanCode detects licenses, copyrights, dependencies by "scanning code" ... to discover and inventory open source and third-party packages used in your code. Sponsored by NLnet project https://nlnet.nl/project/vulnerabilitydatabase, the Google Summer of Code, Azure credits, nexB and other generous sponsors!
https://github.com/aboutcode-org/scancode-toolkit/releases/

Add support for nix packages #2823

Open pombredanne opened 2 years ago

pombredanne commented 2 years ago

We should add support for https://nixos.org/ packages. An example of a nix "derivation" is at https://github.com/nix-community/nix-direnv/blob/master/flake.nix. This is like a package manifest. For instance, this is one of the packages of ScanCode in nix: https://github.com/NixOS/nixpkgs/blob/nixos-21.11/pkgs/development/python-modules/pyahocorasick/default.nix#L29

Of note: it uses its own "language" for license declaration. We may need a simple pygmars-based nix parser. Check also https://framagit.org/upt/upt-nix/-/blob/master/upt_nix/upt_nix.py
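To make the license point concrete, here is a rough sketch of pulling a license identifier out of `.nix` text. It uses a plain regular expression rather than a pygmars-based parser, so it is a stand-in and not the proposed approach. The function name `find_nix_license_ids` and the sample text are hypothetical, and the pattern only covers the simple `license = licenses.<id>;` form; list forms such as `with licenses; [ ... ]` would need a real parser.

```python
import re

# Assumption: a single license is declared as `license = licenses.<id>;`
# optionally prefixed with a namespace such as `lib.`.
LICENSE_RE = re.compile(r"\blicense\s*=\s*(?:\w+\.)*licenses\.(\w+)\s*;")

# Illustrative snippet written for this example, not copied from nixpkgs.
SAMPLE = """
meta = with lib; {
  description = "An example package";
  license = licenses.bsd3;
};
"""


def find_nix_license_ids(text):
    """Return the nix license identifiers found in `text`, in order."""
    return LICENSE_RE.findall(text)


if __name__ == "__main__":
    print(find_nix_license_ids(SAMPLE))
```

The identifiers returned are nix's own names and would still need to be mapped to ScanCode license keys.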

Some other pointers include:

adityasangave commented 2 years ago

NixOS has its own packages and we need to add support for scanning those packages, am I right? I would like to work on this, as I am familiar with package scanning in ScanCode. Can you please tell me where I should start?

pombredanne commented 2 years ago

@adii21-Ux yes!... so the first step would be to document which files and indexes exist in a nix package that could be used to scan and collect details. You should consider:

There may be subtle differences between nix and nixos there.

Once this is done, please document this either here or in a PR, as a comment in the to-be nix.py, so we can discuss it.

The next step will be to assemble a few different examples of the key file formats, starting with those most likely to be found in the wild (likely B., as .nix files), and to use these as test files (to be committed).

Then, equipped with these test files, you can start crafting a few basic classes, likely starting with a function to determine whether a file is a nix manifest. I suggest you hook up with @AyanSinhaMahapatra for this. You will also likely need to recognize and parse .nix files. @JonoYang is the master to ask for help.
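A minimal sketch of what such a manifest check could look like, assuming recognition by file name alone. The function names and the `NIX_MANIFEST_NAMES` set are hypothetical and are not part of ScanCode; a real implementation would plug into ScanCode's package models.

```python
from pathlib import Path

# Assumption: these conventional file names are the most common nix manifests.
NIX_MANIFEST_NAMES = {"default.nix", "flake.nix", "shell.nix"}


def is_nix_file(location):
    """Return True if `location` has a .nix extension (candidate manifest)."""
    return Path(location).suffix == ".nix"


def is_well_known_nix_manifest(location):
    """Return True if `location` is named like a conventional nix manifest."""
    return Path(location).name in NIX_MANIFEST_NAMES


if __name__ == "__main__":
    for name in ("flake.nix", "pkgs/foo/default.nix", "setup.py"):
        print(name, is_nix_file(name), is_well_known_nix_manifest(name))
```

Checking the name only is cheap but coarse: any `.nix` file is a candidate, and telling a package derivation apart from other nix expressions would require looking at the content.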

Of course, you want to craft tests as you go.

Note that the models for package data are going through major surgery as we speak, so it will be useful to create small PRs, and frequent rebasing is likely to be required.