How to use scancode toolkit output files?

aboutcode-org / scancode-toolkit

ScanCode detects licenses, copyrights, dependencies by "scanning code" ... to discover and inventory open source and third-party packages used in your code. Sponsored by NLnet project https://nlnet.nl/project/vulnerabilitydatabase, the Google Summer of Code, Azure credits, nexB and other generous sponsors!

https://aboutcode.org/scancode/


How to use scancode toolkit output files? #3800

Open sankalpsp07 opened 5 months ago

sankalpsp07 commented 5 months ago

Hello Team,

How do I use the output of the ScanCode Toolkit? When will a scan fail? If my output file is 30 lakh (3 million) lines long, how do I consume it to find vulnerable licenses?

stefan6419846 commented 5 months ago

How you use the output depends on your use case, parameters, toolchain, and so on; there is no general post-processing step that applies to everyone. For the same reason, there is no general guidance on how to work with the detected licenses, as this usually depends on your own requirements. Additionally, there is no such thing as a "vulnerable license": use of some licenses might be discouraged by you or your organization, or specific package versions might be vulnerable.

sankalpsp07 commented 5 months ago

If I want to integrate the ScanCode Toolkit into CI/CD pipelines, under which conditions would the scan fail and therefore fail the pipeline?

stefan6419846 commented 5 months ago

AFAIK ScanCode-Toolkit only fails if there is an internal error. Other failures are subject to your own logic.

sankalpsp07 commented 5 months ago

Okay. What would be your logic to identify the licenses that are discouraged by an organization?

stefan6419846 commented 5 months ago

This highly depends on your specific needs, how much you trust the output, how your CI works and so on. There still is no catch-all solution. One approach would be to retrieve the license fields from the generated file and check the identifiers against the list of allowed ones.
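As a rough illustration of that approach, here is a minimal Python sketch that collects license identifiers from a JSON scan result and compares them against an allow-list. It rests on a few assumptions that are not stated in this thread: that the scan was written as JSON, that each entry under `files` carries a `detected_license_expression_spdx` field (the field name differs between ScanCode versions, so check your own output), and that the `ALLOWED` set is a made-up example policy. The expression handling is a simple tokenizer, not a full license-expression parser, so an exception named after `WITH` is treated like any other identifier.

```python
import json
import re

# Hypothetical allow-list of SPDX identifiers; replace it with your own policy.
ALLOWED = {"MIT", "Apache-2.0", "BSD-3-Clause"}

# Operators that may appear in a license expression; these are not identifiers.
OPERATORS = {"AND", "OR", "WITH"}


def license_ids(expression):
    """Split an expression such as "MIT AND (GPL-2.0-only OR Apache-2.0)" into identifiers."""
    tokens = re.findall(r"[A-Za-z0-9.+\-]+", expression or "")
    return {token for token in tokens if token.upper() not in OPERATORS}


def find_violations(scan, allowed=ALLOWED):
    """Return {path: sorted identifiers that are not on the allow-list}."""
    violations = {}
    for entry in scan.get("files", []):
        # Assumed field name; adjust it to what your ScanCode version emits.
        found = license_ids(entry.get("detected_license_expression_spdx"))
        bad = sorted(found - allowed)
        if bad:
            violations[entry.get("path", "?")] = bad
    return violations


def load_scan(path):
    """Load a ScanCode JSON result file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

In a CI job, something like `sys.exit(1 if find_violations(load_scan("scan.json")) else 0)` would turn any non-allowed license into a failing step (`scan.json` is a placeholder file name). Note that `json.load` reads the whole file into memory, which may matter for a result with millions of lines.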

mjherzog commented 5 months ago

You can use the License Policy Plugin to apply your license policies to a scan: https://scancode-toolkit.readthedocs.io/en/latest/plugins/licence_policy_plugin.html
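For a CI check on top of a policy-enabled scan, a sketch along these lines might work. It assumes, based on my reading of the plugin documentation rather than anything confirmed in this thread, that each file entry in the JSON output gains a `license_policy` field holding either one mapping or a list of mappings with `license_key` and `label` keys. The label "Restricted License" is a hypothetical value that would come from your own policy file.

```python
def policy_hits(scan, blocked_labels):
    """Return (path, license_key, label) tuples for files matching a blocked policy label."""
    hits = []
    for entry in scan.get("files", []):
        # Assumed field; depending on the ScanCode version it may be a dict or a list.
        policies = entry.get("license_policy") or []
        if isinstance(policies, dict):
            policies = [policies]
        for policy in policies:
            if policy.get("label") in blocked_labels:
                hits.append((entry.get("path"), policy.get("license_key"), policy.get("label")))
    return hits


# Hypothetical scan fragment, shaped the way this sketch assumes the output looks.
sample = {
    "files": [
        {"path": "src/a.c", "license_policy": [{"license_key": "gpl-3.0", "label": "Restricted License"}]},
        {"path": "src/b.c", "license_policy": [{"license_key": "mit", "label": "Approved License"}]},
        {"path": "README", "license_policy": []},
    ]
}

hits = policy_hits(sample, {"Restricted License"})
```

A pipeline step could then fail whenever `hits` is non-empty, which keeps the policy itself in the policy file and the pass/fail decision in your own script.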

pombredanne commented 4 months ago

> What would be your logic to identify the licenses that are discouraged by organization?

This would be the policy feature, all right: https://github.com/nexB/scancode-toolkit/issues/3800#issuecomment-2158919252