pmonks / lice-comb

A Clojure library for software license detection.
Apache License 2.0

Process SPDX files, if present #4

Open pmonks opened 2 years ago

pmonks commented 2 years ago

Job Story

When a dependency's artifacts include SPDX documents, I want tools-licenses to process those documents and extract license information from them, so that I can be sure the most accurate and comprehensive license information available is being reported.

Potential Solutions:

Though it mostly seems to be overkill, Spdx-Java-Library may be useful here, at least for SPDX document parsing.

pmonks commented 2 years ago

When an SPDX document is found, it alone should be used for license detection. The other mechanisms (pom file processing, finding probable license files and trying to identify the license(s) in them) should not be invoked.
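A minimal sketch of that precedence rule, written in Java for illustration only. The class name `LicenseDetectionOrder`, the `detect` method, and the use of plain strings for license identifiers are all hypothetical, and taking the union of the fallback results is an assumption (the issue doesn't say how they'd be combined):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Supplier;

public class LicenseDetectionOrder {
    /**
     * If licenses were extracted from an SPDX document, return them as-is and
     * never invoke the fallback mechanisms (pom processing, license file
     * matching, ...). Otherwise run every fallback and merge the results.
     */
    public static Set<String> detect(Optional<Set<String>> fromSpdxDocument,
                                     List<Supplier<Set<String>>> fallbacks) {
        if (fromSpdxDocument.isPresent()) {
            // Short-circuit: the suppliers below are never called.
            return fromSpdxDocument.get();
        }
        // Assumption: fallback results are simply unioned.
        Set<String> result = new LinkedHashSet<>();
        for (Supplier<Set<String>> fallback : fallbacks) {
            result.addAll(fallback.get());
        }
        return result;
    }
}
```

Passing the fallbacks as suppliers is what guarantees they aren't evaluated when an SPDX document is present.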

pmonks commented 8 months ago

Potential implementation notes:

  1. Detect the MIME type of the file, since SPDX documents come in multiple formats (tag/value, JSON, and RDF). This library looks good for this.
  2. Based on the MIME type, use one of the format-specific SPDX "Model Store" libraries to load it:
    1. Tag/value
    2. JSON
    3. RDF
  3. Here's some example code showing how reading is done
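
As a rough illustration of step 1, here's a content-sniffing sketch that doesn't depend on any MIME detection library. Everything in it is hypothetical (the `SpdxFormatSniffer` class, the `Format` enum, the `sniff` method), and the heuristics are assumptions: tag/value documents are recognised by a line starting with `SPDXVersion:`, JSON documents by a leading `{` plus a `"spdxVersion"` key, and RDF documents by a leading `<` plus an `rdf:RDF` element. A real MIME detection library would likely be more robust.

```java
import java.util.regex.Pattern;

public class SpdxFormatSniffer {
    /** The three serialisations mentioned in the notes above. */
    public enum Format { TAG_VALUE, JSON, RDF_XML, UNKNOWN }

    // A tag/value document has an "SPDXVersion:" tag at the start of some line.
    private static final Pattern TAG_VALUE_MARKER =
            Pattern.compile("^SPDXVersion:", Pattern.MULTILINE);

    /** Guess the SPDX serialisation from the document's text (heuristic only). */
    public static Format sniff(String content) {
        String s = content;
        if (s.startsWith("\uFEFF")) {
            s = s.substring(1);          // drop a leading byte order mark
        }
        s = s.strip();                   // requires Java 11+
        if (s.isEmpty()) {
            return Format.UNKNOWN;
        }
        if (s.charAt(0) == '{' && s.contains("\"spdxVersion\"")) {
            return Format.JSON;
        }
        if (s.charAt(0) == '<' && s.contains("rdf:RDF")) {
            return Format.RDF_XML;
        }
        if (TAG_VALUE_MARKER.matcher(s).find()) {
            return Format.TAG_VALUE;
        }
        return Format.UNKNOWN;
    }
}
```

The returned `Format` would then pick which of the format-specific model stores in step 2 to load the file with. From Clojure this could be called via interop, or the same checks could be written directly with `clojure.string`.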

The IModelStore abstraction in Spdx-Java-Library is stateful and pretty non-idiomatic for Clojure: it would be preferable to just return data structures and let the caller decide what to do with them. Ideally there'd be a way to unload files from memory after they've been read and returned.