nexB / scancode-analyzer

scancode-results-analyzer

Create new LicenseMatch class from scancode data #52

Closed AyanSinhaMahapatra closed 3 years ago

AyanSinhaMahapatra commented 3 years ago

Create a new LicenseMatch class such that:

  1. Only attributes that are needed for analysis are kept in the class.
  2. When loading data into this class, keep only unique matches. (For a compound license expression, keep only one match.)
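
The two requirements above could be sketched roughly as follows. This is only an illustration, not the final design: the attribute set, the `load_unique_matches` name, and the input keys (`matched_rule`, `identifier`, `license_expression`, `start_line`, `end_line`, `score`) are assumptions about what a serialized scancode license match looks like, and would need to be checked against the actual scan output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseMatch:
    """Hypothetical minimal match: only the attributes assumed useful for analysis."""

    license_expression: str
    rule_identifier: str
    start_line: int
    end_line: int
    score: float


def load_unique_matches(raw_matches):
    """Build LicenseMatch objects from raw match dicts, dropping duplicates.

    Assumption: for a compound license expression, the scan reports one entry
    per license key, all sharing the same rule and line range, so they produce
    equal LicenseMatch objects and only the first one is kept.
    """
    seen = set()
    unique = []
    for raw in raw_matches:
        rule = raw["matched_rule"]
        match = LicenseMatch(
            license_expression=rule["license_expression"],
            rule_identifier=rule["identifier"],
            start_line=raw["start_line"],
            end_line=raw["end_line"],
            score=raw["score"],
        )
        # Frozen dataclasses are hashable, so a set can track what was seen.
        if match not in seen:
            seen.add(match)
            unique.append(match)
    return unique
```

With this shape, only `load_unique_matches` reads the serialized scancode data, so the rest of the analysis code would work with `LicenseMatch` objects alone.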

This would be beneficial in the following ways:

  1. We don't have to deal with serialized data from scancode anymore.
  2. Even when scancode license data formats change, we only have to change the loading function.
  3. License expression prediction counts only unique matches, so this would make the prediction more accurate.