google / licensecheck

The licensecheck package classifies license files and heuristically determines how well they correspond to known open source licenses.
BSD 3-Clause "New" or "Revised" License

EUPL type doesn't match #66

Open daenney opened 8 months ago

daenney commented 8 months ago

Despite the presence of the license file in https://github.com/google/licensecheck/blob/0279a51dbb8a386077c68c8eb18ff2850dfbc12e/licenses/EUPL-1.2.lre#L1, the license is not matched by licensecheck.

package main

import (
    "fmt"
    "os"

    "github.com/google/licensecheck"
)

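func main() {
    // The file name matches what `curl -LO` saves (the %20 is kept literally).
    data, err := os.ReadFile("EUPL-1.2%20EN.txt")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    cov := licensecheck.Scan(data)
    fmt.Printf("%.1f%% of text covered by licenses:\n", cov.Percent)
    for _, m := range cov.Match {
        fmt.Printf("%s at [%d:%d] IsURL=%v\n", m.Type, m.Start, m.End, m.IsURL)
    }
}
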
$ curl -LO https://joinup.ec.europa.eu/sites/default/files/custom-page/attachment/2020-03/EUPL-1.2%20EN.txt

$ go run main.go
100.0% of text covered by licenses:
Unknown at [0:13827] IsURL=false

daenney commented 8 months ago

Ah, sorry, I needed to print m.ID rather than m.Type to get the license name; with that, the license is recognised.

However, according to the comments in https://github.com/google/licensecheck/blob/0279a51dbb8a386077c68c8eb18ff2850dfbc12e/license.go#L146-L149, its type should be ShareServer, not Unknown.
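
To make the distinction between the two fields concrete, here is a minimal sketch of reporting both the ID and the type for each match. It uses a local stand-in struct rather than the real licensecheck.Match so that it runs without the module; the `match` type and the `describe` helper are hypothetical, and the ID value "EUPL-1.2" is an assumption based on the .lre file name.

```go
package main

import "fmt"

// match is a hypothetical stand-in for the fields of licensecheck.Match
// used in this report. It is not the library's own type.
type match struct {
	ID         string // license name (assumed here to be "EUPL-1.2")
	Type       string // license category as printed with %s, e.g. "Unknown"
	Start, End int    // byte offsets of the match in the scanned text
}

// describe formats one match with both its ID and its type, and flags
// matches that have a name but whose type is still "Unknown".
func describe(m match) string {
	s := fmt.Sprintf("%s (type %s) at [%d:%d]", m.ID, m.Type, m.Start, m.End)
	if m.ID != "" && m.Type == "Unknown" {
		s += " <- identified, but type not set"
	}
	return s
}

func main() {
	// Type and offsets are taken from the output reported above;
	// the ID is an assumption.
	fmt.Println(describe(match{ID: "EUPL-1.2", Type: "Unknown", Start: 0, End: 13827}))
}
```

With the real package, the equivalent change to the reproduction would be printing m.ID alongside m.Type inside the loop over cov.Match.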