beetbox / audioread

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
MIT License
483 stars, 108 forks

`audioread` reports `UNKNOWN` license instead of `MIT LICENSE` #137

Closed: danieljanes closed this issue 9 months ago

danieljanes commented 11 months ago

I'm one of the authors of Flower Datasets, an open-source library that allows users to partition datasets for federated learning. Flower Datasets builds on top of Hugging Face Datasets, and it installs audioread as a transitive dependency. We use licensecheck to ensure that only dependencies with certain types of licenses can be added to our repo.

When using licensecheck, audioread does not report the license (MIT), and thus fails the license check:


                                      List Of Packages
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Compatible ┃ Package           ┃ License(s)                                              ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ ✔          │ Pillow            │ HISTORICAL PERMISSION NOTICE AND DISCLAIMER (HPND)      │
│ ✔          │ PyYAML            │ MIT LICENSE                                             │
│ ✔          │ aiohttp           │ APACHE SOFTWARE LICENSE                                 │
│ ✖          │ audioread         │ UNKNOWN                                                 │
│ ✔          │ cffi              │ MIT LICENSE                                             │
│ ✔          │ datasets          │ APACHE SOFTWARE LICENSE                                 │
│ ✔          │ decorator         │ BSD LICENSE                                             │
│ ✔          │ dill              │ BSD LICENSE                                             │
│ ✔          │ fsspec            │ BSD LICENSE                                             │
│ ✔          │ huggingface-hub   │ APACHE SOFTWARE LICENSE                                 │
│ ✔          │ joblib            │ BSD LICENSE                                             │
│ ✔          │ lazy_loader       │ BSD LICENSE                                             │
│ ✔          │ librosa           │ ISC LICENSE (ISCL)                                      │
│ ✔          │ msgpack           │ APACHE SOFTWARE LICENSE                                 │
│ ✔          │ multiprocess      │ BSD LICENSE                                             │
│ ✔          │ numba             │ BSD LICENSE                                             │
│ ✔          │ numpy             │ BSD LICENSE                                             │
│ ✔          │ packaging         │ APACHE SOFTWARE LICENSE;; BSD LICENSE                   │
│ ✔          │ pandas            │ BSD LICENSE                                             │
│ ✔          │ pooch             │ BSD LICENSE                                             │
│ ✔          │ pyarrow           │ APACHE SOFTWARE LICENSE                                 │
│ ✔          │ requests          │ APACHE SOFTWARE LICENSE                                 │
│ ✔          │ scikit-learn      │ BSD LICENSE                                             │
│ ✔          │ scipy             │ BSD LICENSE                                             │
│ ✔          │ soundfile         │ BSD LICENSE                                             │
│ ✔          │ soxr              │ GNU LESSER GENERAL PUBLIC LICENSE V2 OR LATER (LGPLV2+) │
│ ✔          │ tqdm              │ MIT LICENSE;; MOZILLA PUBLIC LICENSE 2.0 (MPL 2.0)      │
│ ✔          │ typing_extensions │ PYTHON SOFTWARE FOUNDATION LICENSE                      │
│ ✔          │ xxhash            │ BSD LICENSE                                             │
└────────────┴───────────────────┴─────────────────────────────────────────────────────────┘

sampsyo commented 11 months ago

Interesting! Do you know where licensecheck gets its data from so we can report this in the way it wants? (We have a LICENSE file in the repository root, which is kinda the "de facto" standard that the GitHub UI respects, for example.)

danieljanes commented 11 months ago

Without looking into the licensecheck code, I would guess that the license info gets pulled from the whl (in .dist-info). Does your build process/tooling include the LICENSE in the whl?
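To make that guess concrete, here is a minimal sketch of how the license-related fields of an installed package can be read from its .dist-info metadata using Python's standard importlib.metadata. Whether licensecheck reads exactly these fields is an assumption, not something confirmed in this thread; the helper names license_info and installed_license_info are hypothetical.

```python
from email.message import Message
from importlib import metadata


def license_info(meta: Message) -> dict:
    """Pick the license-related fields out of a package's core metadata."""
    classifiers = meta.get_all("Classifier") or []
    return {
        # Free-text "License" header, if the package sets one.
        "license": meta.get("License"),
        # Newer SPDX-style header; absent in most older packages.
        "license_expression": meta.get("License-Expression"),
        # Trove classifiers such as "License :: OSI Approved :: MIT License".
        "classifiers": [c for c in classifiers if c.startswith("License ::")],
    }


def installed_license_info(dist_name: str):
    """Return license fields for an installed distribution, or None if absent."""
    try:
        return license_info(metadata.metadata(dist_name))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None when audioread is not installed in this environment.
    print(installed_license_info("audioread"))
```

If all three fields come back empty for a package, a metadata-based checker would have nothing to go on, which would be consistent with an UNKNOWN result.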

sampsyo commented 11 months ago

I unfortunately don't know the answer to that either… maybe it would be a matter of including the file via the license key in Flit's package configuration? I would be glad to merge such a PR, if it indeed resolves the issue!
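If someone does try a packaging change along those lines, one way to check the result would be to open the built wheel and look at what license information it actually ships. The sketch below uses only the standard library; the function name wheel_license_report and the list of license file names are illustrative assumptions, and it is not confirmed here that licensecheck or any other checker looks at these exact fields.

```python
import email
import zipfile

# File-name prefixes commonly used for license files (an assumption).
LICENSE_BASENAMES = ("LICENSE", "LICENCE", "COPYING", "NOTICE")


def wheel_license_report(wheel_path):
    """Summarize the license information shipped inside a wheel file."""
    with zipfile.ZipFile(wheel_path) as whl:
        names = whl.namelist()
        # Every wheel carries a <name>-<version>.dist-info/METADATA file.
        metadata_name = next(
            (n for n in names if n.endswith(".dist-info/METADATA")), None
        )
        raw = whl.read(metadata_name).decode("utf-8") if metadata_name else ""

    # METADATA uses RFC 822-style headers, so the email parser can read it.
    meta = email.message_from_string(raw)
    classifiers = meta.get_all("Classifier") or []

    return {
        "license": meta.get("License"),
        "classifiers": [c for c in classifiers if c.startswith("License ::")],
        # License files bundled anywhere under the .dist-info directory.
        "license_files": [
            n
            for n in names
            if ".dist-info/" in n
            and n.rsplit("/", 1)[-1].upper().startswith(LICENSE_BASENAMES)
        ],
    }
```

Running this on a wheel built before and after the change would show whether the License header, a license classifier, or a bundled LICENSE file appears.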

valtheval commented 2 months ago

Hi there, for information, this issue is still valid. The liccheck package, when applied to a library that has audioread as a dependency, returns this:

gathering licenses...
...
1 package.
    audioread (3.0.1): UNKNOWN