aboutcode-org / python-inspector

Inspect Python code and PyPI package manifests. Resolve Python dependencies.
22 stars 19 forks source link

Package list contains duplicate entries #71

Closed mnonnenmacher closed 2 years ago

mnonnenmacher commented 2 years ago

The package list in the output of python-inspector 0.7.0 contains duplicate entries which are identical except for the download url (one for the .whl, one for .tar.gz). To reproduce run python-inspector -p 27 -s setup.py --json - on this file. To reduce the file size and simplify parsing it would be beneficial if the package entry would contain a list of download urls with associated hashes instead. Maybe this can be addressed as part of #68?

pombredanne commented 2 years ago

@mnonnenmacher the approach here and in the ScanCode models is that one different package instance is returned for each different download URL. The rationale is that there are cases where the different download (here source and compiled binaries) can contain different set of bundled code and dependencies.

mnonnenmacher commented 2 years ago

Does this mean that any other values except for the download URL and the related hashes could be different between to such package instances? Because if not it just seems like unnecessary duplication. Anyway, feel free to close this as the ORT integration of python-inspector can handle the current output.