aboutcode-org / scancode-toolkit

ScanCode detects licenses, copyrights, dependencies by "scanning code" ... to discover and inventory open source and third-party packages used in your code. Sponsored by NLnet project https://nlnet.nl/project/vulnerabilitydatabase, the Google Summer of Code, Azure credits, nexB and other generous sponsors!
https://aboutcode.org/scancode/

Parse build.gradle special dependency #3015

Closed xu1119 closed 2 years ago

xu1119 commented 2 years ago

Description

When analysing the dependencies of the pygradle project, the results and purls extracted from build.gradle are wrong, for example pkg:maven/pypi/numpy@1.11.2. The file and part of the result follow:

plugins {
  id "com.linkedin.python-cli" version "0.4.9"
}

version=1.0

// we need to define an explicit installation sequence for the dependencies,
// see issue 75 here: https://github.com/linkedin/pygradle/issues/75
project.tasks.findByName('installPythonRequirements').sorted = false

dependencies {
    python 'pypi:numpy:1.11.2'
    python 'pypi:pandas:0.19.1'
    python 'pypi:scipy:0.18.1'
    python 'pypi:scikit-learn:0.18'
    // pandas depends on pytz>=2011k, which cannot be satisfied automatically
    // since 2011k does not conform to correct versioning (see
    // http://stackoverflow.com/questions/18230956/could-not-find-a-version-that-satisfies-the-requirement-pytz)
    // therefore, explicitly include dependencies here:
    python 'pypi:pytz:2016.4'
    python 'pypi:python-dateutil:2.6.0'
}

repositories {
    pyGradlePyPi()

    // as LinkedIn only provides an initial set of pypi libraries along with Ivy-Metadata,
    // we will use a local repository which is populated with libraries and metadata using
    // the pivy-importer.
    ivy{
        url "/tmp/repo"
        layout 'pattern' , {
            artifact '[organisation]/[module]/[revision]/[artifact]-[revision](-[classifier]).[ext]'
            ivy '[organisation]/[module]/[revision]/[module]-[revision].ivy'
        }
    }
}

"dependencies": [
    {
      "purl": "pkg:maven/com.linkedin.pygradle/pygradle-plugin@0.7.1-SNAPSHOT",
      "extracted_requirement": "0.7.1-SNAPSHOT",
      "scope": "classpath",
      "is_runtime": true,
      "is_optional": false,
      "is_resolved": true,
      "resolved_package": {},
      "dependency_uid": "pkg:maven/com.linkedin.pygradle/pygradle-plugin@0.7.1-SNAPSHOT?uuid=ead751e7-f580-4452-96ab-6b48dd76326f",
      "for_package_uid": null,
      "datafile_path": "pygradle/examples/example-project/build.gradle",
      "datasource_id": "build_gradle"
    },
    {
      "purl": "pkg:maven/pypi/numpy@1.11.2",
      "extracted_requirement": "1.11.2",
      "scope": "python",
      "is_runtime": true,
      "is_optional": false,
      "is_resolved": true,
      "resolved_package": {},
      "dependency_uid": "pkg:maven/pypi/numpy@1.11.2?uuid=a042025e-b602-44f9-9d55-7e83b0cad9fe",
      "for_package_uid": null,
      "datafile_path": "pygradle/examples/iris-classification/build.gradle",
      "datasource_id": "build_gradle"
    },
    {
      "purl": "pkg:maven/pypi/pandas@0.19.1",
      "extracted_requirement": "0.19.1",
      "scope": "python",
      "is_runtime": true,
      "is_optional": false,
      "is_resolved": true,
      "resolved_package": {},
      "dependency_uid": "pkg:maven/pypi/pandas@0.19.1?uuid=8d523f75-48f6-4b8a-b3a2-d43caf2228fc",
      "for_package_uid": null,
      "datafile_path": "pygradle/examples/iris-classification/build.gradle",
      "datasource_id": "build_gradle"
    },

How To Reproduce

scancode -p --json-pp - pygradle/
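To spot the problem in a saved scan, the following is a minimal sketch that lists build.gradle dependencies reported with a Maven purl type and a "pypi" namespace. It assumes the scan JSON has a top-level "dependencies" list with the fields shown in the excerpt above; the function name and the trimmed SAMPLE data are illustrative only and are not part of ScanCode.

```python
import json

# Trimmed sample shaped like the excerpt in this report (illustrative only).
SAMPLE = """
{
  "dependencies": [
    {
      "purl": "pkg:maven/com.linkedin.pygradle/pygradle-plugin@0.7.1-SNAPSHOT",
      "scope": "classpath",
      "datafile_path": "pygradle/examples/example-project/build.gradle",
      "datasource_id": "build_gradle"
    },
    {
      "purl": "pkg:maven/pypi/numpy@1.11.2",
      "scope": "python",
      "datafile_path": "pygradle/examples/iris-classification/build.gradle",
      "datasource_id": "build_gradle"
    }
  ]
}
"""


def maven_typed_pypi_dependencies(scan):
    """Yield (datafile_path, purl) for Gradle dependencies whose purl
    has the maven type but a 'pypi' namespace."""
    # Assumption: dependencies are listed under a top-level "dependencies" key.
    for dep in scan.get("dependencies", []):
        purl = dep.get("purl") or ""
        if dep.get("datasource_id") == "build_gradle" and purl.startswith("pkg:maven/pypi/"):
            yield dep.get("datafile_path"), purl


for path, purl in maven_typed_pypi_dependencies(json.loads(SAMPLE)):
    print(path, purl)
```

For a real scan, save the output to a file by passing a filename instead of `-` to `--json-pp`, then load that file with `json.load` in place of SAMPLE.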

System configuration

pombredanne commented 2 years ago

@xu1119 Thanks for the report! This pygradle plugin defines its own dependency semantics. It should not be hard to recognize a "python" scope and a "pypi" namespace in what we extract from python 'pypi:numpy:1.11.2' and build a proper purl from them.
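As an illustration of that idea, here is a minimal sketch, assuming the coordinate is always a three-part `namespace:name:version` string and that the Gradle configuration name is passed in as the scope. The function `gradle_dependency_purl` is hypothetical and is not ScanCode's actual Gradle handler; the purl is assembled by hand, without percent-encoding, so it only suits simple names and versions like those in this report.

```python
def gradle_dependency_purl(scope, coordinate):
    """Return a purl string for a 'namespace:name:version' Gradle coordinate.

    Hypothetical helper for illustration; not part of ScanCode.
    """
    parts = coordinate.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 'namespace:name:version', got {coordinate!r}")
    namespace, name, version = parts

    # Assumption: a "python" scope with a "pypi" namespace marks a pygradle
    # dependency that should be reported as a PyPI package.
    if scope == "python" and namespace == "pypi":
        # PyPI purl names are lowercased, with underscores replaced by dashes.
        name = name.lower().replace("_", "-")
        return f"pkg:pypi/{name}@{version}"

    # Anything else keeps the Maven-style mapping seen in the current output.
    return f"pkg:maven/{namespace}/{name}@{version}"


print(gradle_dependency_purl("python", "pypi:numpy:1.11.2"))
print(gradle_dependency_purl(
    "classpath", "com.linkedin.pygradle:pygradle-plugin:0.7.1-SNAPSHOT"))
```

With this mapping, the numpy entry from the report would come out as pkg:pypi/numpy@1.11.2, while ordinary Gradle dependencies are unchanged.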

That said, do you know how prevalent and common pygradle is? FWIW, I asked the authors at https://github.com/linkedin/pygradle/issues/357

pombredanne commented 2 years ago

I am copying here the feedback from the upstream pygradle author in https://github.com/linkedin/pygradle/issues/357

pombredanne commented 3 days ago

I have a request to better scan the dependency style of pygradle in nexB/scancode-toolkit#3015 and I was wondering if this project is still actively maintained as the last commit was about two years ago. Thanks!

@warsaw commented 3 days ago

No, this project really isn't still actively maintained.

Also there are only a handful of pygradle projects I can find in the wild: https://github.com/search?q="dependencies+{python+'pypi"&type=code and https://github.com/search?q="pyGradlePyPi"&type=Code

So based on this, I wonder if doing anything special is worth it? How often do you see pygradle used in the wild?

In any case, since this would be a tiny code change, I could be talked into accepting a small, focused patch with tests to handle these (rarer) cases.

xu1119 commented 2 years ago

Thanks for doing this. I found this just when I scanned some Java Gradle projects found by searching GitHub sorted by stars. Based on the search results you found and the feedback from the pygradle author, I think it may not be worth changing.

pombredanne commented 2 years ago

@xu1119 you wrote:

I found this just when I scanned some Java Gradle projects found by searching GitHub sorted by stars.

I would be interested if you could elaborate a bit on this, and also in any other issues you find while parsing Gradle build files.

xu1119 commented 2 years ago

I found some open source SCA tools and compared the results generated by these tools. To do so, I searched for different projects on GitHub. ScanCode analyzes the other projects normally; only pygradle gives unusual results.

pombredanne commented 2 years ago

@xu1119 you wrote:

I found some open source SCA tools and compared the results generated by these tools.

If you can share more results, that would be awesome, as it will help us improve things and fix bugs here!