sassoftware / pymaven

pymaven is a library for working with maven repositories via python. pymaven is not intended as a complete replacement of the maven build system, but instead as a way for python programs to fetch artifacts and artifact dependencies from maven2 repositories.
Apache License 2.0
14 stars 11 forks source link

Clientless pom #8

Closed wfscheper closed 7 years ago

wfscheper commented 7 years ago

@pombredanne

Look over the API changes and let me know if you think this works for your use case. I'm still debating whether I really need to require a coordinate string, so I'd like to hear from you on that.

pombredanne commented 7 years ago

@wfscheper Let me review this in details tomorrow.

pombredanne commented 7 years ago

@wfscheper you wrote:

I'm still debating whether I really need to require a coordinate string, so I'd like to hear from you on that.

So there are eventually two use cases:

  1. I provide coordinates and some mechanism to fetch and resolve data (e..g. a client)
  2. I provide a POM file

For case 2, there is no such thing as coordinates known ahead of time: these are in the POM and unknown until after parsing

So what could possibly work is this: Update the constructor in https://github.com/wfscheper/pymaven/blob/01d41e6011e3f5281380230cb7bdbe17bab2c88e/pymaven/pom.py#L62 to: def __init__(self, coordinate=None, client=None, pom_data=None):

Then something more or less like this:

In all cases I never know coordinates ahead of time so in any subclass I would need to replicate most of the __init__ method.

pombredanne commented 7 years ago

I have been thinking of another possible approach.... So what about this?

  1. Pom is created with `init(self, coordinates, client=None)
  2. coordinates is string that is either a POM coordinates or a path to a local file. eg. something liek coordinates_or_path 2.1 if coordinates and not client, and we have POM coordinates, then init the class with just the data from the coordinates.
    2.2 if coordinates and not client, and we have a path in coordinates, then init the class with loading and parsing the file at the path stored in coordinates 2.3 if coordinates and client, and we have coordinates in coordinates, then init the class with letting the client fetch the POM file for coordinates 2.4 if coordinates and client, and we have a path in coordinates, then init the class with loading and parsing the file at the path stored in coordinates

Now, I am less clear about the case of having both a path and a client....

Also, after all a client is something that can retrieve Artifacts: there could be a dumb client that knows nothing about repositories and deals only with a path as an input and can only fetch that one path (and if provided with coordinates instead, it would just initialize a Pom/Artifact with the coordinates and fetch nothing) ... This could be another way. All food for thoughts

wfscheper commented 7 years ago

The more we discuss the Pom.__init__ signature, the more I think that the current implementation is an artifact of my original use case (which was bootstrapping the full dependency tree from just a coordinate string). That's no longer the primary use case.

I think Pom.__init__(pom_data, client=None) makes more sense for a general purpose Pom object, and we provide three classmethods that are Pom factories: Pom.parse(source, client=None), Pom.fromstring(text, client=None), and Pom.fromcoordinate(coordinate, client). pom_data should be an xml string, and the Pom.__init__ method will handle parsing it into an ElementTree.

Sound reasonable to you?

pombredanne commented 7 years ago

Sound reasonable to you? @wfscheper It sure does, but I am still not sure if makes sense to recreate any XML string though. It feelds weird when the input is coordinates and would be imposed if the constructor of Pom is Pom.__init__(pom_data, client=None)

wfscheper commented 7 years ago

Not completely sure what you mean by "recreate any XML string." Are you referring to where the PR was using the EMPTY_POM template? That would go away with the new signature.

On Sun, Jun 11, 2017 at 10:27 AM Philippe Ombredanne < notifications@github.com> wrote:

Sound reasonable to you? @wfscheper https://github.com/wfscheper It sure does, but I am still not sure if makes sense to recreate any XML string though. It feelds weird when the input is coordinates and would be imposed if the constructor of Pom is Pom.init(pom_data, client=None)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/sassoftware/pymaven/pull/8#issuecomment-307632393, or mute the thread https://github.com/notifications/unsubscribe-auth/AAD5s-q3n9cwb3sacMf4ka-35XqoS1baks5sC_lbgaJpZM4NzN06 .

-- Sent from my phone

pombredanne commented 7 years ago

Are you referring to where the PR was using the EMPTY_POM template? That would go away with the new signature.

perfect!

pombredanne commented 7 years ago

@wfscheper to give you a better feel of where I want to go (and what I would like to contribute back eventually for #3 #9 #10 #11 and #12 ) ... See these:

Note: this is based of master and not this branch, but merging will not be of issue and I will rebase as needed when times comes.

Note that I have over 500+ tests with several POMs with oddities that were collected mostly carefully.

pombredanne commented 7 years ago

@wfscheper any feedback on the subclass I mentioned above? I can start creating a PR, but I would rather base this on top of your refactoring than master I guess.

wfscheper commented 7 years ago

Just catching up. There's a lot here that I like, especially the tests and odd corner case poms.

I'm going to merge my pom refactor into a new branch in the main repo and we can start pulling in your changes there. I'm still unsatisfied with the Pom API, and I expect we'll want to change the API some more as we work through your enhancements. On Wed, Jun 21, 2017 at 6:12 PM Philippe Ombredanne < notifications@github.com> wrote:

@wfscheper https://github.com/wfscheper any feedback on the subclass I mentioned above? I can start creating a PR, but I would rather base this on top of your refactoring than master I guess.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/sassoftware/pymaven/pull/8#issuecomment-310220168, or mute the thread https://github.com/notifications/unsubscribe-auth/AAD5s66UARQ1jlCDcxwYHWXVHKp9qnniks5sGZVUgaJpZM4NzN06 .

-- Sent from my phone

pombredanne commented 7 years ago

@wfscheper this sounds like a plan! :+1:

pombredanne commented 7 years ago

@wfscheper FWIW my patches have been baked in the latest release of scancode https://github.com/nexB/scancode-toolkit/releases/ .... and I would like to know if this is OK to have your name in the credits :) as this is a major improvement on our side for the Maven thingies

pombredanne commented 7 years ago

and FYI to give some context of what we are up to:

  1. here we are collecting package info: https://github.com/nexB/scancode-toolkit/tree/develop/src/packagedcode for Maven and many (eventually) most package format. This includes direct deps and any other deeper or pinned dep (when available in some lockfile of sorts)
  2. https://github.com/nexB/dependentcode/ will eventually use this data as an input to perform a full dep resolution (using a sat solver)

So the goals are 1. detect all the packages and all deps in a codebase. 2. report all the licensing and origin info. 3. report all known vulnerabilities for all detected or resolved packages (as part of https://github.com/nexB/vulnerablecode which is built by @kartiksibal as part of the GSoC https://summerofcode.withgoogle.com/organizations/4800434830049280/#4673431489478656 ) 4. report other useful info on this

wfscheper commented 7 years ago

Closing. We will continue work on this in the clientless-pom branch of the main repo, and merge that in once its fully baked.