eclipse-archived / ceylon-herd

The Ceylon repository web application
Apache License 2.0

Interoperability with Ivy and Maven dependency resolvers #262

Open ckulenkampff opened 8 years ago

ckulenkampff commented 8 years ago

This enhancement would make it possible to create a flat classpath of Ceylon CARs for Java projects using Maven, Ivy or Gradle. This is possible by offering appropriate repository "facades" through the Herd repository server.

Maven repository structure

Maven expects the following layout (see Maven Repository Layout - Final) for primary artifacts:

/$groupId[0]/../$groupId[n]/$artifactId/$version/$artifactId-$version.$extension

and for secondary artifacts:

/$groupId[0]/../$groupId[n]/$artifactId/$version/$artifactId-$version-$classifier.$extension
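As a rough illustration of the layout above, a facade would need to turn Maven coordinates into a repository-relative path. The following is a minimal sketch; `maven_path` is a hypothetical helper, and the version number in the usage lines is only an example.

```python
def maven_path(group_id, artifact_id, version, extension="jar", classifier=None):
    """Build the repository-relative path of a Maven artifact.

    Hypothetical helper: the group id is split on dots into directories,
    followed by artifact id, version and the file name.
    """
    group_dirs = group_id.split(".")
    file_name = f"{artifact_id}-{version}"
    if classifier:
        # Secondary artifacts carry a classifier before the extension.
        file_name += f"-{classifier}"
    file_name += f".{extension}"
    return "/" + "/".join(group_dirs + [artifact_id, version, file_name])


# Example coordinates; the version is illustrative only.
print(maven_path("com.fasterxml.jackson.core", "jackson-core", "2.7.0"))
print(maven_path("com.fasterxml.jackson.core", "jackson-core", "2.7.0",
                 classifier="sources"))
```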

Ivy repository structure

Ivy is more flexible and allows custom patterns to be specified for artifact resolution (see Ivy Documentation - Main Concepts). The default pattern that is used by Gradle is the following (see Gradle DSL Reference - IvyArtifactRepository):

Artifacts: $baseUri/[organisation]/[module]/[revision]/[type]s/[artifact](.[ext])

Ivy module descriptors: $baseUri/[organisation]/[module]/[revision]/[type]s/[artifact](.[ext])
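To make the pattern syntax concrete, here is a small sketch of how such a pattern could be expanded: `[token]` placeholders are substituted, and a parenthesised part is dropped when one of its tokens has no value. `expand_ivy_pattern` and the sample token values are assumptions for illustration, not Ivy's own implementation.

```python
import re

TOKEN = re.compile(r"\[(\w+)\]")


def expand_ivy_pattern(pattern, tokens):
    """Expand an Ivy-style pattern (illustrative sketch, not Ivy's code)."""

    def fill(text):
        # Replace every [name] with its value; a missing required token
        # raises KeyError.
        return TOKEN.sub(lambda m: tokens[m.group(1)], text)

    def optional(match):
        inner = match.group(1)
        names = TOKEN.findall(inner)
        # Keep the optional part only if all of its tokens have a value.
        return fill(inner) if all(tokens.get(n) for n in names) else ""

    pattern = re.sub(r"\(([^()]*)\)", optional, pattern)
    return fill(pattern)


PATTERN = "[organisation]/[module]/[revision]/[type]s/[artifact](.[ext])"
tokens = {"organisation": "org.example", "module": "demo", "revision": "1.0",
          "type": "jar", "artifact": "demo", "ext": "jar"}
print(expand_ivy_pattern(PATTERN, tokens))
```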

Meta information

To resolve transitive dependencies both repository types require meta information. Maven uses pom.xmls. Ivy uses ivy.xmls, but can also process pom.xmls. Those files must be accessible via HTTP requests.

Meta information augmentation

When the repository server responds to a "foreign" meta data request for a Ceylon module, it should automatically add all implicit dependencies of the Ceylon language to the response. For interoperability these Ceylon language modules should be published to the Herd repository so that Java projects that depend on a Ceylon library do not have to provide them themselves.
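A sketch of what such an augmentation could look like when a pom is generated on request: the declared dependencies are listed first and the implicit ones are appended if they are not already present. `augmented_pom` is a hypothetical function, the coordinates in the usage lines are placeholders, and the XML namespace a real pom carries is left out for brevity.

```python
import xml.etree.ElementTree as ET


def augmented_pom(group_id, artifact_id, version, dependencies, implicit):
    """Render a minimal pom and append implicit dependencies (sketch).

    Each dependency is a (groupId, artifactId, version) tuple.
    """
    project = ET.Element("project")
    ET.SubElement(project, "modelVersion").text = "4.0.0"
    for tag, value in (("groupId", group_id),
                       ("artifactId", artifact_id),
                       ("version", version)):
        ET.SubElement(project, tag).text = value

    deps = ET.SubElement(project, "dependencies")
    seen = set()
    for dep in list(dependencies) + list(implicit):
        if dep in seen:
            continue  # do not list an implicit dependency twice
        seen.add(dep)
        node = ET.SubElement(deps, "dependency")
        for tag, value in zip(("groupId", "artifactId", "version"), dep):
            ET.SubElement(node, tag).text = value
    return ET.tostring(project, encoding="unicode")


# Placeholder coordinates, for illustration only.
implicit = [("example.group", "ceylon.language", "1.2.2")]
pom = augmented_pom("example.group", "my.module", "1.0",
                    [("example.group", "other.module", "2.0")], implicit)
print(pom)
```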

Artifact aliases

For interoperability it would be very useful if CAR files were also available under the same name but with a JAR file extension when accessed through a facade.

Many IDEs automatically link source and javadoc JARs to the downloaded artifacts by searching in the module cache for files like $artifactId-$version-$classifier-sources.$extension (IDE dependent see NetBeans DependencyNode). Ceylon source artifacts should be made available in a way that this resolution works out of the box. This means that the artifacts are made available under another name than they are normally accessible in Herd.
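The aliasing described above could be a simple name mapping inside the facade. The sketch below assumes that Herd stores the binary as `<name>-<version>.car` and the source archive as `<name>-<version>.src`; both the helper name and that storage assumption are mine and would need checking against the actual Herd layout.

```python
def herd_file_for_alias(requested_name):
    """Map a file name requested through a facade to the assumed Herd name.

    Assumption: binaries are stored as *.car and source archives as *.src.
    """
    sources_suffix = "-sources.jar"
    if requested_name.endswith(sources_suffix):
        # IDE-style source artifact name -> Ceylon source archive
        return requested_name[: -len(sources_suffix)] + ".src"
    if requested_name.endswith(".jar"):
        # JAR alias -> the CAR that is actually stored
        return requested_name[: -len(".jar")] + ".car"
    # Anything else (for example a pom) is passed through unchanged.
    return requested_name


print(herd_file_for_alias("ceylon.collection-1.2.2.jar"))
print(herd_file_for_alias("ceylon.collection-1.2.2-sources.jar"))
```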

gavinking commented 8 years ago

Don't tell people they're wrong

They're wrong. There. I told them.

Seriously. This path you want to go down is starting to scare me, and turn me off the whole idea. If you can come up with a solution that doesn't involve doing nasty horrible polluting things and inventing some over-flexible system designed to accommodate every possible whim that somebody might someday have, then I'm much more open to the idea.

gavinking commented 8 years ago

Remember: group id and artifact id are two totally arbitrary strings with no interesting semantics. The group id could be "foo" for all Ceylon modules and everything would still work.

quintesse commented 8 years ago

with no interesting semantics

That's not true, for Maven and Ivy etc this is what defines your ownership. You can't just go assigning arbitrary strings.

renatoathaydes commented 8 years ago

@gavinking correct in the Ceylon world, but I can't publish to Maven central under any other groups than the one(s) I own... so I need to declare my groupId... I have to say I agree with Gavin, go for the simplest possible solution that may work.. which seems to be that, when in the Maven world, groupId is anything before the last . in the module name? so no magic in the Ceylon side.... if people want magic, I could add some magic to the Gradle plugin so they can publish poms to Maven repos with any groupId:artifactId they want... and the Maven plugin could do the same.
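The "anything before the last ." rule could be sketched like this. `split_module_name` is a hypothetical helper, and what to do with a module name that contains no dot is an assumption (here the name is used for both parts).

```python
def split_module_name(module_name):
    """Derive (groupId, artifactId) by splitting at the last dot (sketch)."""
    group_id, dot, artifact_id = module_name.rpartition(".")
    if not dot:
        # Assumption: a name without a dot serves as both group and artifact.
        return module_name, module_name
    return group_id, artifact_id


print(split_module_name("ceylon.interop.java"))
print(split_module_name("standalone"))
```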

quintesse commented 8 years ago

@renatoathaydes that might cause problems when what you published in Maven has a different name from what you published on the Herd. E.g. for the SDK modules on the Herd we won't know that they're called org.ceylon.lang:xxx in Maven. So they become different modules when obtained via Maven or via the Herd. That doesn't seem like a good idea.

Edit: "obtained via Maven or via the Herd" -> "obtained via Maven Central or via the Herd Maven interop"

renatoathaydes commented 8 years ago

@quintesse Perhaps you should consider using different rules for Ceylon's SDK modules and people's modules... but if people decide to publish things with different names in different repos, they will have problems with any build system in the world.

quintesse commented 8 years ago

but if people decide to publish things with different names in different repos

@renatoathaydes exactly the reason why putting this information in the module descriptor would be a good idea ;)

In this case it's not them using different names, it's us. If they put their modules in Maven using their officially registered group name and also publish to Herd where we will possibly auto-generate a different group name according to our own rules then no matter what they do they're up shit creek.

renatoathaydes commented 8 years ago

Tricky... seems like the only way to avoid trouble is to add a group annotation to the module descriptor... and if the module has no such annotation, it won't be considered for look up in Maven repos?

renatoathaydes commented 8 years ago

instead of a group annotation, maybe the current solution to declare Maven dependencies with a String as in "group:name" "1.0" could be considered as well... for consistency's sake.

ckulenkampff commented 8 years ago

Yeah, a decision will severely affect Java-Ceylon-interaction in the long run.

For what purpose is the pom.xml in the CAR generated by Ceylon in the first place?

Maybe the best way to deal with it, is to not generate the pom.xml automatically and to ditch the Maven interoperability in Herd that I proposed here.

All projects that want to be available in Maven would have to add an additional publishing process to their build that generates a pom file and uploads their CAR as JAR to Maven. To make this work you would have to publish Ceylon core stuff also to Maven.

edit: Ah forget it. Then Ceylon projects that use Herd as repository would suffer, because when they use a Java library from Maven that uses a Ceylon library from Maven (which is also used directly by the project with Herd coordinate) CMR would not get that.

ckulenkampff commented 8 years ago

Just for the record, the artifact id is as important as the group id. Neither can be safely deduced from the Ceylon module name. They are simply two different things. The Ivy Maven interop works so well because Ivy has organization and module name as coordinates, which is similar to Maven's coordinates (but also not the same). See http://ant.apache.org/ivy/history/latest-milestone/ivyfile.html / http://ant.apache.org/ivy/history/latest-milestone/ivyfile/info.html

This got complicated very quickly :/ Should we add another issue that deals with Maven / Ceylon artifact coordinate interoperability in another project issue tracker?

gavinking commented 8 years ago

I think what everyone is forgetting here is that the person writing and compiling the module is not the person who wants to depend on the module using mvn. Therefore a system which depends on the module author (optionally) annotating the module at compile time is unlikely to work well in practice.

bjansen commented 8 years ago

Perhaps this has already been discussed in a previous comment, but we're talking about serving a Maven repo from Herd, so why not configure the group id/artifact id in Herd directly? This can be done during the project claim, or during the project upload, with default values extracted from the module name.

pros:

cons:

quintesse commented 8 years ago

Therefore a system which depends on the module author (optionally) annotating the module at compile time is unlikely to work well in practice.

I don't know why not. I see two situations:

We should definitely not support some weird situation where a user of the module decides to publish it on Maven Central!

quintesse commented 8 years ago

@bjansen hmmm not a bad idea at all!

Although honestly I don't much like the fact that 2 (or N, for each time the author decides to change the coordinates in the Herd) different versions of all CARs would exist. That would make checking CRCs a lot harder ("I have 2 sha1s here, one of them should probably check out")

bjansen commented 8 years ago

Do we really need two versions? We can rename the CAR to JAR and not modify the pom.xml inside; Maven will not use it anyway (it will use the sibling pom). And I suppose the Maven layout will reside in a completely different folder, right?

bjansen commented 8 years ago

Of course this does not solve the problem of uploading things to Maven central, because Maven coordinates won't be known locally. We could add a "publish to maven central" button in Herd, though.

thradec commented 8 years ago

I think that having 2 different poms for one CAR is pretty messy.

bjansen commented 8 years ago

Can someone please answer this?

For what purpose is the pom.xml in the CAR generated by Ceylon in the first place?

quintesse commented 8 years ago

Maven will not use it anyway

If that's true then okay. It might be weird to see that Ceylon's metamodel would return different names than the one you asked for but we could probably live with that.

quintesse commented 8 years ago

Can someone please answer this?

As far as I know because that's what many (all?) Maven artifacts do too, so I guess it's used somewhere. But I think @FroMage will know better

gavinking commented 8 years ago

we're talking about serving a Maven repo from Herd, so why not configure the group id/artifact id in Herd directly?

Right, that's exactly what I was getting at in my last comment. Stef has already decided to generate a pom using Herd itself (rather than just extracting it from the car). So it doesn't seem unreasonable to have Herd handle this stuff as well.

FroMage commented 8 years ago

That's possible, but it does not make it easier for people to upload to Maven Central without Herd in the equation.

gavinking commented 8 years ago

It's not really clear to me how and when and why the "upload to Maven Central" requirement snuck into this particular issue. Originally we were talking about letting Herd act as a mvn repo.

ckulenkampff commented 8 years ago

Maybe it helps to list use cases and possible problems:

1) As a Java developer, I want to develop a Java Gradle project and want to use dependencies only available on Herd.

Currently, I can add the Herd repository as Maven repository in Gradle. This works fine for now.

2) As a Java developer, I want to develop a Java Gradle project that uses dependencies only available on Herd and some Maven Central dependencies.

Currently, I can add both, a Maven repository and the Herd repository, to my Gradle project. This works fine for now.

3) As a Java developer, I want to develop a Java Gradle project that uses a Ceylon library only available at Herd with a transitive dependency to com.fasterxml.jackson.core. My project also uses a big Java library which also depends on the Maven variant com.fasterxml.jackson.core:jackson-core.

Currently, I can add both, a Maven repository and the Herd repository, to my Gradle project. But the jackson-core dependency will be downloaded twice under different names and pollute my classpath.

There are similar use cases for Ceylon developers where different names would result in multiple downloads and imports of the same library (even if the Maven fake mode of Herd is ditched, at least when https://github.com/ceylon/ceylon/issues/5968 is solved).

One can always resolve these issues manually, but it makes Ceylon Herd look like an offender.

The only solution to this is to somehow link dependencies on Maven Central with dependencies on Herd (when they are uploaded to both). When maintaining this link is manual work, many people will not do it, which in turn results in ugly problems mentioned in 3). One solution would be to allow Herd to push my library to Maven Central or another popular Maven repository. This way the link will be automatically maintained by Herd.

This seems also to be what JCenter makes possible for its users:

And if you're into legacy, you can even synchronize your packages directly to Maven Central.

BTW similar issues might occur when working on NPM interoperability, but I am not sure...

ckulenkampff commented 8 years ago

Just an update: I tested it with a local Sonatype Nexus installation. I could add the Herd repository as a proxy. [image] I think the following configuration was necessary to make it work better: [image]

luolong commented 8 years ago

Now, there's been so much discussion here that I might just have skimmed over some relevant bits. If so, excuse me if I rehash something that has already been decided.

About my suggestion for setting Maven groupId:artifactId identifiers pair -- The way I see this is that we have to have a sensible default that does not make people surprised if they do the default thing.

Then we should have some way of overriding these defaults. First on the command line and then for perpetuating the chosen setting in a configuration file somewhere (.ceylon/config seems a good place)

The defaults

The most sensible default behavior I can come up with is to have the group id and artifact id be the same as the module id.

I know that this runs counter to how Ceylon modules have been published to maven so far, but if this is configurable later, I still think that this is the most sensible default.

Command line

One should be able to configure maven groupId and artifactId on the command line. In its simplest form, we could just set those up as plain text values. If a project has just one module, this should be simple enough:

ceylon compile --maven-groupId=ceylon-lang

The above command should produce a set of modules whose groupId is set to ceylon-lang and whose artifactId is the full module name

ceylon compile --maven-groupid=ceylon-lang --maven-artifactid=interop-java ceylon.interop.java

would produce ceylon module with maven coordinates 'ceylon-lang:interop-java:1.2.2'

Trying to call the latter command line with multiple modules (either explicitly on command line, or implicitly when calling without module name in a project with multiple modules) should be considered an error.

As an additional consideration, we could add support for some simplified regular expression syntax for deriving groupId and artifactId from the module name. Something along these lines:

ceylon compile --maven-groupid=ceylon-lang --maven-artifactId="ceylon.(**)|$1|r/./-"

That would take everything after "ceylon." in the module name and use it as Maven artifactId ... The exact syntax of the regular expression is kind of irrelevant at this point, but it should be simple enough to be able to do a limited set of operations on the initial input string (module name):
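Since the exact expression syntax is left open, the following sketch does not parse the proposed `ceylon.(**)|$1|r/./-` form. It only shows the intended effect with ordinary Python regular expressions: strip a prefix from the module name, then replace dots with dashes. `derive_artifact_id` and its `prefix` parameter are hypothetical.

```python
import re


def derive_artifact_id(module_name, prefix="ceylon."):
    """Take everything after the prefix and turn '.' into '-' (sketch)."""
    match = re.fullmatch(re.escape(prefix) + r"(.+)", module_name)
    if match is None:
        raise ValueError(f"{module_name!r} does not start with {prefix!r}")
    return match.group(1).replace(".", "-")


print(derive_artifact_id("ceylon.interop.java"))
```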

Project configuration

Whatever we allow on the command line we should be able to configure in the .ceylon/config file, with the additional freedom of being able to override settings on a per-module basis.

I guess the simplest configuration might be like this:

[defaults]
encoding=UTF-8
overrides=build/overrides.xml

maven-groupid=ceylon-lang
maven-artifactId=ceylon.(**)|$1
...

[module "ceylon.interop.java"]
maven-artifactId=interop-java
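How the per-module override might be resolved can be sketched with Python's `configparser`, which happens to accept this section syntax. This is only an illustration of the lookup order (module section, then `[defaults]`, then the module name itself as the default suggested above). It is not how the Ceylon tools read `.ceylon/config`, and `maven_coordinates` is a hypothetical function.

```python
import configparser

# Trimmed-down version of the configuration above, with plain values.
CONFIG_TEXT = """
[defaults]
maven-groupid=ceylon-lang

[module "ceylon.interop.java"]
maven-artifactid=interop-java
"""


def maven_coordinates(config, module_name):
    """Resolve (groupId, artifactId): module section, defaults, module name."""
    defaults = config["defaults"] if config.has_section("defaults") else {}
    section = f'module "{module_name}"'
    overrides = config[section] if config.has_section(section) else {}

    def lookup(key):
        # configparser lower-cases option names, so the keys are lower case.
        return overrides.get(key, defaults.get(key, module_name))

    return lookup("maven-groupid"), lookup("maven-artifactid")


config = configparser.ConfigParser(interpolation=None)
config.read_string(CONFIG_TEXT)
print(maven_coordinates(config, "ceylon.interop.java"))
print(maven_coordinates(config, "ceylon.collection"))
```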

vietj commented 8 years ago

this seems to be a great debate, however I haven't seen anything about cyclic dependencies.

how are cyclic module dependencies going to be exposed by Herd to Maven?

luolong commented 8 years ago

I guess we have several layers of maven/ivy interop issues under discussion here ... maybe we should split those discussions off to their own separate issues?