LiUSemWeb / HeFQUIN

HeFQUIN is a query federation engine for heterogeneous federations of graph data sources, including federations of knowledge graphs.
https://liusemweb.github.io/HeFQUIN/
Apache License 2.0

publish maven dependency #365

Open keski opened 1 month ago

keski commented 1 month ago

This issue is about publishing HeFQUIN releases using GitHub Packages as a Maven repository. Using HeFQUIN as a dependency would then require something like:

<repositories>
    <repository>
        <id>github</id>
        <name>GitHub LiUSemWeb Repository</name>
        <url>https://maven.pkg.github.com/LiUSemWeb/HeFQUIN</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>se.liu.ida.hefquin</groupId>
        <artifactId>hefquin-engine</artifactId>
        <version>x.y.z</version> 
    </dependency>
</dependencies>

I'd suggest adding an action that is triggered automatically whenever a new release is published.

Note: Maven releases are also immutable when using GitHub Packages.
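
On the publishing side, the pom.xml would also need to tell Maven where to deploy. A minimal sketch, assuming the repository URL from the snippet above and the server id github (the name is only illustrative):

```xml
<!-- Sketch only: deployment target for GitHub Packages. -->
<distributionManagement>
    <repository>
        <!-- Must match the id of the credentials entry Maven uses when deploying -->
        <id>github</id>
        <name>GitHub LiUSemWeb Repository</name>
        <url>https://maven.pkg.github.com/LiUSemWeb/HeFQUIN</url>
    </repository>
</distributionManagement>
```

With this in place, an mvn deploy run by a release-triggered action would upload to that URL, provided the action supplies credentials for the github server id.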

@hartig

hartig commented 1 month ago

Good idea, but I have a few questions to better understand what this entails:

Would this mean that, if I do mvn deploy in my local copy of the repo, then the package will be uploaded and published at GitHub Packages? If yes, who else besides me would be able to do that?

Regarding the immutability that you speak of, does it mean doing mvn deploy a second time with the same version number would result in an error message?

keski commented 1 month ago

Good idea, but I have a few questions to better understand what this entails:

Would this mean that, if I do mvn deploy in my local copy of the repo, then the package will be uploaded and published at GitHub Packages? If yes, who else besides me would be able to do that?

No, doing mvn deploy on a local copy would not publish to GitHub Packages. Just like the publishing of gh-pages, this would be an action triggered by an event on the repo (release). Anyone with permissions to create a new release would in principle be able to indirectly trigger this.

Regarding the immutability that you speak of, does it mean doing mvn deploy a second time with the same version number would result in an error message?

Exactly, so when creating releases we should adhere to, e.g., semantic versioning.

hartig commented 1 month ago

Thanks for answering my questions. Sounds like a good idea then!

keski commented 1 month ago

After a lot of experimentation I have a few things to add to this issue:

1) Artifact IDs should be lowercase (i.e., HefQUIN -> hefquin) to comply with Maven conventions (it's even a strict requirement in some systems). So to avoid issues in the future, we should change to lowercase.

2) It turns out that using GitHub Packages as a Maven repository requires authentication... I believe it has something to do with how GitHub handles public vs. private repos, but either way it is quite limiting to require users to authenticate to use an open source library. Interestingly, this is not the case when hosting Docker images. Anyways, this means that it's probably better to publish using Sonatype and the Maven Central Repository.

3) Sonatype is transitioning to a new publishing system (see central portal vs. legacy OSSRH). In terms of functionality, the main difference seems to be that the new system currently does not support SNAPSHOT versions. We could use the legacy system but it seems like we would have to transition eventually anyways.

4) Maven Central Repository requires that a few extra details are added to the main pom.xml file (see requirements).

5) The process for starting to publish to the central repository is a bit involved (see summary here). The basic steps would be:

I was thinking that we could set up a common Maven Central account for the group LiUSemWeb, which we can manage as a group. That way the publishing steps will persist even if the group members change in future and we don't have to share any personal tokens etc. Also, an email can only be used to register a single Sonatype account. If this sounds like a good idea, do we have some email address that could be used for this or should we create a new one? What domain?
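
Regarding point 4, the extra details are mostly project metadata. A rough sketch of what this could look like in the main pom.xml, assuming the project description, homepage, and license from the repository page; the developer entry and the scm URLs are illustrative placeholders, and the linked requirements remain the authoritative list:

```xml
<name>HeFQUIN</name>
<description>Query federation engine for heterogeneous federations of graph data sources</description>
<url>https://liusemweb.github.io/HeFQUIN/</url>

<licenses>
    <license>
        <name>Apache License, Version 2.0</name>
        <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
    </license>
</licenses>

<developers>
    <developer>
        <!-- Placeholder values, to be replaced with the real maintainers -->
        <name>Developer Name</name>
        <organization>LiUSemWeb</organization>
    </developer>
</developers>

<scm>
    <!-- Assumed URLs, based on the GitHub repository of this issue -->
    <connection>scm:git:https://github.com/LiUSemWeb/HeFQUIN.git</connection>
    <url>https://github.com/LiUSemWeb/HeFQUIN</url>
</scm>
```

Besides this metadata, Maven Central also expects sources and javadoc JARs as well as signed artifacts. These are configured through build plugins and are not shown here.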

@hartig

hartig commented 1 month ago
  1. Artifact IDs should be lowercase (i.e., HefQUIN -> hefquin) to comply with Maven conventions (it's even a strict requirement in some systems). So to avoid issues in the future, we should change to lowercase.

Okay, we can do that.

In this case, would it make sense to change the groupId from se.liu.ida.hefquin to se.liu.ida? Maybe not, or what do you think?

  2. It turns out that using GitHub Packages as a Maven repository requires authentication [...] Anyways, this means that it's probably better to publish using Sonatype and the Maven Central Repository.

Okay

  3. Sonatype is transitioning to a new publishing system (see central portal vs. legacy OSSRH). In terms of functionality, the main difference seems to be that the new system currently does not support SNAPSHOT versions. We could use the legacy system but it seems like we would have to transition eventually anyways.

We wouldn't publish SNAPSHOT versions anyways, would we?

  4. Maven Central Repository requires that a few extra details are added to the main pom.xml file (see requirements).

Can you please create a PR with an initial version of these changes? Then, I can take a look and expand/adjust as I see fit. (The PR may also contain the change as per point 1 above.)

I was thinking that we could set up a common Maven Central account for the group LiUSemWeb, which we can manage as a group. That way the publishing steps will persist even if the group members change in future and we don't have to share any personal tokens etc.

What do you mean by personal tokens? I guess, among the (three) seniors in the group, the only one who would care about having the option to publish to Maven Central is me, and I am not planning to have my status as a group member change in the future ;-) So, in principle, we could also choose that I create the account with the understanding that this is the account for the group. But then, what would the personal tokens be?

Also, an email can only be used to register a single Sonatype account.

Is such an account then only for Maven packages for one project or can it be for multiple?

If this sounds like a good idea, do we have some email address that could be used for this or should we create a new one? What domain?

If we don't use my email address, then it should be one for the group, preferably with @liu.se. But as we don't have such an email address, I am afraid trying to get one may be more effort than simply using my LiU email address.

keski commented 1 month ago
  1. Artifact IDs should be lowercase (i.e., HefQUIN -> hefquin) to comply with Maven conventions (it's even a strict requirement in some systems). So to avoid issues in the future, we should change to lowercase.

Okay, we can do that.

In this case, would it make sense to change the groupId from se.liu.ida.hefquin to se.liu.ida? Maybe not, or what do you think?

Apache Jena uses jena as its artifact ID, so we can probably keep the groupId as is:

  <name>Apache Jena</name>
  <groupId>org.apache.jena</groupId>
  <artifactId>jena</artifactId>
  <packaging>pom</packaging>
  ...
  3. Sonatype is transitioning to a new publishing system (see central portal vs. legacy OSSRH). In terms of functionality, the main difference seems to be that the new system currently does not support SNAPSHOT versions. We could use the legacy system but it seems like we would have to transition eventually anyways.

We wouldn't publish SNAPSHOT versions anyways, would we?

I agree.

  4. Maven Central Repository requires that a few extra details are added to the main pom.xml file (see requirements).

Can you please create a PR with an initial version of these changes? Then, I can take a look and expand/adjust as I see fit. (The PR may also contain the change as per point 1 above.)

Will do.

I was thinking that we could set up a common Maven Central account for the group LiUSemWeb, which we can manage as a group. That way the publishing steps will persist even if the group members change in future and we don't have to share any personal tokens etc.

What do you mean by personal tokens? I guess, among the (three) seniors in the group, the only one who would care about having the option to publish to Maven Central is me, and I am not planning to have my status as a group member change in the future ;-) So, in principle, we could also choose that I create the account with the understanding that this is the account for the group. But then, what would the personal tokens be?

Okay. So maybe it's just overkill to automate the Maven publishing step? We could instead just make sure that the process is properly documented and then you can authenticate by modifying ~/.m2/settings.xml locally on your machine. I'll look into the details and try it myself in an example project and get back to you.
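
For reference, local authentication in ~/.m2/settings.xml typically takes the form of a server entry. A minimal sketch, assuming a server id of central (an assumption; it has to match whatever id the publishing configuration in the pom.xml refers to) and placeholder credentials that would be replaced by a user token generated for the Maven Central account:

```xml
<settings>
    <servers>
        <server>
            <!-- Assumed id; must match the server id used by the publishing setup -->
            <id>central</id>
            <!-- Placeholders for the generated token, not real values -->
            <username>TOKEN_USERNAME</username>
            <password>TOKEN_PASSWORD</password>
        </server>
    </servers>
</settings>
```

Because this file lives outside the repository, the token never has to be committed or shared.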

Also, an email can only be used to register a single Sonatype account.

Is such an account then only for Maven packages for one project or can it be for multiple?

Managing multiple projects with a single account is fine, but we can't create a dedicated LiUSemWeb account using our own email addresses.

If this sounds like a good idea, do we have some email address that could be used for this or should we create a new one? What domain?

If we don't use my email address, then it should be one for the group, preferably with @liu.se. But as we don't have such an email address, I am afraid trying to get one may be more effort than simply using my LiU email address.

Let's use yours then. I'll get back with details.

hartig commented 1 month ago

Sounds all good. I will wait for you to get back with details then.

(And we keep the groupId as se.liu.ida.hefquin)

keski commented 1 month ago

I've now finally managed to publish an example project using Maven Central! I was a bit taken aback by how complicated it actually was, but I've taken note of all the steps so next time should be much easier, and updated the Release-Logistics.

The key takeaways are the following:

According to the Maven Central docs, multiple users should be able to share a namespace but there seems to be no way of adding users manually, so I think it's best that @hartig claims the namespace.

@hartig, perhaps we can schedule a time for setting it up on your computer once we know how to deal with the namespace/subdomain question?

keski commented 1 month ago

For testing purposes, I've published HeFQUIN in my own GitHub namespace. The libs work as expected and the modules can be imported separately. E.g.:

  <dependency>
    <groupId>io.github.keski.hefquin</groupId>
    <artifactId>hefquin-cli</artifactId>
    <version>0.0.1</version>
  </dependency>

I've updated the pom.xml file in https://github.com/LiUSemWeb/HeFQUIN/tree/365-publish-maven-dependency. Once we have the namespace we should be ready to publish.

hartig commented 1 month ago

Thanks for your efforts. Let's wait for IT to respond to your ticket.

In the meantime, I'm repeating my thoughts from our brief discussion yesterday about some of the other points, just for the record:

hartig commented 2 weeks ago

@keski any update on this one?

keski commented 21 hours ago

@hartig I got an update regarding this! If we want to use a subdomain directly under ida.liu.se we apparently need approval from the prefect at IDA. But another suggestion was to use the domain research.liu.se which is used for something similar in two other projects. We would then use something like semweb.research.liu.se. What do you think?