commons-rdf / commons-rdf

Common Java interfaces for RDF-1.1 libraries, now in Apache Incubator
https://github.com/apache/incubator-commonsrdf
Apache License 2.0

Apache Incubator proposal #59

Closed wikier closed 9 years ago

wikier commented 9 years ago

Here is a very initial draft to be discussed (cc @afs, @ansell, @stain, @westei, etc.):

= Apache Commons RDF incubation proposal =

== Status ==

Draft

== Abstract ==

Commons RDF is a set of common Java interfaces that expose the core RDF 1.1 concepts.

== Proposal ==

The main motivation behind this simple library is to resolve a historical incompatibility issue. This library does not aim to be a generic API wrapping the existing RDF toolkits, but a set of common Java interfaces that expose the RDF 1.1 concepts. In the initial phase Commons RDF focuses on a subset of the core concepts defined by RDF 1.1 (URI/IRI, Blank Node, Literal, Triple, and Graph). In particular, Commons RDF aims to provide a type-safe, non-general API that covers RDF 1.1. In a future phase we may define interfaces for Datasets and Quads.
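To make the list of concepts above concrete, here is a minimal Java sketch of what type-safe interfaces for IRI, Literal, Triple and Graph could look like. Every type and method name is an assumption made for illustration, not the actual Commons RDF API; the implementations are toys, and blank nodes only appear as a marker interface.

```java
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Illustrative sketch only: every type and method name here is an
// assumption made for this example, not the actual Commons RDF API.
class RdfSketchDemo {
    public static void main(String[] args) {
        IRI alice = new SimpleIRI("http://example.org/alice");
        IRI name = new SimpleIRI("http://xmlns.com/foaf/0.1/name");
        Graph graph = new SimpleGraph();
        graph.add(new SimpleTriple(alice, name, new SimpleLiteral("Alice")));
        // Adding an equal triple again does not grow the graph (set semantics).
        graph.add(new SimpleTriple(alice, name, new SimpleLiteral("Alice")));
        System.out.println(graph.size());           // prints 1
        System.out.println(alice.ntriplesString()); // prints <http://example.org/alice>
    }
}

interface RDFTerm { String ntriplesString(); }
interface BlankNodeOrIRI extends RDFTerm { }
interface IRI extends BlankNodeOrIRI { String getIRIString(); }
interface BlankNode extends BlankNodeOrIRI { }
interface Literal extends RDFTerm { String getLexicalForm(); }

interface Triple {
    BlankNodeOrIRI getSubject(); // a literal subject is rejected by the compiler
    IRI getPredicate();
    RDFTerm getObject();
}

interface Graph {
    void add(Triple triple);
    boolean contains(Triple triple);
    long size();
}

// Toy implementations, just enough to exercise the interfaces.
final class SimpleIRI implements IRI {
    private final String iri;
    SimpleIRI(String iri) { this.iri = Objects.requireNonNull(iri); }
    @Override public String getIRIString() { return iri; }
    @Override public String ntriplesString() { return "<" + iri + ">"; }
    @Override public boolean equals(Object o) {
        return o instanceof SimpleIRI && iri.equals(((SimpleIRI) o).iri);
    }
    @Override public int hashCode() { return iri.hashCode(); }
}

final class SimpleLiteral implements Literal {
    private final String lexicalForm;
    SimpleLiteral(String lexicalForm) { this.lexicalForm = Objects.requireNonNull(lexicalForm); }
    @Override public String getLexicalForm() { return lexicalForm; }
    // No escaping, datatype or language tag: kept deliberately minimal.
    @Override public String ntriplesString() { return "\"" + lexicalForm + "\""; }
    @Override public boolean equals(Object o) {
        return o instanceof SimpleLiteral && lexicalForm.equals(((SimpleLiteral) o).lexicalForm);
    }
    @Override public int hashCode() { return lexicalForm.hashCode(); }
}

final class SimpleTriple implements Triple {
    private final BlankNodeOrIRI subject;
    private final IRI predicate;
    private final RDFTerm object;
    SimpleTriple(BlankNodeOrIRI subject, IRI predicate, RDFTerm object) {
        this.subject = Objects.requireNonNull(subject);
        this.predicate = Objects.requireNonNull(predicate);
        this.object = Objects.requireNonNull(object);
    }
    @Override public BlankNodeOrIRI getSubject() { return subject; }
    @Override public IRI getPredicate() { return predicate; }
    @Override public RDFTerm getObject() { return object; }
    @Override public boolean equals(Object o) {
        if (!(o instanceof SimpleTriple)) return false;
        SimpleTriple t = (SimpleTriple) o;
        return subject.equals(t.subject) && predicate.equals(t.predicate) && object.equals(t.object);
    }
    @Override public int hashCode() { return Objects.hash(subject, predicate, object); }
}

final class SimpleGraph implements Graph {
    private final Set<Triple> triples = new LinkedHashSet<>();
    @Override public void add(Triple triple) { triples.add(Objects.requireNonNull(triple)); }
    @Override public boolean contains(Triple triple) { return triples.contains(triple); }
    @Override public long size() { return triples.size(); }
}
```

The point of the narrow return types on `Triple` is that mistakes such as using a literal as a subject fail at compile time instead of at runtime, which is one reading of "type-safe, non-general".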

The goal is to provide a compact API that could be implemented by the upcoming versions of the main Java toolkits (Apache Jena 3.0 and OpenRDF Sesame 4.0) as well as by other libraries (OWLAPI) and by libraries for other JVM languages (Banana RDF and so on).

In addition, the project could provide some basic implementations suitable for simple scenarios. But the major, established Java toolkits will always remain the recommended implementations to use.

== Background ==

Historically, the Java world has had an incompatibility issue between the two major RDF toolkits: Apache Jena and OpenRDF Sesame. Many libraries have tried to wrap them but, technical considerations aside, they normally end up unmaintained.

We came up with the idea of Commons RDF to address the root of the problem. The project is in healthy development on GitHub. The natural path to Apache Commons Sandbox has been studied, but we think that in this phase of the project it is better to have a more focused community and infrastructure. However, that is still the goal, as soon as the API has achieved the required maturity.

== Rationale ==

The library comes from the need to provide a generic and neutral API for RDF 1.1 that everybody can use transparently, without binding the design to concrete implementations. It is the result of cooperation between contributors to the main Java toolkits, and will try to be available in time to influence the major version updates Jena 3.0 and Sesame 4.0.

== Initial Goals ==

== Current Status ==

The API is already largely agreed on by all involved players, and prototyping has started, both in the major toolkits and in simple implementations.

=== Meritocracy ===

Commons RDF has been designed entirely by committee using git workflows, where every single feature has been discussed on the basis of a Pull Request. We plan to keep this methodology, in which common understanding comes before personal decisions.

=== Community ===

Commons RDF addresses the developers who are working with Semantic Web technologies in the JVM. The initial committers are core contributors to that community.

=== Core Developers ===

=== Alignment ===

Commons RDF helps integrate the different ASF projects using RDF technologies, so that Apache Jena can be integrated with others which use Sesame (Any23 and Marmotta). In addition, proposals by other projects (Clerezza and Stanbol) can also be aligned.

== Known Risks ==

=== Orphaned Products ===

Probably one of the major risks is that the API provided does not fit well into the development plans of the main Java toolkits. We try to minimize this risk by having core developers of those frameworks on board; beyond that, the API will live or die on its own merits.

=== Inexperience with Open Source ===

The committers have extensive experience with open source development and ASF communities.

=== Homogeneous Developers ===

The initial list of developers comes from four different organizations and three different countries.

=== Reliance on Salaried Developers ===

Although the project is also on the strategic agenda of our current employers, so far most of the development is happening in volunteer time.

=== Relationships with Other Apache Projects ===

The project relates to Jena as one of the potential implementations, to Any23 and Marmotta, which are based on Sesame, and to Clerezza and Stanbol as projects that may benefit from such a common API.

=== An Excessive Fascination with the Apache Brand ===

While we expect the Apache brand may help attract more contributors, our interest in starting this project is based on the factors mentioned in the Rationale section.

== Documentation ==

Documentation for the current project can be found at GitHub: http://commons-rdf.github.io

== Initial Source ==

The current source code can be found at GitHub: https://github.com/commons-rdf/commons-rdf

=== Source and Intellectual Property Submission Plan ===

The copyright is held entirely by the four developers signing this proposal, all of whom already have an ICLA in place with the ASF. The current license is already the Apache License 2.0.

=== External Dependencies ===

Most of the current dependencies should have Apache-compatible licenses, including BSD, CDDL, CPL, MPL and MIT licensed dependencies. We are aware of some incompatible licenses right now, but we will work to solve this issue. See Appendix A for a detailed list of dependencies.

=== Cryptography ===

Does Not Apply.

== Required Resources ==

=== Mailing lists ===

=== Repository ===

=== Issue Tracking ===

=== Other Resources ===

== Initial Committers ==

=== Affiliations ===

== Sponsors ==

=== Champion ===

=== Nominated Mentors ===

=== Sponsoring Entity ===

Apache Incubator PMC

== Appendix A: list of dependencies ==

TBC

stain commented 9 years ago

Together, we came up with the idea of Commons RDF for solving the incompatibility problem. The community has been in healthy development at GitHub, including participants from the major Java RDF toolkits.

The natural path to Apache Commons Sandbox has been studied, but we think that in this phase of the project, which focuses on the API design and actively involves the developers of existing toolkits, it is better to have a more focused community and infrastructure. Rather than becoming a new Top-Level Project, the goal is still to graduate as part of Apache Commons, that is, once the API has achieved the required maturity and the project goes into maintenance mode.

Part of the motivation for doing the incubator process would therefore be to bring together the existing Commons RDF community in the Apache Way, mature the API, and then gradually prepare the Commons RDF community for working within the larger Apache Commons community.

stain commented 9 years ago

I could not find any incompatible licenses. All our dependencies except Guava are build dependencies, which are less restricted because they won't go into the distribution.

Third-party dependencies (transitional for Java 6/7 support):

Test dependencies:

Maven plugins:

ansell commented 9 years ago

Even Guava is a Java 6-only dependency, and given the difficulty of maintaining the Java 6 branch we may not continue with it for much longer.

@wikier My affiliation is now CSIRO, not University of Queensland.

wikier commented 9 years ago

OK, second draft addressing comments:

= Apache Commons RDF incubation proposal =

== Status ==

Draft

== Abstract ==

Commons RDF is a set of common Java interfaces that expose the core RDF 1.1 concepts.

== Proposal ==

The main motivation behind this simple library is to resolve a historical incompatibility issue. This library does not aim to be a generic API wrapping the existing RDF toolkits, but a set of common Java interfaces that expose the RDF 1.1 concepts. In the initial phase Commons RDF focuses on a subset of the core concepts defined by RDF 1.1 (URI/IRI, Blank Node, Literal, Triple, and Graph). In particular, Commons RDF aims to provide a type-safe, non-general API that covers RDF 1.1. In a future phase we may define interfaces for Datasets and Quads.

The goal is to provide a compact API that could be implemented by the upcoming versions of the main Java toolkits (Apache Jena 3.0 and OpenRDF Sesame 4.0) as well as by other libraries (OWLAPI) and by libraries for other JVM languages (Banana RDF and so on).

In addition, the project could provide some basic implementations suitable for simple scenarios. But the major, established Java toolkits will always remain the recommended implementations to use.

== Background ==

Historically, the Java world has had an incompatibility issue between the two major RDF toolkits: Apache Jena and OpenRDF Sesame. Many libraries have tried to wrap them but, technical considerations aside, they normally end up unmaintained.

Together, we came up with the idea of Commons RDF for solving the incompatibility problem. The community has been in healthy development at GitHub, including participants from the major Java RDF toolkits.

The natural path to Apache Commons Sandbox has been studied, but we think that in this phase of the project, which focuses on the API design and actively involves the developers of existing toolkits, it is better to have a more focused community and infrastructure. Rather than becoming a new Top-Level Project, the goal is still to graduate as part of Apache Commons, that is, once the API has achieved the required maturity and the project goes into maintenance mode.

Part of the motivation for doing the incubator process would therefore be to bring together the existing Commons RDF community in the Apache Way, mature the API, and then gradually prepare the Commons RDF community for working within the larger Apache Commons community.

== Rationale ==

The library comes from the need to provide a generic and neutral API for RDF 1.1 that everybody can use transparently, without binding the design to concrete implementations. It is the result of cooperation between contributors to the main Java toolkits, and will try to be available in time to influence the major version updates Jena 3.0 and Sesame 4.0.

== Initial Goals ==

== Current Status ==

The API is already largely agreed on by all involved players, and prototyping has started, both in the major toolkits and in simple implementations.

=== Meritocracy ===

Commons RDF has been designed entirely by committee using git workflows, where every single feature has been discussed on the basis of a Pull Request. We plan to keep this methodology, in which common understanding comes before personal decisions.

=== Community ===

Commons RDF addresses the developers who are working with Semantic Web technologies in the JVM. The initial committers are core contributors to that community.

=== Core Developers ===

=== Alignment ===

Commons RDF helps integrate the different ASF projects using RDF technologies, so that Apache Jena can be integrated with others which use Sesame (Any23 and Marmotta). In addition, proposals by other projects (Clerezza and Stanbol) can also be aligned.

== Known Risks ==

=== Orphaned Products ===

Probably one of the major risks is that the API provided does not fit well into the development plans of the main Java toolkits. We try to minimize this risk by having core developers of those frameworks on board; beyond that, the API will live or die on its own merits.

=== Inexperience with Open Source ===

The committers have extensive experience with open source development and ASF communities.

=== Homogeneous Developers ===

The initial list of developers comes from four different organizations and three different countries.

=== Reliance on Salaried Developers ===

Although the project is also on the strategic agenda of our current employers, so far most of the development is happening in volunteer time.

=== Relationships with Other Apache Projects ===

The project relates to Jena as one of the potential implementations, to Any23 and Marmotta, which are based on Sesame, and to Clerezza and Stanbol as projects that may benefit from such a common API.

=== An Excessive Fascination with the Apache Brand ===

While we expect the Apache brand may help attract more contributors, our interest in starting this project is based on the factors mentioned in the Rationale section.

== Documentation ==

Documentation for the current project can be found at GitHub: http://commons-rdf.github.io

== Initial Source ==

The current source code can be found at GitHub: https://github.com/commons-rdf/commons-rdf

=== Source and Intellectual Property Submission Plan ===

The copyright is held entirely by the four developers signing this proposal, all of whom already have an ICLA in place with the ASF. The current license is already the Apache License 2.0.

=== External Dependencies ===

All current dependencies have Apache-compatible licenses, including MIT, BSD 3-clause and EPL.

=== Cryptography ===

Does Not Apply.

== Required Resources ==

=== Mailing lists ===

=== Repository ===

=== Issue Tracking ===

=== Other Resources ===

== Initial Committers ==

=== Affiliations ===

== Sponsors ==

=== Champion ===

=== Nominated Mentors ===

=== Sponsoring Entity ===

Apache Incubator PMC

afs commented 9 years ago

Overall looks good. The name may be a point for discussion, but with a possible route to Apache Commons as a long-term place to live, it makes sense to at least start the conversation with that.

The pTLP model currently being discussed might be appropriate (whether the exit is TLP or Commons) given the number of existing Apache committers. Again, something to discuss with the IPMC.

Quiet, stable TLPs aren't in themselves a problem, as long as they can raise a vote quorum if needed. That might be similar in Commons, as this component is less general than the set already there.

The plug-in dependencies don't matter much.

Small point: it might be easier to drop the Java 6 branch for now and retrofit it later, rather than incur the costs of maintaining it. Given the timescale, Java 6 is likely to mean legacy, unchanging environments where this code would not be used anyway.

wikier commented 9 years ago

OK, if everybody more or less agrees on the current draft, I'll start trying to wrap up the discussion at dev@commons.a.o, then maybe in a couple of days we can move the conversation to general@incubator.a.o.

I agree with dropping Java 6, better sooner than later.

wikier commented 9 years ago

So far the discussion at dev@commons.a.o is going well... We still need a champion, and as I said, I'd prefer a Commons PMC member, but we can go ahead with someone else.

If nothing happens, I plan to put the current draft of the proposal in the Incubator wiki and bootstrap the discussion in the mailing list.

stain commented 9 years ago

Overall the proposal looks good. Obviously there will be discussions.

Do we not try to get Reto to join this proposal? His views are valuable, even if some of them might be counter to what we think we have concluded already here. :-)

Should we not run it through Commons PMC first?

I know the proposal now names the Incubator PMC as the champion, not the Commons PMC, which, according to the feedback we got from general and commons, could work.

Being sent straight to pTLP when we don't propose to become a TLP would be a bit ironic... But as you say, a TLP doesn't need to be under long-term development. With 4 proposed committers we would really need to grow the community in that case.

A TLP will struggle with the name; presumably we can't call it Apache Commons RDF outside Commons. Apache RDF is too general, but perhaps something similar to what HttpComponents did would work.

If we go TLP it would be a shame to miss out on Commons' pool of generic API folks who might have something to say about streams etc., but of course we can try to recruit them anyway.

stain commented 9 years ago

+1, get the ball rolling instead of more discussion! :-) On 30 Jan 2015 08:16, "Sergio Fernández" notifications@github.com wrote:

So far the discussion at dev@commons.a.o is going well http://markmail.org/message/tnwjccsxb2rfiziy... We still need a champion, and as I said, I'd prefer a Commons PMC member, but we can go ahead with someone else.

If nothing happens, I plan to put the current draft of the proposal in the Incubator wiki and bootstrap the discussion in the mailing list.


afs commented 9 years ago

It would be nice to have a champion from Commons, but it's not a deal breaker. At the moment Apache Commons looks like the best destination, but we'll decide that at graduation. We could "sell" this incubator podling as being "easy" :-), not like some monster that has an IP history of many years, or a codebase that takes over 4 hours to ingest when it was supposed to be 45 minutes tops (that's what happens when you include 10y of cvs+svn history!).

wikier commented 9 years ago

Sure, feedback from @retog would be valuable, and I'd like to have him on board. So far I have not seen feedback from him, so it'd be great to have it before moving forward; whether here or at dev@commons.a.o does not really matter.

I guess the final graduation path is something that will naturally emerge during incubation. I expect a very easy podling anyway, but we'll see...

stain commented 9 years ago

On 30 January 2015 at 08:51, Andy Seaborne notifications@github.com wrote:

It would be nice to have a champion from Commons, but it's not a deal breaker. At the moment Apache Commons looks like the best destination, but we'll decide that at graduation. We could "sell" this incubator podling as being "easy" :-), not like some monster that has an IP history of many years, or a codebase that takes over 4 hours to ingest when it was supposed to be 45 minutes tops (that's what happens when you include 10y of cvs+svn history!).

Hey hey Mentor, we're getting there! :) [1]

[1] http://mail-archives.apache.org/mod_mbox/taverna-dev/201501.mbox/%3CCAOkCRcBNdHrp5QhsJ_dvKa4-kVaDc-vW7JN4qRsweE2tr_gczQ%40mail.gmail.com%3E

Stian Soiland-Reyes Apache Taverna (incubating) http://orcid.org/0000-0001-9842-9718

retog commented 9 years ago

@sergio some feedback discussion is on the rdf-commons list, currently on bnode identifiers and identity. Another form of feedback is https://svn.apache.org/repos/asf/commons/sandbox/rdf/trunk/, which tries to align the Clerezza and GitHub proposals and mentions some issues in its README.


stain commented 9 years ago

Great that @retog is positive about the proposal!

So @wikier, perhaps you could modify the proposal to include @retog on the committer list and the commons/sandbox/rdf code as an additional starting point? I think it's ready to go into the Incubator wiki and be sent to dev@commons and general@incubator.

@retog - your affiliation is Berner Fachhochschule BFH?

As I am getting to grips with the differences and their implications I think we need to work together (as an Apache project) on the more difficult issues like #56, local scope and blank node equality.
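One reason blank node equality is hard: in RDF 1.1, blank node identifiers are only meaningful within the document or store that produced them, so the same label in two documents does not denote the same node. Below is a minimal sketch of one possible approach, tying equality to an explicit scope. `ScopedBlankNode` and its `UUID` scope are hypothetical names for illustration; this is not the resolution reached in #56.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical sketch: names and design are assumptions for illustration,
// not how Commons RDF actually handles blank nodes.
class BlankNodeScopeDemo {
    public static void main(String[] args) {
        UUID documentA = UUID.randomUUID(); // stands for one parse run
        UUID documentB = UUID.randomUUID(); // stands for another parse run

        ScopedBlankNode first = new ScopedBlankNode(documentA, "b1");
        ScopedBlankNode again = new ScopedBlankNode(documentA, "b1");
        ScopedBlankNode other = new ScopedBlankNode(documentB, "b1");

        System.out.println(first.equals(again)); // prints true: same scope, same label
        System.out.println(first.equals(other)); // prints false: same label, different scope
    }
}

final class ScopedBlankNode {
    private final UUID scope;   // identifies the document or store the label came from
    private final String label; // e.g. the "b1" in "_:b1"

    ScopedBlankNode(UUID scope, String label) {
        this.scope = Objects.requireNonNull(scope);
        this.label = Objects.requireNonNull(label);
    }

    // Two blank nodes are equal only if both the scope and the label match.
    @Override public boolean equals(Object o) {
        if (!(o instanceof ScopedBlankNode)) return false;
        ScopedBlankNode that = (ScopedBlankNode) o;
        return scope.equals(that.scope) && label.equals(that.label);
    }

    @Override public int hashCode() { return Objects.hash(scope, label); }
}
```

The open design question is who owns the scope (the parser, the graph, or the factory that creates terms), which is the kind of decision the thread says still needs joint discussion.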

We also need to formalize the philosophy of the project a bit more (what the intended goals and users/implementors of this API are, and what is out of scope), as currently much of the conceptual background and reasoning seems to live in historical issues and in people's heads.

wikier commented 9 years ago

Proposal submitted to the wiki: https://wiki.apache.org/incubator/CommonsRDF. Later I'll prepare the mail to start the discussion with the IPMC.

retog commented 9 years ago

Thanks @stain. Yes my affiliation is Berner Fachhochschule BFH.

afs commented 9 years ago

Minor: The normal style nowadays is not to request a users mailing list, just use the dev list. It helps people get involved in the development. For small communities, it helps not to spread too thinly as well.

That makes good sense here because the long-term destination may well be Apache Commons.

wikier commented 9 years ago

For this project I do agree. But I wouldn't call it the "normal style nowadays"; I personally find the separation quite useful, even in a small community.

Proposal updated.

stain commented 9 years ago

For the record: @wikier sent the CommonsRDF proposal to:

wikier commented 9 years ago

not that much feedback there anyway...

afs commented 9 years ago

Sometimes it takes the weekend. And the Incubator report is due at the moment, so that takes people's time.

wikier commented 9 years ago

not that much so far... I'll wait a couple more days

wikier commented 9 years ago

This issue should have been closed long ago; the project is already being incubated at the ASF: http://incubator.apache.org/projects/commonsrdf.html