jakartaee / persistence

https://jakartaee.github.io/persistence/

JPA API 3.1.0 published on maven central is missing persistence_3_1.xsd #372

Closed jgrassel closed 2 years ago

jgrassel commented 2 years ago

Noticed that the JPA API 3.1.0 jar published on Maven Central (https://mvnrepository.com/artifact/jakarta.persistence/jakarta.persistence-api/3.1.0) has the orm_3_1.xsd file but is missing the persistence_3_1.xsd file. (It's also missing from the GitHub repo, and the spec doc itself has not been updated to use the 3.1 version in the persistence.xml section.)

dazey3 commented 2 years ago

Can confirm. https://mvnrepository.com/artifact/jakarta.persistence/jakarta.persistence-api/3.1.0

jakarta.persistence-api-3.1.0.jar:

jakarta\persistence\orm_2_2.xsd
jakarta\persistence\orm_3_0.xsd
jakarta\persistence\orm_3_1.xsd
...

jakarta\persistence\persistence_2_2.xsd
jakarta\persistence\persistence_3_0.xsd
...

Looks like jakarta\persistence\persistence_3_1.xsd is missing.
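For anyone who wants to repeat this check locally, here is a minimal Java sketch. It is an illustration only: the class and method names (`SchemaEntryCheck`, `entryNames`, `missingSchemas`) are made up for this example, and it assumes the schemas sit under `jakarta/persistence/` inside the jar, with the forward slashes that zip entries normally use.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of any Jakarta project.
public class SchemaEntryCheck {

    // Reads all entry names from a jar (a jar is a zip archive).
    public static List<String> entryNames(Path jar) throws IOException {
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            return zip.stream().map(ZipEntry::getName).collect(Collectors.toList());
        }
    }

    // Returns the orm/persistence schema entries expected for the given
    // version (e.g. "3.1") that are absent from the entry list.
    public static List<String> missingSchemas(Collection<String> entries, String version) {
        String suffix = version.replace('.', '_') + ".xsd";
        List<String> missing = new ArrayList<>();
        for (String kind : List.of("orm", "persistence")) {
            // Assumed location of the schemas inside the API jar.
            String expected = "jakarta/persistence/" + kind + "_" + suffix;
            if (!entries.contains(expected)) {
                missing.add(expected);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: SchemaEntryCheck <path-to-jar> <version>");
            return;
        }
        System.out.println("missing: " + missingSchemas(entryNames(Paths.get(args[0])), args[1]));
    }
}
```

Run it against a downloaded copy of the jar with `3.1` as the second argument; an empty list means both schema files for that version are present.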

lukasj commented 2 years ago

Apart from the schema version number - what diff between 3.0 and 3.1 schema do you expect to see?

jgrassel commented 2 years ago

No differences. Except every single previous version of JPA has introduced a new schema version of persistence.xml, even when there haven't been any changes to the schema itself. Is there a reason or rule behind breaking consistency here?

lukasj commented 2 years ago

2.2 (from JCP/Oracle) is the same as 2.3 (from Eclipse) and 2.3 schema as such does not exist; 3.0 changes the target namespace. The only questionable and avoidable version bump can be seen between 2.1 and 2.2 - there was no real change, the only difference was formatting of the file (it was forbidden to change already existing 2.1 schema)

jgrassel commented 2 years ago

I've never heard of a "JPA 2.3" from anywhere. I sure don't see its tag on the current jpa-api git repo:

jpa-api jgrassel$ git tag
2.2-2.2.3-RELEASE
2.2-3.0.0-RC1-RELEASE
2.2.1-RELEASE
2.2.2-RELEASE
3.0-3.0.0-RC2-RELEASE
3.0-3.0.0-RELEASE
3.1-3.1.0-RC1-RELEASE
3.1-3.1.0-RC2-RELEASE
3.1-3.1.0-RELEASE

Nor do I see a tag for it in the old javax.persistence repo (https://github.com/eclipse/javax.persistence). Where might I find this?

jgrassel commented 2 years ago

Anyways, is skipping the version increment of the persistence.xml version for releases that do not alter the schema now a thing? Is this practice consistent with the other Jakarta EE technologies?

lukasj commented 2 years ago

AFAIK it has never been a requirement to update the schema in each spec version; the decision has always been left up to the spec itself, and e.g. schemas in the XML specs were not updated for EE 10 either. For an example of a spec project I’m not involved in that has had no schema update so far, see Connectors 2.1 (still in draft).

lukasj commented 2 years ago

As for the versioning - it was 2.2.0 in Java EE 8 and 2.2.3 (not 2.3) in Jakarta EE 8; sorry for confusion

jgrassel commented 2 years ago

Alright, I'll close the issue then.

dazey3 commented 2 years ago

@lukasj One thing I would add is that given we use the specification version as the schema version (2.1, 2.2, 3.0), it may be confusing to a developer that says "I want to use JPA 3.1".

That developer would create a persistence.xml, and may attempt to use a schema location of "persistence/persistence_3_1.xsd" and version="3.1". After all, why wouldn't they? They would then receive an error during validation, as a persistence.xml schema "3.1" doesn't exist. I could understand that user's confusion, given the assumed correlation between specification version and schema version, and given that the two use the same versioning and have always been incremented/released together in the past.

I understand that nothing has changed in the schema itself and a new version is not necessary. We know that internally, as does whoever takes the time to actually investigate the changes in JPA from one minor version to the next. However, the JPA specification as a whole is going from 3.0.0 to 3.1.0. I think we should take care to maintain consistent usage of the specification version when the specification version increments, even in peripheral associations. The average user doesn't care; they just expect things to match, and the effort on our part to make sure they do is minimal.
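The validation failure described here can be reproduced in isolation. The sketch below does not use the real persistence_3_0.xsd; it validates against a deliberately tiny stand-in schema that pins the `version` attribute with `fixed="3.0"`, which is an assumption about how the published schemas constrain the version. The class name `VersionValidationSketch` and the method `isValid` are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical illustration; the schema below is a stand-in, NOT persistence_3_0.xsd.
public class VersionValidationSketch {

    // Assumption: the version attribute is pinned via fixed="3.0".
    static final String STAND_IN_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='https://jakarta.ee/xml/ns/persistence'"
        + " elementFormDefault='qualified'>"
        + "<xsd:element name='persistence'>"
        + "<xsd:complexType>"
        + "<xsd:attribute name='version' type='xsd:string' fixed='3.0' use='required'/>"
        + "</xsd:complexType>"
        + "</xsd:element>"
        + "</xsd:schema>";

    // Returns true if a minimal persistence.xml declaring the given version
    // validates against the stand-in schema.
    public static boolean isValid(String version) {
        String doc = "<persistence xmlns='https://jakarta.ee/xml/ns/persistence'"
            + " version='" + version + "'/>";
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(STAND_IN_XSD)));
            Validator validator = schema.newValidator();
            try {
                validator.validate(new StreamSource(new StringReader(doc)));
                return true;
            } catch (SAXException validationError) {
                // With no ErrorHandler set, validation errors surface as SAXException.
                return false;
            }
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("stand-in schema or input could not be read", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("version 3.0 valid: " + isValid("3.0"));
        System.out.println("version 3.1 valid: " + isValid("3.1"));
    }
}
```

Under that assumption, a file that declares version="3.1" is rejected even though nothing else about it differs from a 3.0 file.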

lukasj commented 2 years ago

given we use the specification version as the schema version

that is just a coincidence. The 2.0 -> 2.1 and 2.2 -> 3.0 updates were about namespace and license changes due to the change of owner of the specification - first Sun -> JCP/Oracle, then JCP/Oracle -> Eclipse Fdn. I don't think any of these changes/requirements were doable without a version number change.

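having 3.1 version of the schema would:

require someone to do the work on the spec project side

trigger work in existing projects (those on 3.0) willing to stay up-to-date for no gain

require vendors to implement support for the new version of the schema (they can still support it if they want the way they want; ie they can but are not required to treat 3.1 version as 3.0 with some warning printed out)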

dazey3 commented 2 years ago

that is just a coincidence

I understand that it is just a coincidence, but perhaps it isn't to the average user. It's also apparent to us internally, and to anyone who reviews the changes between specification releases, that nothing has changed to even warrant a schema increment. But to the lay user (and I deal with plenty of customers who just blanket update all persistence/orm xml files during migration), it isn't just a coincidence. We have a burden of knowledge here, and the effort, imo, is minimal for us in exchange for customers' ease of understanding.

having 3.1 version of the schema would:

require someone to do the work on the spec project side

#373 (https://github.com/eclipse-ee4j/jpa-api/pull/373)

trigger work in existing projects (those on 3.0) willing to stay up-to-date for no gain

I could only find testing changes in EclipseLink, but I linked the PR above if you would like to point out places I missed. EclipseLink's org.eclipse.persistence.internal.jpa.deployment.xml.parser.PersistenceContentHandler that processes persistence.xml doesn't seem to do anything with the version. In fact, org.eclipse.persistence.internal.jpa.deployment.SEPersistenceUnitInfo.getPersistenceXMLSchemaVersion() has a TODO that this is not yet implemented.

Granted, there may be changes required to containers to support the 3.1 version, as the spec would demand it via section 9.1. I'm not sure how that can be avoided now that the 3.1 release was made... can we release a 3.1.1 of the spec? If containers want to move up to 3.1.1, they can, otherwise they can remain on 3.1.0 and continue not validating a persistence 3.1, as that version of the spec says

require vendors to implement support for the new version of the schema (they can still support it if they want the way they want; ie they can but are not required to treat 3.1 version as 3.0 with some warning printed out)

I don't mind if implementations want to treat 3.1 as 3.0 and are not required to print a warning. I only want users that define "3.1" and "persistence_3_1.xsd" not to be met with validation errors that these versions/files don't exist.
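The lenient behaviour discussed here (accept "3.1" and validate it as "3.0", optionally with a warning) could look roughly like the Java sketch below. It is not taken from EclipseLink or any container: `SchemaVersionFallback` and `schemaFileFor` are hypothetical names, and the list of shipped versions contains only the two persistence schemas named in the jar listing earlier in this thread.

```java
import java.util.List;
import java.util.logging.Logger;

// Hypothetical sketch of a lenient version-to-schema mapping.
public class SchemaVersionFallback {

    private static final Logger LOG = Logger.getLogger(SchemaVersionFallback.class.getName());

    // Versions for which a persistence_<version>.xsd is assumed to ship.
    static final List<String> SHIPPED = List.of("2.2", "3.0");

    // Maps the version declared in persistence.xml to the schema file used for validation.
    public static String schemaFileFor(String declaredVersion) {
        if (SHIPPED.contains(declaredVersion)) {
            return fileName(declaredVersion);
        }
        if ("3.1".equals(declaredVersion)) {
            // No 3.1 schema exists; fall back to 3.0 instead of failing.
            LOG.warning("persistence.xml declares version 3.1; validating against the 3.0 schema");
            return fileName("3.0");
        }
        throw new IllegalArgumentException("Unsupported persistence.xml version: " + declaredVersion);
    }

    private static String fileName(String version) {
        return "persistence_" + version.replace('.', '_') + ".xsd";
    }

    public static void main(String[] args) {
        System.out.println(schemaFileFor("3.1"));
    }
}
```

Whether a warning is logged at all is exactly the kind of decision that would stay with the vendor.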

lukasj commented 2 years ago

In my mind seeing version number change triggers search for changes - in this case I find none -> wasted time

Not many changes in EclipseLink from what I could find (only testing).

it is NOT about eclipselink (or any other provider). It is about updating existing tck tests to use new schema; it is about users/customers having apps targeting 3.0 to move to the newer version of the schema (should they want to)

I only want users that define "3.1" and "persistence_3_1.xsd" not to be met with validation errors that these versions/files don't exist.

Does the spec explicitly require vendor to fail for validation errors in persistence.xml? Even if it does, isn't the decision about what to do - fail, continue, print warning, make coffee, whatever else :-) - up to the vendor?

can we release a 3.1.1 of the spec?

sure, everything is possible - but I'm afraid that nobody is able to handle it now given where EE 10 as such is

lukasj commented 2 years ago

But to the lay user (and I deal with plenty of customers who just blanket update all persistence/orm xml files during migration)

Why should we "require" them to do so? Shouldn't the goal be to not waste people/customer time by removing unnecessary burden during updates?

dazey3 commented 2 years ago

it is NOT about eclipselink (or any other provider).

fair enough, but EclipseLink is still the reference implementation, yes? Other providers are free to deviate from the specification as they see fit.

It is about updating existing tck tests to use new schema; it is about users/customers having apps targeting 3.0 to move to the newer version of the schema (should they want to)

I'm not sure I understand. A new "3.1" persistence schema would be backwards compatible with "3.0". There would be no mandate that in order to use JPA 3.1, you must update your schema versions. Similar to how using schema version "1.0" on JPA 2.1 was valid. Where did I say that existing 3.1 tests that use persistence schema 3.0 would be required to be updated?

Does the spec explicitly require vendor to fail for validation errors in persistence.xml? Even if it does, isn't the decision about what to do - fail, continue, print warning, make coffee, whatever else :-) - up to the vendor?

The container must validate the persistence.xml file against the persistence_3_0.xsd or persistence_2_2.xsd schema in accordance with the version specified by the persistence.xml file and report any validation errors.

The container must "report validation errors", but that "report" is given a wide definition, sure.

Certainly, I would advocate for the container I manage to just accept "3.1", but what if there is a new JPA spec? What if there is a new "JPA 3.2.0" specification that does require changes to the schema? What will we call that schema version? Will we call it "3.1" or just jump, coincidentally, to "3.2"? The container I manage would be in trouble if we started deviating from what the specification defines, making arbitrary assumptions.

lukasj commented 2 years ago

it is NOT about eclipselink (or any other provider).

fair enough, but EclipseLink is still the reference implementation, yes? Other providers are free to deviate from the specification as they see fit.

No, there is no such thing as a reference implementation in Jakarta EE. There are compatible implementations instead, with slightly different requirements, and no implementation is "the right one" - it either passes the TCK and can claim compatibility, or does not pass the TCK and cannot do so. The exact requirements are defined within the TCK itself.

Where did I say that existing 3.1 tests that use persistence schema 3.0 would be required to be updated?

Nowhere. Is that to imply that one should not care about having tests for new features? IMHO, with every schema update the tests need to be updated to use the latest version, and some test for the "to become old" schema version needs to be created. Even if the less laborious path is chosen, a new test for handling the new schema version needs to be added.

The container must "report validation errors",

thanks! how could I miss that. That means that this change would require an update of the spec document. Just for the record, I have already been waiting 2+ years for a spec doc update to fix broken links there. Simply said, the process for this sort of change within the Jakarta EE WG has not been defined yet.

Certainly, I would advocate for the container I manage to just accept "3.1", but what if there is a new JPA spec? What if there is a new "JPA 3.2.0" specification that does require changes to the schema? What will we call that schema version? Will we call it "3.1" or just jump, coincidentally, to "3.2"? The container I manage would be in trouble if we started deviating from what the specification defines, making arbitrary assumptions.

Looking at https://jakarta.ee/xml/ns/jakartaee/, 5 (five) out of 20 schemas (or out of 24 if one counts jaxb/jaxws/validation as well) are expected to be updated for EE 10; that is at most 25% as of now.

few notes:

What if there is a new "JPA 3.2.0" specification that does require changes to the schema?

Spec versioning scheme is defined to be MAJOR.MINOR only. The process and scheme for "patch" releases of the spec document are TBD, but it is likely it will follow revisions (revA-revZ). API jar file and TCK bundle versioning is defined to be MAJOR.MINOR.MICRO to allow fixes and the process usually takes weeks/months to get API jar update out; it is way shorter to get TCK update out.

What will we call [new] schema version?

new schema version will correspond to the spec version

dazey3 commented 2 years ago

Where did I say that existing 3.1 tests that use persistence schema 3.0 would be required to be updated? Nowhere. Is that to imply that one should not care about having tests for new features?

I agree, there should be new tests for a "3.1" schema. What I had in mind is that the existing "3.0" tests would be migrated to "3.1" and a compatibility test would be added to verify that the "3.0" schema is still compatible. Given that "3.0" is what is being shipped currently, it would appear that the latter test will pass.

Where can I add these tests? If the issue is the effort, then point me in the direction where I can put the effort in.

That means that this change would require an update of the spec document.

Let me know what I'm missing: https://github.com/eclipse-ee4j/jpa-api/pull/373

I assume you mean that a new PDF document will need to be generated then, yes? Would that not be true for any update requiring a 0.0.1 increment update? Things can be missing or mistaken in the initial release, and a new release may be needed to correct that mistake.

new schema version will correspond to the spec version

interesting coincidence

lukasj commented 2 years ago

Where can I add these tests?

to the TCK, the repo is at https://github.com/eclipse-ee4j/jakartaee-tck/

If the issue is the effort, then point me in the direction where I can put the effort in.

All spec projects have to be moved from the eclipse-ee4j org to the jakartaee org (this should have been done within EE 10 but came in late in the cycle). Persistence TCK tests should be moved out of the platform TCK repo to this one (or to a new repo managed by this project); jbatch and jsonp/b are examples of projects which did that in the current release.

those are areas bringing in bigger value

I assume you mean that a new PDF document will need to be generated then, yes?

yes. Given https://jakarta.ee/about/jesp/ and

what the version number of the specification document is going to be?

Would that not be true for any update requiring a 0.0.1 increment update?

No. E.g. a wrong OSGi header in the API jar does not require an update of the spec document, yet it requires a service release with the fix. The fix for this issue would also not require a new TCK release. OTOH a new release of Java SE may require a new TCK service release, as some tests can fail there (as was the case with SE 11), while not requiring API or spec doc updates.

Things can be missing or mistaken in the initial release, and a new release may be needed to correct that mistake.

Sure. Is there any release without mistake? Is this that important to either release 3.2 right away (expect at least few months) or spend time on trying to define new process for doc updates?

new schema version will correspond to the spec version

interesting coincidence

As long as it is aligned with remaining >75% specs which are not updating schema in this release, what’s the point?

asbachb commented 1 year ago

So in the end that schema has its own versioning detached from the release version? So there could be persistence 4.0 which still refers to 3.0 schema?

Actually I never thought about the xml namespaces and the version releases not matching up.

I wonder if this is really a good idea. How should I know which namespace matches which jpa spec?

dmatej commented 1 year ago

So in the end that schema has its own versioning detached from the release version? So there could be persistence 4.0 which still refers to 3.0 schema?

Actually I never thought about the xml namespaces and the version releases not matching up.

I wonder if this is really a good idea. How should I know which namespace matches which jpa spec?

It is quite a standard thing; in general you can also split the XSD into more files, and each would have its own versioning. That can happen e.g. when you need to share some type definitions; then you can reuse generated classes, etc. From the opposite viewpoint, you can support more schemas in a single library based on the version, and again, share classes between several "sets" based on different versions. Simply said, it is much more flexible.

asbachb commented 1 year ago

It is quite a standard thing; in general you can also split the XSD into more files, and each would have its own versioning. That can happen e.g. when you need to share some type definitions; then you can reuse generated classes, etc.

Yeah, but it's a different use case and currently not the case with the schema we're talking about. If we had multiple schemata with different versioning, this issue would not have been raised.

From the opposite viewpoint, you can support more schemas in a single library based on the version, and again, share classes between several "sets" based on different versions. Simply said, it is much more flexible.

It's still not the use case here. I don't see a scenario where you adjust the schema but not the spec, besides bug fixing.

As you can see, multiple people are confused by the versioning, so it should be taken more seriously. As it's part of the spec and part of the spec packaging, I still don't see why it should have its own dedicated, diverging version.

So what's the expectation for a developer? Check the spec to see which namespaces should be added in each version? Do we want a mapping matrix that shows which spec version matches which namespace for multiple schemata? If it's really a dedicated entity, why is it managed inside the spec project?