Entity listeners are not fired when a collection has been updated

lukasj commented 6 years ago

I'm having a Product entity, which has a manufacturers attribute:

@ElementCollection(targetClass = java.lang.String.class, fetch = FetchType.EAGER)
--
@CollectionTable(name = "manufacturers", joinColumns = @JoinColumn(name = "man_id"), indexes = { @Index(columnList = "man_id") })
@Column(name = "manufacturer")
private Collection<String> manufacturers;

I also have a ProductListener with two @PreUpdate and @PrePersist methods. However none of these methods are triggered when I update the collection of manufacturers. Either we need to trigger the event when the collection is updated, or add some new annotations like @PreCollectionUpdate and @PreCollectionPersist (hibernate style).

lukasj commented 6 years ago

Issue Imported From: https://github.com/javaee/jpa-spec/issues/167
Original Issue Raised By:@ptahchiev
Original Issue Assigned To: Unassigned

lukasj commented 6 years ago

@olivergierke Commented

However none of these methods are triggered when I update the collection of manufacturers.

Even if you call ….merge(…) explicitly?

lukasj commented 6 years ago

@ptahchiev Commented Yes, even if you call merge explicitly. If the specification says they must be fired, then I guess it's a bug in Hibernate. However, hibernate have native events to handle these cases, so I don't believe it's a bug there.

lukasj commented 6 years ago

@olivergierke Commented Did you file a ticket against Hibernate then?

The spec says:

The PreUpdate and PostUpdate callbacks occur before and after the database update operations to entity data respectively.

I guess it depends on what "entity data" is interpreted to be but I'd read this as "the entity and its declared relationships" which looks like it could be fixed in Hibernate alone. Might be worth investigating how other JPA implementations behave.

lukasj commented 6 years ago

@s4ke Commented I've seen this on the mailing list of Hibernate, but I guess this here, is the better point to contribute:

From my experience with EclipseLink and Hibernate (and if i remember correctly OpenJPA), JPA implementations differ in terms of how these annotations are interpreted.

I have written my Bachelors Thesis about Hibernate Search with a generic JPA implementation and I analyzed this problem in more detail because I had to investigate how to keep a search index up-to-date.

(see section 7.1.1 in https://raw.githubusercontent.com/s4ke/Bachelor-Thesis/Update-System-rework/tex/src/bachelor_thesis.pdf)

Firstly, not all JPA providers seem to handle these events similarly: For example Hi- bernate ORM doesn’t propagate events from collection tables to the owning entity, while EclipseLink does (EclipseLink’s behaviour would be needed from all providers).

Secondly, we find [...] that the events are triggered on flush instead of commit as can be seen in listing 34. This is an issue if the changed data is not actually commited[.]

This was with Hibernate 4.3.9 at the time, so this might have changed and I might have missed something, but I guess this might help.

lukasj commented 6 years ago

@olivergierke Commented I guess it'll be helpful to have a small reproducing project that shows Hibernate showing the allegedly erroneous behavior. As I read the spec, it's entirely irrelevant which tables are affected of a change eventually. The dirty-checking of the entity contained in the persistence context has to trigger the lifecycle callback on the root entity, just as it eventually triggers the update in the related table.

I.e. I don't necessarily think this even needs a change in the spec beyond a TCK test that tightens the checks under which conditions the events are fired. The aforementioned sample project could actually just be one to be included into the TCK.

lukasj commented 6 years ago

@sebersole Commented I'm surprised you have such a strong opinion about an obvious "interpretative" section of the spec Oliver. At any rate, we have our reasons and our beliefs and the spec does not explicitly say we are right or you are right or some other 3rd opinion is right. That's the point - its unclear. You have your opinion, and that's fine... but you don't just get to add tests to the TCK based on one's opinion.

lukasj commented 6 years ago

@olivergierke Commented

I guess it depends on what "entity data" is interpreted to be but I'd read this as "the entity and its declared relationships"…

…allegedly erroneous behavior.

As I read the spec…

There you go. Did I claim that I am authoritatively deciding anything? I deliberately wrote "allegedly erroneous" because of course the Hibernate team will have based their implementation on their interpretation of the spec and will be able to explain why it came to that conclusion, even if the conclusion is "the TCK didn't reject it". Can you maybe point to resources that explain your apparently different interpretation?

Nowhere did I speak about a "bug" in Hibernate. On the contrary I suggested to investigate how other implementations behave. And I suggested to create a reproducing project – something you usually and understandably get asked for when reporting a ticket in Hibernate - so that we can discuss a concrete example, not just some "I have this, it doesn't work". What's wrong with that? And what's wrong with having an opinion?

…but you don't just get to add tests to the TCK based on one's opinion.

Nobody claimed that to be the case.

lukasj commented 6 years ago

@sebersole Commented For me personally, section 4.3 Abstract Schema Types and Query Domains is a more specific definition of an entity's "state" as being "non-relationship" state:

Informally, the abstract schema type of an entity or embeddable can be characterized as follows:

For every non-relationship persistent field or get accessor method (for a persistent property) of the class, there is a field (“state field”) whose abstract schema type corresponds to that of the field or the result type of the accessor method.

For every persistent relationship field or get accessor method (for a persistent relationship property) of the class, there is a field (“association field”) whose type is the abstract schema type of the related entity (or, if the relationship is a one-to-many or many-to-many, a collection of such).

As for the rest... its not worth getting into a hissy match with you.

lukasj commented 6 years ago

@olivergierke Commented Thanks for the pointer, Steve. Isn't that quote exactly proving my point? @ptahchiev is using @ElementCollection, not one of the …-to–… annotations. So even following that particular definition, his manufacturers field has to be considered entity state, hasn't it? And – following that – shouldn't @PreUpdate be fired for a change of that state?

lukasj commented 6 years ago

@sebersole Commented I don't understand how that is the logical conclusion you come to. To me (even regardless of how you interpret element-collections in regards to "non-relationship"), one of the main uses of this state field distinction is to qualify what can be used in an JPQL UPDATE query in its "set items". An element-collection cannot be used in that way.

But again, this all just keeps reaffirming what I said... that this is unclear in the spec :)

To me the proper solution would be the addition of collection-based events in the spec.

lukasj commented 6 years ago

@olivergierke Commented I don't necessarily understand why a section from the query language spec is quoted when it comes to entity state and lifecycle, both of them defined in earlier sections of the spec (entities and what makes them up in 2.2, lifecycle callbacks in 3.5). Nobody is issuing a query at all. It's about a state change on a managed entity instance. For the topic under discussion, the spec wouldn't even need a section 4.

I'd argue you can only deliberately decide not to trigger those events for changes to @ElementCollection properties, if there's mandate in the spec for that. So far, we couldn't find a section that actually justifies that, so I guess I'm just going to go ahead and make the case to augment the section on lifecycle callbacks to be more precise. Especially, as the reference implementation apparently already follows my interpretation.

lukasj commented 6 years ago

@sebersole Commented You don't think reused phrases in a spec influence how each usage is understood... hmm...

So can I also call System#exit from within a JPA provider? The spec does not explicitly disallow it...

lukasj commented 6 years ago

@olivergierke Commented Again: the original reporter's use case doesn't have to do anything with JPQL, which means that there's no need to discuss that.

lukasj commented 6 years ago

@sebersole Commented I'm not discussing JPQL. I am discussing the phrase "entity state", which, you know, happens to be clarified in one place - a section on JPQL. It really does not matter if the clarification happened in Appendix Z.

And "again" - the point is not that you are right or that I am right. In fact you seem the only one arguing that you are right. I have pretty consistently said simply that neither of us is wrong because the spec is not clear. And yes, yes... you think it is clear - but the very fact that we are having this discussion and you cannot absolutely, 100%, definitively point me to part(s) of the spec that say my interpretation is wrong is in fact the definition of unclear.

lukasj commented 6 years ago

@olivergierke Commented

I'm not discussing JPQL.

You are if you're citing from the section defining it to justify your interpretation and thus justify why Hibernate's current behavior is supposed to be considered spec compliant.

…which, you know, happens to be clarified in one place - a section on JPQL.

No need to be more passive aggressive. "The persistent state of an entity…" is defined in 2.2 Persistent Fields and Properties. Basic rules of axiomatic definition define that a term has to be defined before it can be used. Thus, unless an explicit forward reference is used, a concept used in 3.5 needs to be defined before 3.5. There is no explicit forward reference pointing to JPQL used in 3.5.

And "again" - the point is not that you are right or that I am right. In fact you seem the only one arguing that you are right. I have pretty consistently said simply that neither of us is wrong because the spec is not clear. And yes, yes... you think it is clear - but the very fact that we are having this discussion and you cannot absolutely, 100%, definitively point me to part(s) of the spec that say my interpretation is wrong is in fact the definition of unclear.

No, that's completely backwards. There is no "you are right, I am wrong" no matter how many times you like to repeat that. There is "valid according to the spec" and "invalid according to the spec" and "undecided because the spec is too vague". There's a reference implementation that behaves the way I think is correct and you claim that's not an opinion that's justified. I ask for proof of why you think this is wrong and you start picking references from unrelated things (JPQL) to back your claims. Facing some push back on that due to the fact that you cite unrelated sections of the spec, you turn this into an "I am right, you're wrong discussion". I don't think I want to follow this kind of trolling anymore.

I hope we can agree on the fact that the spec could be more precise on the aspect of lifecycle callbacks, which I've never disputed but – in fact – was the starting point of my recommendation on how to proceed. All I am arguing is that what Petar is asking for is well in the ballpark of what the spec already defines, which I gave references for and the reference implementation even already implements.

lukasj commented 6 years ago

@s4ke Commented I think, given how different the JPA implementations are are regarding these annotations, and given how the spec is interpreted differently, the best course of action would be to not introduce any breaking changes to existing ORMs and their respective eco-systems.

JPA enables portability, but I think in areas where the specification is vague, existing user code should not be broken. In fact, I think a better idea would be to add new annotations along the lines of @PreCollectionUpdate as suggested in the initial posting - and clarify the specs (and the java docs) in general. This would help everyone as I think that firing @PreUpdate for every change in a collection might be called quite frequently without being relevant to user code (such as index-updating and custom event listening code). And while we're at it, a mechanism to register event handlers programmatically on the EntityManagerFactory would be nice too. I know that I would appreciate this kind of fine granular control.

EDIT:

And to clarify. In my quote above I was not talking about what I think the specification says. The sentence was talking about what would be required were this system used to update a fulltext-index.

Firstly, not all JPA providers seem to handle these events similarly: [...]

lukasj commented 6 years ago

@andyjefferson Commented FWIW DataNucleus JPA fires PreUpdate when a non-relation Collection/Map/array field is updated, since that field's data belongs to the owning entity (only), and an "EntityListener" relates only to entities not tables (clue in its name). The TCK is to reinforce what the spec means (if it even says anything of note, which it doesn't in many many places), but the TCK is private, so it's all pointless arguing. Enjoy.

lukasj commented 6 years ago

@rbygrave Commented

not introduce any breaking changes to existing ORMs and their respective eco-systems

This is a very good point.

So yes I believe this discussion is perhaps a bit moot but my question / thought would be.

Q: Are there not the 3 cases to consider:

Some properties on the entity are changed AND the element collection is changed
ONLY some properties on the entity are changed (but the element collection is not changed at all)
ONLY the element collection is changed (and no other persistent properties are changed).

If so, does the question then become ... is there value in being able to differentiate those events?

To me it seems that having a @PreCollectionPersist provides the ability to differentiate those events. For people who don't care/desire to differentiate those events they just invoke a common method from both of those events?

m-reza-rahman commented 3 years ago

This seems like low-hanging fruit that could be looked at in the next release and resolved one way or the other? One option is of course just to leave the ambiguity alone and close this.

Reza Rahman Jakarta EE Ambassador, Author, Blogger, Speaker

Please note views expressed here are my own as an individual community member and do not reflect the views of my employer.

dwhitla commented 2 years ago

A case which @rbygrave missed is where only collection membership changes - ie no attribute of the dependent object was modified. This will currently only be caught by an entity listener in the case where a many-to-many relationship is modelled using an explicit join entity. For this scenario the most appropriate handling would be the addition of collection change handlers as suggested in the first post. This solution is backward compatible with all previous JPA iterations.

trajano commented 1 year ago

I was wondering if we can close this ambiguity in the spec.
Other references: https://stackoverflow.com/questions/65117091/when-updating-an-elementcollection-does-it-trigger-the-postupdate-of-the-contai

But rather than new annotations can we repurpose the existing @PreUpdate and @PrePersist so it has attributes that represent the same functionality as @PreCollectionUpdate and @PreCollectionPersist

trajano commented 3 months ago

Here is another example https://stackoverflow.com/questions/78634001/is-postupdate-supposed-to-be-called-when-youre-modifying-data-in-a-one-to-many and this is with @PostUpdate specifically.

gavinking commented 3 months ago

Wow, just read properly the (very ancient) long discussion above. And I went and had a look at the spec.

It seems to me that @sebersole was 100% correct in what he said back then, and that this is indeed highly ambiguous. The whole of what the spec says about @Pre/PosUpdate events is this:

The PreUpdate and PostUpdate callbacks occur before and after the database update operations to entity data respectively. These database operations may occur at the time the entity state is updated or they may occur at the time state is flushed to the database (which may be at the end of the transaction).

And the term "entity data" is never defined, despite what is claimed in some of the discussion above. In particular, section 2.2 never uses the term, contrary to claims made above.

Nor is its meaning obvious:

Does the foreign key of a @OneToMany count as "entity data" of the entity to which the collection belongs?
Does the association table of an unowned @ManyToMany association count as "entity data" of the unowning entity?
In particular, does remove()-ing an entity which belongs to a collection result in @XxxUpdate events for the entity on the other side of the association?
What about unowned @OneToOne associations?

It's also disappointing that the spec glosses over the fact that @PreUpdate and @PostUpdate have a fundamentally different nature to @XxxPersist and @XxxRemove, the latter relating to very explicitly-defined API operations.

I was wondering if we can close this ambiguity in the spec.

@trajano I think JPA 4.0 could a reasonable moment to do that, yes. (I would not have wanted to do it in a minor release, due to the breakiness of such a change.)

So I think what I would maybe do is something like:

Introduce @PreInsert, @PostInsert + @PreDelete, @PostDelete events to match @PreUpdate and @PostUpdate (this is going to be important when we also introduce StatelessEntityManager)
Specify that owned collections form part of the "entity data" of an entity, but that unowned collections don't.
Say the same thing about @OneToOne associations.

jakartaee / persistence

Entity listeners are not fired when a collection has been updated #167