peichhorn / lombok-pg

Collection of lombok extensions
http://peichhorn.github.com/lombok-pg/
Other
326 stars 44 forks source link

Support for all JPA hashCode/equals patterns #115

Closed fommil closed 5 years ago

fommil commented 11 years ago

There are several patterns for hashCode/equals in Hibernate/JPA.

https://community.jboss.org/wiki/EqualsAndHashCode

Lombok supports "no eq/hC at all" and "eq/hC with the id property", but not "eq/hC with business key".

The first part of this RFE is "Please support eq/hC with business key".

My understanding of the business key pattern is that getters for Collection fields should return read-only versions, with an additional "add$Element" method. All setters would also update the "business key" field. e.g. http://pastie.org/4358366

In many ways, this could also be read as "Please support precomputed eq/hC".

Lombok also supports another pattern, which is to do eq/hC as "normal" - this invokes subtle bugs with JPA, e.g.:

http://stackoverflow.com/questions/11604370

Basically I had an @Entity which had a Collection<@Entity> and they each had a Collection. The calculation of hashCode was calling the hashCodes further down the chain, which resulted in a PersistenceException because Hibernate swaps Collection fields with proxies that cannot be used until the data is all read in.

A simple fix for this is to rewrite hashCode for @Entitys to never use any Collection fields, but that is a bit of a hack because it could result in hashCode collisions if there are not enough other fields in the object to distinguish instances (not a problem for me, though).

But in any case, another RFE is to ask: "Please allow the exclusion of fields from hashCode, keeping them in equals".

fommil commented 11 years ago

BTW, a 3rd RFE is that if the user @Overrides hashCode, then still generate the equals method.

My workaround for the Hibernate-related bug means I have to generate both because Lombok will not generate equals if I create a hashCode.

fommil commented 11 years ago

Hmm, just a thought, but if there were a way to wrap the Collection fields with an ObservableCollection, then Lombok could add code to listen to add/addAll/put calls and update the business key appropriately. That would negate the requirement to create new set methods.

The Collections would only be wrapped when released from the getter, but: (a) no such standard wrapper exists and (b) registering a listener with the wrapper would stop the wrappers from being garbage collected.

peichhorn commented 11 years ago

I need to think about this one really hard. For example I'm not the biggest fan of the following requests:

BTW, a 3rd RFE is that if the user @Overrides hashCode, then still generate the equals method.

But in any case, another RFE is to ask: "Please allow the exclusion of fields from hashCode, keeping them in equals".

But if I can't think of a better solution I might implement them.

Here is an article that presents a different approach: http://www.devx.com/Java/Article/30396/1954

fommil commented 11 years ago

OK, no problem - the workaround is to just manually (Netbeans or delombok) write the hashCode and equals. Although an "excludeFromHashCode" variable in @EqualsAndHashCode would have been nice ;-)

The real RFE is to support the "business key" pattern from the URL I linked: it is actually more of a "pre-generated identity key" than anything else. Note that talking about a "business key" is ambiguous, because some people also use the term to refer to a column with a uniqueness constraint on it (e.g. in the doc you have referenced, the user's e-mail address is the business key). In the latter case, Lombok already supports this.

From what I understand, the author of the article you have linked is advocating "eq/hC with the id property", but letting the DAO create the ID on object creation rather than on first DB persist. In terms of Lombok, this is already supported with @EqualsAndHashCode(of="id") plus the custom DAO he offers. I'm not all too keen on the author's approach because he is solving a very specific problem - which is to allow multiple newly created (but not yet persisted) entities to be stored in a Set with id being used for all comparison operations. This can also be solved by saying that an entity can never equal anything if id == null in equals (unless this == other).

In contrast, I (and many others) actually want persisted @Entitys to behave like normal java objects outside of the Transaction, with all fields to be compared during equals. In order to pull this off, EAGER fetch has to be used so that the full object is recovered from the DB before the Transaction is closed - which means that hashCode needs to be carefully constructed if nested Collections are involved.

fommil commented 11 years ago

(BTW, although heavyweight, the solution in http://www.artima.com/lejava/articles/equality.html for the subclass problem is nice. But why does canEqual need to be public? Isn't protected more appropriate?)

fommil commented 11 years ago

Caveat: the updating "business key" pattern shown in the first post, in the pastie link, is not my own. It came from somebody who helped me on the Hibernate IRC list. I had never seen this pattern before and just thought it might be useful. After further consideration, I'd like to see more documentation and discussion on use of the pattern, as it appears to be non-standard. One obvious problem is that it assumes that the hashCode can do the job of equals and maybe it can't. Although there is the assumption that anything held in the DB will be made out of primitive object types and perhaps that means hashCode can be used in this way.

fommil commented 11 years ago

I'm actually starting to double-guess this pattern after all. I'm asking the Stack Overflow community if there are any gotchas

http://stackoverflow.com/questions/11730379

peichhorn commented 11 years ago

In order to pull this off, EAGER fetch has to be used so that the full object is recovered from the DB before the Transaction is closed - which means that hashCode needs to be carefully constructed if nested Collections are involved.

Is a deep hash code and deep equals calculation really necessary?

(BTW, although heavyweight, the solution in http://www.artima.com/lejava/articles/equality.html for the subclass problem > is nice. But why does canEqual need to be public? Isn't protected more appropriate?)

lombok uses this exact approach to generate a proper equals method. I can live with canEqual being public, but to be honest I never thought about the visibility of this method long enough. ;)

fommil commented 11 years ago

OK, I'm starting to see this more clearly now. I'm no longer convinced that this is "best practice" for @Entity management anymore. It's not without its merits, but it would need a lot more testing and advocates before Lombok could confidently implement the pattern. I think the pattern would actually be "cached hashCode with JavaBean equals" rather than using the key for equality.

With regards to @Entity equality, I am not actually convinced that Lombok does support standard id based hashCode/equals, because of a few subtleties. For example, before I discovered Lombok, this is an example of the equals/hashCode I would use for @Entitys:

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof THIS_CLASS) || id == null) {
            return false;
        }
        final THIS_CLASS other = (THIS_CLASS) obj;
        return id.equals(other.id);
    }

    @Override
    public int hashCode() {
        Preconditions.checkNotNull(id, "id must be set before @Entity.hashCode can be called");
        return id.hashCode();
    }

See Preconditions from Guava.

People like to point out that the hashCode changes when first persisted to the DB (the id is set), so the id should be set in user space on construction.

http://onjava.com/pub/a/onjava/2006/09/13/dont-let-hibernate-steal-your-identity.html

(I don't understand why he checks obj == null because the instanceof check will already do that for him)

So, I'm changing the RFE :-)

Please support JPA id based equals/hashCode such that a null id in this will return false from equals and raise an exception from hashCode.

Example syntax: @EqualsAndHashCode(jpaId = "id")

I think that then supports people who want to create the id themselves (e.g. like in the article you referenced and in this 2006 article - note that page 3 has a really nice and succinct way of doing it) and those who are happy with the DB creating the id.

People happy to use String UUIDs as keys can do the following in Java 5+

    @Id
    private String id = UUID.randomUUID().toString();

or

    @Id
    private UUID id = UUID.randomUUID();

It would be useful if Lombok could auto-generate an additional method to perform the role of the traditional equals, e.g. (note that id is excluded) from @EqualsAndHashCode(jpaId = "id", jpaPropertiesEqual="propertiesEqual") create the above plus

    public boolean propertiesEqual(Note other) {
        Preconditions.checkNotNull(other);
        if (this == other) {
            return true;
        }
        return Objects.equal(source, other.source)
                && Objects.equal(title, other.title)
                && Objects.equal(tags, other.tags)
                && Objects.equal(contents, other.contents);
    }
fommil commented 11 years ago

@peichhorn did you manage to pick up my last message? Essentially, this should be a lot easier than originally requested.

Maaartinus commented 11 years ago

@fommil I can imagine that propertiesEqual is useful, but the equals and hashCode above are too strange and not general enough. Somebody can use long id with 0or -1 standing for unset. However, for me it's an anti-pattern. Use Hi/Lo algorithm, assign id in the constructor, make it final, and use EqualsAndHashCode(of="id").

fommil commented 11 years ago

@Maaartinus, I'm surprised that you're calling this an anti-pattern. This is the recommended way to implement eq/hc for JPA entities.

You're right, some people might prefer to use a more elaborate system for generating the 'id' value so that bit should definitely be left up to the user. I'm not proposing that Lombok creates the id field (UUID is my choice). Specifically, I don't see any reason why a hi/low generator can't be used with this pattern.

In this template, the general mechanism for calculating eq/hc doesn't assume any type information other than that they are immutable objects: these implementations work wether you use an Integer, Long, UUID, String or something crazy. AFAIK, raw primitives are discouraged but in any case that is a reasonable requirement to place on users of such a JPA eq/hc to get Lombok advantages.

propertiesEqual is the non-standard bit that I thought people would say was a step too far :-)

Sam

On 18 Oct 2012, at 12:01, Maaartinus notifications@github.com wrote:

@fommil I can imagine that propertiesEqual is useful, but the equals and hashCode above are too strange and not general enough. Somebody can use long id with 0or -1 standing for unset. However, for me it's an anti-pattern. Use Hi/Lo algorithm, assign id in the constructor, make it final, and use EqualsAndHashCode(of="id").

— Reply to this email directly or view it on GitHub.

Maaartinus commented 11 years ago

You could use hi/lo with this pattern, but the point of hi/lo is AFAIK to get rid of this pattern; mutable id is EVIL.

Moreover, with hi/lo you need no Objects as id, simple long suffices. Compared to UUID this saves 4 bytes per column and at least 32 bytes memory. It also saves an indirection (and maybe a cache miss).

OTOH, propertiesEqual could be useful in case you use the "normal" EqHC and want to prevent creating a duplicate object having the same properties.

fommil commented 11 years ago

Maartinus, read the RFE again: you've not got to the conclusions ;-) there is no mutable id or hashCode here.

Kind regards, Sam Halliday

Sent from my iPhone

On 21 Oct 2012, at 07:38, Maaartinus notifications@github.com wrote:

You could use hi/lo with this pattern, but the point of hi/lo is AFAIK to get rid of this pattern; mutable id is EVIL.

Moreover, with hi/lo you need no Objects as id, simple long suffices. Compared to UUID this saves 4 bytes per column and at least 32 bytes memory. It also saves an indirection (and maybe a cache miss).

OTOH, propertiesEqual could be useful in case you use the "normal" EqHC and want to prevent creating a duplicate object having the same properties.

— Reply to this email directly or view it on GitHub.

fommil commented 11 years ago

@Maartinus, for the avoidance of doubt here is how hilo could be implemented with hibernate extensions if Lombok had this annotation:

@Data
@EqualsAndHashCode(jpaId = "id")
@Entity
public class MyEntity {

    @Id
    @GeneratedValue(generator = "test-hilo-strategy")
    private Long id;
}

(you could be more verbose, avoid @Data, and use your own @Getters and @Setters if you really wanted to avoid exposing setId()).

or if you didn't trust Hibernate to do the computation (or want non-hibernate support)

@Data
@EqualsAndHashCode(jpaId = "id")
@Entity
public class MyEntity {

    @Id
    private Integer id = MyIdComputer.newId();
}

you could even make id final here if you wanted to be sure that clients (who don't have access to reflection) didn't change your id but that is highly likely to cause problems with the JPA backend. Besides, AFAIK changing the id and then attempting an UPDATE will give you a PersistenceException unless you've engineered it to overwrite another row.

You're implying that hilo is the only show in town: it isn't. Other id strategies still have their merits. Indeed, JPA2 has opened up @GenericGenerator to allow arbitrary strategies.

Additionally, If I want to use UUID as my id, I skip the generation annotation and set it manually.

@Data
@EqualsAndHashCode(jpaId = "id")
@Entity
public class MyEntity {

    @Id
    private UUID id = UUID.randomUUID();
}

If you wanted, you could just use the existing @EqualsAndHashCode implementation and do something like this:

@Data
@EqualsAndHashCode(include = "id")
@Entity
public class MyEntity {

    @Id
    @GeneratedValue(generator = "test-hilo-strategy")
    private Long id;
}

But then what happens if the generation strategy returns null (a bug in the generator) or id only gets its value on DB commit? I'll tell you: all your entities would be equals (until committed to the database), at which point id and hashcode really do mutate (and the generation bug is always caught at first DB commit, because the primary id cannot be null.).

The alternative equals/hashCode that I've proposed here encourages immutable hashCodes and early setting of id (i.e. during construction, not commit).

You can't stop clients from instantiating their own MyEntity instances, because the JavaBeans pattern demands that a public no-arg constructor exists.

AFAIK, it is impossible to enforce true immutability on any field in a JPA @Entity (even if the backend allows final, what is to say that the DB admin doesn't manually update the row?). Attempts to mutate things that shouldn't be changed usually lead to PersistenceException. The JPA backend always needs to be able to instantiate and change fields, which means @Entitys have to implement the fundamentally mutable Javabeans pattern.

Incidentally, mutating the id might be a shorthand for copying a row, but it must then be COMMITed, not UPDATEd. It certainly shouldn't overwrite the existing row.

One last point: you could use primitives if you want. Given the overhead of surrogate entities and so on, I don't think saving a few bytes is going to make any real difference: long vs Long. I'm aware of the overhead in using UUID: I've profiled and it doesn't matter for a current project. Using a primitive, you'd lose the ability to detect when id hasn't been set, unless you have a special primitive value like -1. If you can tell me about a standard method that has this signature, I'll be happy to use Long if it makes you happy:

@Id
private Long id = MyGenerationStrategy.newId();