spring-projects / spring-data-jpa

Simplifies the development of creating a JPA-based data access layer.
https://spring.io/projects/spring-data-jpa/
Apache License 2.0
2.93k stars 1.39k forks source link

Suboptimal specification queries for object arrays #3412

Closed markusheiden closed 3 months ago

markusheiden commented 3 months ago

When using code to filter by an SQL array containing a given element, the null handling in the generated SQL breaks the check.

arrayfunction.zip

SQL generated for PostgreSQL (shortened):

SELECT ... FROM ... WHERE CASE WHEN longs IS NOT NULL THEN COALESCE(array_position(longs, 2), 0) END IS NOT NULL

The generated SQL has two flaws:

@Entity
public class MyEntity {
    // This automatically maps to an SQL array.
    @Column(name = "longs", nullable = false)
    private List<Long> longs;
    ...
}

public interface MyRepository extends Repository<MyEntity, Long> {
    List<MyEntity> findAll(Specification<MyEntity> spec);
}

public class Test
    void test() {
        ...
        var found = repository.findAll(Specification.where(hasAnyLong(2L)));
    }

    private Specification<MyEntity> hasAnyLong(Long l) {
        return (entity, cq, cb) -> {
            Path<MyEntity> longs = entity.get("longs");
            var position = cb.function("array_position", Integer.class, longs, cb.literal(l));
            return cb.isNotNull(position);
         };
    }
}
quaff commented 3 months ago

Returns the 1-based position of an element in the array. Returns 0 if the element is not found and null if the array is null.

You should use cb.ge(position, 1) instead of cb.isNotNull(position).

BTW, It's handled by Hibernate not Spring Data JPA.

markusheiden commented 3 months ago

Sorry for misjudging this.

Your suggested change works. Thanks for that. So this issue might be closed.

I used cb.gt(position, 0) though.

But for PostgreSQL, the documentation says: Returns the subscript of the first occurrence of the second argument in the array, or NULL if it's not present. So the NOT NULL comparison worked for us before. Now we need to switch from user types to the "native" array support of Hibernate due to a Hibernate bug. The native array support generates different SQL for this use case.

quaff commented 3 months ago

https://docs.jboss.org/hibernate/orm/6.4/userguide/html_single/Hibernate_User_Guide.html#hql-array-position-functions

The function array_position is hql function provided by Hibernate, It will overlay native sql function.

markusheiden commented 3 months ago

Thanks for pointing me to the docs. Before I thought, these function calls were calls of native functions of the database.

Then it might be another Hibernate bug because it worked before when using user types to map to SQL arrays. In that case array_position was called directly with no adaption needed to provide the documented behavior.

quaff commented 3 months ago

Then it might be another Hibernate bug because it worked before when using user types to map to SQL arrays. In that case array_position was called directly with no adaption needed to provide the documented behavior.

I guess that works before array_position added, it's introduced since 6.4.0, see https://hibernate.atlassian.net/issues/HHH-17335