spring-projects / spring-data-jpa

Simplifies the development of creating a JPA-based data access layer.
https://spring.io/projects/spring-data-jpa/
Apache License 2.0
3.01k stars 1.42k forks source link

Pageable sorting by join property excludes null matches [DATAJPA-252] #664

Closed spring-projects-issues closed 11 years ago

spring-projects-issues commented 12 years ago

Shelley J. Baker opened DATAJPA-252 and commented

When using the PagingAndSortingRepository with a Pageable query, attempts to sort by the join property will exclude entities where the join condition is absent.

From JSR-317 4.4.5.2:

LEFT JOIN and LEFT OUTER JOIN are synonymous. They enable the retrieval of a set of entities where matching values in the join condition may be absent.

According to standard SQL and JPA specs, the left outer join should always contain all entities from the "left" table even if the join condition is absent. Therefore, the exclusion of the "left" entities when there is no matching value in the "right" table seems incorrect.

For example, given an entity "Person" with optional OneToOne relationship "Address" (where Address is the owner), any attempts to retrieve pages of Person entities from the repository that are ordered by an Address property will only return Person entities that have non-null Addresses. This is only a problem for Pageable and Sort queries; using the same @Query with a non-pageable/sort works as expected.

// Includes Persons with no Addresses and sorts by address.city, as expected
@Query("select p from Person p left outer join p.address address order by address.city")
List<Person> findPersonOrderByCity();

// Excludes Persons with no Addresses when sorting by an address property
@Query("select p from Person p left outer join p.address address")
Page<Person> findPerson(Pageable pageable);

// Excludes Persons with no Addresses when sorting by an address property    
@Query("select p from Person p left outer join p.address address")
List<Person> findPerson(Sort sort);

The resulting core sql statements (simplified slightly for brevity) are as follows; notice the additional FROM clause incorrectly added to the pageable query:

@Query("select p from Person p left outer join p.address address order by address.city")
List<Person> findPersonOrderByCity();
select p.id, p.first_name, p.last_name from person p left outer join address a on p.id=a.person order by a.city

@Query("select p from Person p left outer join p.address address")
Page<Person> findPerson(Pageable pageable);
select p.id, p.first_name, p.last_name from person p left outer join address a on p.id=a.person, address a2 where p.id=a2.person order by a2.city asc

A full test case is attached. This has been tested with spring-data-jpa-1.1.1.RELEASE and hibernate-4.1.5.SP1


Affects: 1.1.2, 1.2 RC1

Attachments:

2 votes, 4 watchers

spring-projects-issues commented 12 years ago

Oliver Drotbohm commented

I fear we have to forward this to the Hibernate bug tracker as we essentially call em.createQuery(…) and setFirstResult(…) / setMaxResults(…) on it for manually defined queries. We're not messing with the query at all. Would you mind opening a ticket against Hibernate and reduce the test case to use the plain JPA API? The resulting queries should be the same

spring-projects-issues commented 12 years ago

Shelley J. Baker commented

I've reduced the testcase to using only hibernate/jpa, but the result sets come back as expected even when setting the query's first result and max results, so the problem is not with hibernate.

After a bit more debugging into spring data jpa, it looks like the problem may be that the Sort column is always prefixed with the alias of the primary entity being retrieved. In other words, when QueryUtils applies the sorting, it results in the following query:

select p from Person p left outer join p.address address order by p.address.city asc

instead of the desired query that utilizes the join alias:

select p from Person p left outer join p.address address order by address.city asc

Is there any way to prevent prefixing the order by column, or to perform such a query in spring-data-jpa without having to implement our own pagination support?

spring-projects-issues commented 12 years ago

Fischer Matte commented

We have the same issue with spring data in our application. Whenever we sort over a ManyToOne property, rows with a null value in that column are filtered. When sorting over another column, they are included again.

spring-projects-issues commented 12 years ago

Oliver Drotbohm commented

We're generating order clauses according to the JPA specification section 4.9. We could slightly change the behavior to the following without breaking too much of the JPA compatibility:

If the sorting path (address.city in you case) is a valid property path on the entity, we prefix it with the entity alias. If not, we don't.

This would still not solve your special use case here as the sample you show accidentally uses and alias for the joined property that is also a property of the base entity (address). But I guess you could easily change that to something like

select p from Person p left outer join p.address a

Note that a usage of this advance behavior pretty much breaks the abstraction as one now has to know details of the annotated query when handing in a Sort instance. I'll probably point the Hibernate guys to this ticket as I still don't get entirely why the SQL query resolved is so different as the property paths of the sort criteria actually reference the very same thing.

Do other JPA (EclipseLink, OpenJPA) providers actually suffer from the same problems?

spring-projects-issues commented 12 years ago

Oliver Drotbohm commented

It turns out that the seen behavior is an implication of the JPA spec defining property expressions to always cause inner joins. I've brought this to the attention of the JPA expert group as I think this unfortunate (to phrase it politely). I'll discuss a few options regarding an improved Sort binding with the team and will report back

spring-projects-issues commented 12 years ago

Shelley J. Baker commented

Thanks for the prompt responses and looking into this.

If the sorting path (address.city in you case) is a valid property path on the entity, we prefix it with the entity alias. If not, we don't. . . . But I guess you could easily change that to something like

select p from Person p left outer join p.address a

This would provide us with a temporary workaround (I agree, however, that it's not ideal as you mentioned since it would break the abstraction); however, it appears that the SimpleJpaQuery/QueryUtils still does prefix the sort with the alias, even if it doesn't exist as a property of the base entity.

// The resulting sorted query string for the following, when the sort is "a.city" is:
// select p from Person p left outer join p.address a order by p.a.city asc
@Query("select p from Person p left outer join p.address a")
Page<Person> findPerson(Pageable pageable);
spring-projects-issues commented 12 years ago

Oliver Drotbohm commented

The idea I posted was a suggestion to discuss, not saying it is now implemented like this. I also though about introducing a qualifying annotation to give the user more control over how the Sort parameter is bound. In general the sorting abstraction is inted to be used to define sort attributes based on the root entity type, hence binding. Every reference to something else defined in the query (e.g. your join alias) will leak the query internals out of the repository.

As a workaround I recommend manually implementing the method like described in the reference documentation as it seems to be the only way to keep this implementation detail inside the repository layer.

Depending on the outcome of the discussion with the team we might also decide to not change anything at all as the way the code currently behaves is not wrong per se but influenced by the way JPA defines path expressions to be applied

spring-projects-issues commented 12 years ago

Oliver Drotbohm commented

This is fixed in the current snapshot, feel free to give it a spin. We now inspect the manually defined query for aliases used in left (outer) join expressions and skip prefixing if the property referenced in the Sort instance starts with the one of the detected aliases. This essentially expects you to know what you're doing (I assume you do :)). If you're aliasing the outer join to an alias that is equivalent to a root entity's property (just as in your scenario above) the behavior now is transparent to the repository's clients. So in the default use case the fact that you reference a join alias instead of a property does not leak into the client

spring-projects-issues commented 12 years ago

Shelley J. Baker commented

Thanks for resolving this so quickly. I took the snapshot for a spin and it works like a charm.

For reference to anyone else running into this problem in the meantime, to workaround this issue, I just added some methods to the repository that explicitly define the "order by" in the query for the problem properties. In the service from where we invoke the repository method, I just check to see if the requested order is one of the problem properties and call the alternate repository method with a copy of the pageable.

public Page<Person> findPerson(Pageable pageable) {

    Order order = pageable.getSort() != null ? pageable.getSort().iterator().next() : null;
    if (order != null && "address.city".equals(order.getProperty())) {
        Pageable orderlessPageable = new PageRequest(pageable.getPageNumber(), pageable.getPageSize());
        final Page<Person> pages;
        if (order.isAscending()) {
            pages = personRepository.findPersonOrderByCityAsc(orderlessPageable);
        } else {
            pages = personRepository.findPersonOrderByCityDesc(orderlessPageable);
        }
        return new PageImpl<Person>(pages.getContent(), pageable, pages.getTotalElements());
    }

    return personRepository.findPerson(pageable);
}

This isn't a robust solution but was a simple and feasible workaround in my case because we currently only need to support this type of sorting for one specific use case at this time. Also, the repository and code is internal so there isn't currently a need for us to be any more robust.

This might not work for everyone running into this problem but was a reasonable workaround for our use case, which we'll use until spring-data-jpa 1.2 is released with this fix

spring-projects-issues commented 11 years ago

P. J. Reed commented

I've found a case in which this is still not fixed. If you have a repository that extends JpaSpecficationExecutor and use the findAll method with a Specification, the resulting query still excludes null matches. I've updated the original reporter's test case to test this against the latest version and verify that it's broken. I'll attach it momentarily...

spring-projects-issues commented 11 years ago

P. J. Reed commented

Test case that shows this bug still exists in version 1.2 for repositories that extend JpaSpecificationExecutor

spring-projects-issues commented 11 years ago

Oliver Drotbohm commented

Is there a chance you open a new ticket for this? This ticket here is about using JPQL defined queries, not about specifications. The fix for this ticket has exactly fixed that. Especially as this fix has already been shipped I don't want to keep the ticket open for a different aspect of the problem

spring-projects-issues commented 11 years ago

P. J. Reed commented

Done; see DATAJPA-277