Open sergey-morenets opened 2 years ago
Please provide a Minimal Reproducible Example, preferably as a GitHub repository. Make sure to include the database, either as an in-memory database or, if that is not possible, using Testcontainers.
Currently there are various scenarios where I expect one or the other to be faster.
Hi @schauder
This is a repository with a simplified example: https://github.com/sergey-morenets/spring-data-benchmarks
You can open this project in an IDE and run one of the classes, SpringDataJdbcBenchmarking or SpringDataJpaBenchmarking, to execute the benchmarks.
This is a simplified project, so the absolute results are different, but Spring Data JPA is still faster (7016 ns) than Spring Data JDBC (39617 ns).
If I interpret that correctly, you don't measure actual loads with JPA but only the lookup in the first-level cache, which is probably not what you want.
Invoke EntityManager.clear() between benchmarks.
Hi @schauder
Thank you for the comment. I now clear the EntityManager first-level cache at the beginning of the benchmark (I updated the repository):
@Benchmark
public Product springDataJpaQuery() {
    entityManager.clear();
    return productRepository.findByName("phone");
}
However, this had almost no impact on performance (execution time is 7479 ns).
Are there any benchmarks comparing Spring Data JDBC and MyBatis?
Thanks for the reproducer.
The main difference between the two benchmarks is that you accidentally kicked out the Hikari connection pool by explicitly constructing the datasource. You can completely remove any Spring Data JDBC configuration and you'll see a significant performance boost.
It seems we also do not properly cache the results of constructing the SQL statement from the method name, resulting in some significant overhead. You can work around that by providing an explicit query.
The missing caching is something we'll fix.
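To illustrate the kind of caching being described, here is a minimal sketch in plain Java. It is not Spring Data's implementation: the class `DerivedQueryCache`, its method names, and the naive `findBy<Property>` parsing are all hypothetical, chosen only to show that the SQL for a given method name can be derived once and reused on later calls.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: memoizes the SQL derived from a query-method name. */
class DerivedQueryCache {
    private static final String PREFIX = "findBy";

    private final Map<String, String> sqlByMethod = new ConcurrentHashMap<>();
    private final AtomicInteger derivations = new AtomicInteger();
    private final String table;

    DerivedQueryCache(String table) {
        this.table = table;
    }

    /** Returns the cached SQL for the method name, deriving it on first use only. */
    String sqlFor(String methodName) {
        return sqlByMethod.computeIfAbsent(methodName, this::derive);
    }

    /** Number of times the (comparatively expensive) derivation actually ran. */
    int derivationCount() {
        return derivations.get();
    }

    // Assumption: only the simple "findBy<Property>" form is handled, and the
    // column name is the lower-cased property name. Real derivation is far richer.
    private String derive(String methodName) {
        derivations.incrementAndGet();
        if (!methodName.startsWith(PREFIX) || methodName.length() == PREFIX.length()) {
            throw new IllegalArgumentException("Unsupported method name: " + methodName);
        }
        String column = methodName.substring(PREFIX.length()).toLowerCase(Locale.ROOT);
        return "select * from " + table + " where " + column + " = ?";
    }
}
```

With such a cache, repeated calls like `cache.sqlFor("findByName")` pay the derivation cost once; without it, every invocation of the repository method would rebuild the statement, which is the overhead described above.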
Results of the modified benchmark:
Benchmark Mode Cnt Score Error Units
SpringDataJdbcBenchmarking.springDataJdbcQuery avgt 5 10103.975 ± 1140.990 ns/op
SpringDataJpaBenchmarking.springDataJpaQuery avgt 5 8622.891 ± 1886.295 ns/op
You can find the modified benchmark here: https://github.com/schauder/spring-data-benchmarks
In general, I would not expect better performance from Spring Data JDBC compared to JPA implementations in the typical benchmark scenario.
The benefit of Spring Data JDBC is that it is much easier to understand what it is actually doing and therefore easier to use correctly.
This might very well result in better performance of real-world applications because fewer mistakes are made.
Hi @schauder
I returned to the benchmarks topic and used your project. However, I noticed you'd added a @Query annotation to the query method (https://github.com/schauder/spring-data-benchmarks/blob/main/src/main/java/demo/jdbc/ProductRepository.java). Was it done intentionally? It completely changes the query logic:
public interface ProductRepository
        extends CrudRepository<Product, Integer> {
    @Query("select * from products")
    Product findByName(String name);
}
After I removed this annotation
public interface ProductRepository
        extends CrudRepository<Product, Integer> {
    Product findByName(String name);
}
and re-ran the benchmarks, the results were the following:
Benchmark Mode Cnt Score Error Units
SpringDataJdbcBenchmarking.springDataJdbcQuery avgt 5 30283.784 ± 643.013 ns/op
SpringDataJpaBenchmarking.springDataJpaQuery avgt 5 8669.489 ± 410.739 ns/op
So there is again a significant gap between Spring Data JDBC and Spring Data JPA execution times.
Was it done intentionally?
Yes and no. I intentionally added the annotation to demonstrate the effect of not caching the generated query. Changing the query semantics was a mistake on my side.
It seems we also do not properly cache the results of constructing the SQL statement from the method name, resulting in some significant overhead. You can work around that by providing an explicit query. The missing caching is something we'll fix.
@schauder Is this something to expect in the upcoming version 3.0?
Is there any update on this? In Spring Data R2DBC, we've seen about a 10x performance increase by adding the @Query annotation for this query on a very large table (50+ columns), even with a small list of ids (<10):
fun findAllByIdInAndSomethingIsTrue(ids: List<Long>): Flow<MyEntity>
Hi
We have a Spring Data JPA (Hibernate) project that we would like to migrate to Spring Data JDBC. The main reasons are simplified configuration and model mapping. We also expected it to lead to better performance.
However, we ran some benchmarks (with default settings) using JMH against an H2 database, and it turned out that in most cases performance decreased. For example, we have the following repository and query method:
The benchmarks showed the following execution times (in ns):
Spring Data JDBC - 80899
Spring Data JPA - 14124
Is this expected, or did we miss something in our configuration/tests?