jakartaee / persistence

https://jakartaee.github.io/persistence/
Other
196 stars 58 forks source link

introduce several "missing" numeric functions #351

Closed gavinking closed 2 years ago

gavinking commented 2 years ago

This adds sign(), exp(), power(), ln(), round(), floor(), ceiling() to the list of functions portably supported in JPQL. Each of these functions (or something very similar) is available on essentially every SQL database, and are needed in any language which is intended to be used for any sort of nontrivial data analysis. In particular, the natural logarithm and functions for rounding are pretty essential.

I have (against my own preference) not included trigonometric functions.

gavinking commented 2 years ago

Would it make sense to add these functions to the CriteriaBuilder as well?

Yes, of course, indeed it does, and I'll add that to the PR.

gavinking commented 2 years ago

Would it make sense to add these functions to the CriteriaBuilder as well?

Yes, of course, indeed it does, and I'll add that to the PR.

Done.

gavinking commented 2 years ago

@lukasj do you want me to also add extract() to CriteriaBuilder? It's slightly more complex because I'll have to add an enum with the supported field types, but other than that, easy enough.

(I would do it in a separate PR though.)

gavinking commented 2 years ago

It's slightly more complex because I'll have to add an enum with the supported field types, but other than that, easy enough.

Hrm, in fact, it's a little tricky because not all the field types have the same result type, but it's still doable.

pzygielo commented 2 years ago

Would it be possible to have method names consistent with the ones from java.lang.Math?

gavinking commented 2 years ago

(I would do it in a separate PR though.)

See https://github.com/eclipse-ee4j/jpa-api/pull/356/files for a rough draft.

gavinking commented 2 years ago

Would it be possible to have method names consistent with the ones from java.lang.Math?

I mean, it's certainly possible, though I'm unclear on why that would be advantageous.

Speaking for myself, as a general principle I always avoid abbreviations unless the abbreviated usage is really completely ingrained, as is the case with avg(), for example.

It's not that case that pow() and ceil() are "ingrained" abbreviations: pow() isn't used in any dialect of SQL, and ceil() is used in only a minority of dialects.

And on the other hand, I very deliberately avoided log() instead of ln() because in many dialects of SQL, and in engineering, and often even in physics, log() means the base-10 logarithm. Now, I was educated as a mathematician so I write the natural logarithm as log(), but even in math it's recognized as ambiguous. So I think it's much better to use ln() here.

gavinking commented 2 years ago

(I should add, because perhaps it's not completely obvious, that the list of functions defined by JPQL is inspired much more by SQL than by Java.)

pzygielo commented 2 years ago

Would it be possible to have method names consistent with the ones from java.lang.Math?

I mean, it's certainly possible, though I'm unclear on why that would be advantageous.

And on the other hand, I very deliberately avoided log() instead of ln() because in many dialects of SQL, and in engineering, and often even in physics, log() means the base-10 logarithm. Now, I was educated as a mathematician so I write the natural logarithm as log(), but even in math it's recognized as ambiguous. So I think it's much better to use ln() here.

By methods I meant methods in CriteriaBuilder. Is its implementation expected to be used to write native SQL queries or rather be used in Java program? If the second - I'd just prefer to not re-learn new names.


As for log only - it's not just in j.l.Math:

C:

NAME
       log, logf, logl - natural logarithmic function

SYNOPSIS
       #include <math.h>

       double log(double x);
       float logf(float x);
       long double logl(long double x);

DESCRIPTION
       These functions return the natural logarithm of x.

Python:


math.log(x[, base])

    With one argument, return the natural logarithm of x (to base e).

Ruby:

 log(x) → Float
log(x, base) → Float

Returns the logarithm of x. If additional second argument is given, it will be the base of logarithm. Otherwise it is e (for the natural logarithm).

Haskell, ES6 - the same.

bc doesn't use ln neither.

Perhaps there are programming languages that do.


Maybe it's time and place to go with ln. Or have methods inconsistent with j.l.Math. Just saying - I don't like it.

gavinking commented 2 years ago

Is its implementation expected to be used to write native SQL queries or rather be used in Java program?

The philosophy of criteria queries in JPA is very much to be a faithful reification of JPQL in Java objects. The methods of CriteriaBuilder have the same names as the built-in JPQL functions.

Since JPQL is substantially based on SQL (not Java!), the functions are modeled after SQL functions, and so that's where the methods of CriteriaBuilder ultimately derive from.

Since log() is quite ambiguous in SQL (for example, log() in PostgreSQL means the base-10 logarithm), I was careful to pick a name that doesn't carry any possibility of confusion.

pzygielo commented 2 years ago

@gavinking - thank you for details/reasoning provided.

Tomas-Kraus commented 2 years ago

I'm implementing this in EclipseLink. Also we may need few JPA TCK tests to verify those API changes. :) Are there any volunteers?

Just few notes from my EclipseLink trials and errors:

@gavinking How do you handle this in Hibernate? Do you throw an exception when related database function is missing? Maybe spec can also define what to do when DB support is missing like in Derby/POWER,ROUND case.

gavinking commented 2 years ago

@Tomas-Kraus

  • Calling ln(n) for n<=0 returns null on some databases (MySQL) and throws an exception on others (Derby,Postgres)

Interesting, I had not noticed this.

In principle we could mandate the MySQL-like behavior, by translating ln(d) to ln(case when d<0 then null else d end) on Postgres and Derby, but I'm not sure it's worth the effort.

I would class this difference in behavior as fairly harmless, but I'm interested to hear what others think.

Derby DB does not have POWER and ROUND functions so they should not be mandatory

I'm totally fine with making these things non-mandatory, but FYI, in our implementation we emulate

on Derby.

Tomas-Kraus commented 2 years ago

@gavinking Yes, floor(x*10^n+0.5)/10^n works fine on Postgres. Unfortunately Derby emulations are bit more complicated.