r-quantities / units

Measurement units for R
https://r-quantities.github.io/units
175 stars 28 forks source link

Small numbers don't work with some logical operators #351

Closed ekatko1 closed 1 year ago

ekatko1 commented 1 year ago

I discovered a wierd bug where logical comparisons with small numbers silently fail.

Example:

1e-20 < 1e-10
# returns TRUE

set_units(1e-20, g) < set_units(1e-10,g)
# returns FALSE

Here is a little script to get more details

triangle <- function(x) {
  for (i in 1:length(x)) {
    print(x[which(x < x[i])])
  }
}

l = 10^(c(-9:-4))
u = set_units(l, g)

triangle(l)
triangle(u)

It seems that in this case, when the magnitude of the number on the RHS is 1e-7 or less, the comparison fails.

I am running R 4.3.1 and units-0.8-2 on the latest Linux Mint Cinnamon.

> R.version

platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          4                           
minor          3.1                         
year           2023                        
month          06                          
day            16                          
svn rev        84548                       
language       R                           
version.string R version 4.3.1 (2023-06-16)
nickname       Beagle Scouts               
edzer commented 1 year ago

Yes, that happens here: https://github.com/r-quantities/units/blob/main/src/udunits.cpp#L115 , and I'm not sure that is a good idea. What do you think, @Enchufa2 ?

Enchufa2 commented 1 year ago

The thing is... we need to convert units in order to compare values, but udunits2 has float precision. So that line referenced above is set so that things like set_units(3.6, km/h) == set_units(1, m/s) return TRUE. The side effect is that we lose precision for other comparisons too.

We could detect cases in which no conversion is involved and avoid using float tolerance in such cases. But then we are just chasing individual cases, and the example above would be TRUE, while set_units(1e-23, kg) < set_units(1e-10,g) would still be FALSE.

edzer commented 1 year ago

I'm pretty sure that also for using floats, 1e-20 < 1e-10 is TRUE.

edzer commented 1 year ago

See also https://stackoverflow.com/questions/48133572/what-can-stdnumeric-limitsdoubleepsilon-be-used-for

edzer commented 1 year ago

For 3.6 km/h sometimes not equaling 1 m/s we could refer to R faq 7.31.

Enchufa2 commented 1 year ago

Ok, since R users should be accustomed to make comparisons with tolerances via all.equal, let's remove that epsilon and restore the expected behavior for logical operators. It's still weird that set_units(1, m/s) == set_units(3.6, km/h) is TRUE but set_units(3.6, km/h) == set_units(1, m/s) is FALSE, but I suppose that's the price to pay to be consistent with base R.

Enchufa2 commented 1 year ago

I'm pretty sure that also for using floats, 1e-20 < 1e-10 is TRUE.

BTW, you are probably thinking of doubles, not floats:

Rcpp::evalCpp("std::numeric_limits<float>::epsilon()")
#> [1] 1.192093e-07
Rcpp::evalCpp("std::numeric_limits<double>::epsilon()")
#> [1] 2.220446e-16

But anyway I should have scaled that epsilon as the thread you link above points out. I left the proper code commented in #353 just in case we need to revisit this in the future in the C++ side.

edzer commented 1 year ago

It's still weird that set_units(1, m/s) == set_units(3.6, km/h) is TRUE but set_units(3.6, km/h) == set_units(1, m/s) is FALSE

Not if you consider that before comparison a conversion takes place that depends on the order of arguments (right is converted to the left). So the conversion is different in both cases.

Enchufa2 commented 1 year ago

Yes, of course, but the user doesn't necessarily know that this happens, so I expect this to be surprising.