r-quantities / units

Measurement units for R
https://r-quantities.github.io/units
175 stars 28 forks source link

The method set_units.mixed_units doesn't preserve a vector's ordering #271

Closed kgluborg closed 3 years ago

kgluborg commented 3 years ago

I deal with a lot of identical data coming from different sources, each using different units. The mixed_units class and function are very convenient to harmonize everything but since I upgraded to units v0.7 I have found that applying set_units() to a mixed_units vector now mess up the ordering of the initial vector.

library(units)

> Warning: package 'units' was built under R version 4.0.4

> udunits database from H:/HOME/Personal/R/win-library/4.0/units/share/udunits/udunits2.xml

a <- 3001:3010 b <- rep(c('m', 'yard'), 5) c <- mixed_units(a,b)

c

> Mixed units: m (5), yard (5)

> 3001 [m], 3002 [yard], 3003 [m], 3004 [yard], 3005 [m], 3006 [yard], 3007 [m], 3008 [yard], 3009 [m], 3010 [yard]

set_units(c, 'km', mode = "standard")

> Mixed units: km (10)

> 3.001 [km], 3.003 [km], 3.005 [km], 3.007 [km], 3.009 [km], 2.745029 [km], 2.746858 [km], 2.748686 [km], 2.750515 [km], 2.752344 [km]

The order of values is not preserved, the old 'm' are at the head of vector and the old 'yard' at the tail

The quick fix I had to use to preserve the sequence:

library(purrr) set_units(map2_dbl(c,'km', function(x,y) set_units(x,y,mode = 'standard')), 'km', mode = 'standard')

> Units: [km]

> [1] 3.001000 2.745029 3.003000 2.746858 3.005000 2.748686 3.007000 2.750515

> [9] 3.009000 2.752344

Enchufa2 commented 3 years ago

Thanks for the report.

@edzer This regression is worth a patch release.

edzer commented 3 years ago

submitted 0.7-1...

edzer commented 3 years ago

Ah, we overlooked these:

Dear maintainer,

package units_0.7-1.tar.gz has been auto-processed.
The auto-check found additional issues for the *last* version released on CRAN:
  rcnst <https://github.com/kalibera/cran-checks/tree/master/rcnst/results/units>
  valgrind <https://www.stats.ox.ac.uk/pub/bdr/memtests/valgrind/units>
CRAN incoming checks do not test for these additional issues and you will need an appropriately instrumented build of R to reproduce these.
Hence please reply-all and explain: Have these been fixed?

Log dir: <https://win-builder.r-project.org/incoming_pretest/units_0.7-1_20210311_102122/>
The files will be removed after roughly 7 days.
Installation time in seconds: 65
Check time in seconds: 156 
R Under development (unstable) (2021-03-10 r80084)

Pretests results:
Windows: <https://win-builder.r-project.org/incoming_pretest/units_0.7-1_20210311_102122/Windows/00check.log>
Status: OK
Debian: <https://win-builder.r-project.org/incoming_pretest/units_0.7-1_20210311_102122/Debian/00check.log>
Status: OK
edzer commented 3 years ago

I can't easily find the rcnst problem in units (happens in dplyr?), but valgrind seems to indicate sth with mixed units.

Enchufa2 commented 3 years ago

The rcnst issue is in dplyr. That check was done with dplyr 1.0.4. The new version 1.0.5 should have solved it, but CRAN has to re-run the check.

The valgrind issue is new. Quite strange, because nothing has changed in our C++ side. Let me investigate.

Enchufa2 commented 3 years ago

Ok, it's the same problem we had at exit due to the particular memory management model udunits2 has, which requires us to force weak finalizers before freeing the sys object. Now we have that same problem at startup, because now we allow changing the units system via load_units_xml(). So I've centralized the call to the garbage collector in https://github.com/r-quantities/units/commit/3e451b3e35e98f88250c3482401936a4dc5e3000, and there should be no valgrind issues now.