Open jamarav opened 6 years ago
There are several issues here (please, open separate threads the next time).
When I evaluate the compressor efficiency [...] This unit is dimensionless.
This is an open issue (see e.g. #123). The thing is that units maintains a unit representation at R level to be able to customise formatting and so on. However, simplification is not properly implemented at this level. On the other hand, udunits has a binary representation with proper simplification, but then you also have to rely on udunits to format units, which is neither ideal nor flexible. For example, with the udunits branch:
units:::R_ut_format(units:::R_ut_parse("J/s/W"))
#> [1] "1"
but also (from #123):
units:::R_ut_format(units:::R_ut_parse("mile/gallon"))
#> [1] "425143.683171079 m⁻²"
There are ongoing efforts to move things (e.g., user-defined units) to the C part, because udunits already manages all the hard work, but there is a trade-off when we consider, as I said, flexibility to format and print units, for instance.
One solution for this would be to provide a function simplify_units that would parse the R representation into udunits, but we would still have to sort out how to parse the result back into R.
For now, you could add the following after your computations:
units(compressor_efficiency) <- 1
This will convert the efficiency to unitless, or fail with an error if units were misused in previous steps.
Is there an option to print it in the following form?
entropy 2175.70 J/(kg*K)
I don't think so. But:
units_options(negative_power=TRUE)
as_units("J/K/kg")
#> 1 J*K^-1*kg^-1
Finally, it is not very important but I would like to know if there is an option to print the units with parentheses:
Power_input 500 (W)
There is another option for this (group, see ?units_options), but it is currently applied to plots only. It may be extended to general formatting. @edzer thoughts?
Units appear now more consistently as e.g. 500 [W], where you can change the [ ] with units_options(group = c("(", ")")).
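To make the effect of a delimiter pair such as group concrete, here is a minimal base-R sketch. The helper format_quantity is hypothetical and written only for illustration; it is not how units formats values internally.

```r
# Hypothetical helper (not part of the units package): wrap a unit symbol
# in a pair of delimiters when printing a value.
format_quantity <- function(value, unit, group = c("[", "]")) {
  paste0(format(value), " ", group[1], unit, group[2])
}

# Default delimiters, as in "500 [W]"
print(format_quantity(500, "W"))
# Parentheses, as requested in the original question
print(format_quantity(500, "W", group = c("(", ")")))
```

The delimiters are passed as a length-two character vector, mirroring the shape of the group = c("(", ")") value mentioned above.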
I'm in favour of making more aggressive simplification possible; I need to look into how we could do this.
This function
to_si <- function(x) {
  u_str = as.character(units(x))
  u = units:::R_ut_parse(u_str)
  ft = units:::R_ut_format(u, ascii = TRUE)
  new = as_units(strsplit(ft, " ")[[1]][2])
  set_units(x, new, mode = "standard")
}
converts to SI units. Shall we use that in case the user actively sets option simplify to TRUE? @Enchufa2 @t-kalinowski
> to_si(set_units(1, gallon/mile))
2.352146e-06 [m^2]
> to_si(set_units(1, gallon*mile))
6.09203 [m^4]
This feels like it should be its own option, perhaps called standardize_to_si. I can think of lots of cases where a user might want to simplify, but not convert to SI.
I agree with @t-kalinowski. I also think that there may be cases in which someone may want to simplify some things and not others, convert to SI some things and not others. So it's nice to have these features as options, but having them as functions would be useful too.
The thing is that we have the opportunity to use simplify = TRUE for this: with simplify = NA by default, right now setting units_options(simplify = TRUE) only influences setting units to numeric:
> units_options(simplify = TRUE)
> set_units(1, mg/kg)
1e-06 [1]
> units_options(simplify = NA)
> set_units(1, mg/kg)
1 [mg/kg]
> units_options(simplify = FALSE)
> set_units(1, mg/kg)
1 [mg/kg]
Further simplification is now done always, by the package, by symbols comparison. We could branch this further and
simplify = NA
simplify = TRUE
to do simplification to SI. To me, this sounds like the simplest and most elegant approach.
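As a rough illustration of what simplification "by symbols comparison" means, the following base-R sketch cancels identical symbols between a numerator and a denominator held as character vectors. The function cancel_symbols is hypothetical and is not the package's implementation; it only shows why m*s/m/s can collapse while mg/kg cannot, since the latter needs an actual conversion.

```r
# Hypothetical sketch (not the units package's code): cancel symbols that
# appear both in the numerator and in the denominator.
cancel_symbols <- function(numerator, denominator) {
  for (sym in numerator) {
    hit <- match(sym, denominator)
    if (!is.na(hit)) {
      # Remove one occurrence from each side.
      denominator <- denominator[-hit]
      numerator <- numerator[-match(sym, numerator)]
    }
  }
  list(numerator = numerator, denominator = denominator)
}

# m*s / (m*s): every symbol cancels, leaving a unitless result
str(cancel_symbols(c("m", "s"), c("m", "s")))
# mg / kg: the symbols differ, so nothing cancels without a conversion
str(cancel_symbols("mg", "kg"))
```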
We have:
> units_options(simplify = TRUE)
> set_units(1, "gallon*in/dgallon")
10 in
> units_options(simplify = NA)
> set_units(1, "gallon*in/dgallon")
1 gallon*in/dgallon
> units_options(simplify = FALSE)
> set_units(1, "gallon*in/dgallon")
1 gallon*in/dgallon
So you mean that the first result should be m, the second one should be in, and the third one should be the same? I'm still not convinced, because converting to SI is more than a simplification, it's, well, a conversion. It could be misleading for the user.
I'm not convinced either about simplify=NA. What does it mean? Missing simplification? Then, it should be equivalent to simplify=FALSE, so why not simplify=FALSE by default?
Another thing you could do is to export to_si and simplify and document them together. Then you can explain there that
simplify=FALSE (which should be the default IMHO) does nothing.
simplify=TRUE applies the simplify function.
simplify=some_function applies that function, e.g., simplify=to_si.
Thanks, valid point about in and conversion to m.
units_options(simplify = FALSE) now turns all simplification off:
> units_options(simplify = FALSE)
> u
2 [m/s]
> u * 1/u
1 [m*s/m/s]
we have NA
for the combination of
set_units(1, mg/kg)
Maybe then add an option, say, convert_to_SI, which when TRUE takes over all symbolic stuff by converting to base SI units?
Another option is what @t-kalinowski proposed, and I think it's fine.
Regarding the name, what about convert_to_base instead? The user could potentially uninstall all SI base units and install CGS units, for example. Then the conversion would be to CGS, not SI. The documentation may reflect that, by default, this "base" is SI units.
OK, we now have
> library(units)
udunits system database from /usr/share/xml/udunits
> units_options(convert_to_base=TRUE)
> set_units(1, gallon/km)
1 [gallon/km]
> set_units(1, gallon/km) * 1 # calls .simplify_units
3.785412e-06 [m^2]
where we convert to base when we simplify. Is that the right place, or should this not happen directly in set_units?
Mmmh, if I set the global option to TRUE, I would expect that the conversion always happens, i.e.:
> library(units)
udunits system database from /usr/share/xml/udunits
> units_options(convert_to_base=TRUE)
> set_units(1, gallon/km)
3.785412e-06 [m^2]
> set_units(1, gallon/km) * 1 # calls .simplify_units
3.785412e-06 [m^2]
That's why I was stressing the need to export simplification functions, including this new to_base, because the user may want to simplify a few results while keeping the global options set to FALSE.
So, I guess this issue can be closed?
What do you think about my concern? Now, if convert_to_base=TRUE, set_units(1, gallon/km) * 1 converts to base but set_units(1, gallon/km) alone does not. I would expect an automatic conversion in both cases.
OK, I'll leave this open; it needs a lot more love & patience to get this convert_to_base running.
Well, apparently I've bumped into an issue that I opened a few years ago. What a coincidence!
I was actually looking for the functionality of convert_base() reported by @edzer.
I think it could be very interesting. Perhaps it is not necessary to consider it as a general option, but as a simple function that we can call in case of need. In the future, if necessary, it could be included as a general option.
I don't have experience with the use of udunits from C and I suppose that as you say there is the possibility to install different base systems.
I have simply taken the function reported a few years ago by @edzer and modified it a bit.
I have conducted several tests and, from what I understood, the string returned when formatting the units can take four forms. For example:
"W" #[1]
"0.001 m" #[2]
"K @ 273.15" #[3]
"0.001 K @ 273150" #[4]
With this in mind the function would be:
convert_to_base <- function(x) {
  u_str = base::as.character(base::units(x))
  u = units:::R_ut_parse(u_str)
  ft = units:::R_ut_format(u, ascii = TRUE)
  ft = base::strsplit(x = ft, split = " @ ")[[1]][1]
  ft = base::strsplit(x = ft, split = " ")[[1]]
  ft = ft[length(ft)]
  new = as_units(ft)
  set_units(x, new, mode = "standard")
}
I think it could be very interesting to include it as a function available to the user. We would have a quick way to be able to convert to SI in case of need.
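The string handling in that function can be checked in isolation against the four forms listed earlier. The sketch below uses only base R; extract_symbol is a hypothetical name for the two strsplit steps, and the inputs are the literal example strings from this comment rather than live output from R_ut_format.

```r
# Hypothetical helper: pull the unit symbol out of a formatted unit string.
extract_symbol <- function(ft) {
  # Drop an offset such as " @ 273.15", if present.
  ft <- strsplit(ft, " @ ", fixed = TRUE)[[1]][1]
  # Drop a leading scale factor such as "0.001 " by keeping the last token.
  parts <- strsplit(ft, " ", fixed = TRUE)[[1]]
  parts[length(parts)]
}

cases <- c("W", "0.001 m", "K @ 273.15", "0.001 K @ 273150")
symbols <- vapply(cases, extract_symbol, character(1), USE.NAMES = FALSE)
print(symbols)
#> [1] "W" "m" "K" "K"
```

Taking the last token instead of the second one is what lets the first and third forms work, since they carry no scale factor.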
Well, it seems that the function I reported above gives wrong results in some cases. For example:
library(units)
library(magrittr) # provides %>%
convert_to_base <- function(x) {
  R_ut_parse = utils::getFromNamespace("R_ut_parse", "units")
  R_ut_format = utils::getFromNamespace("R_ut_format", "units")
  u_str = as.character(base::units(x))
  u = R_ut_parse(u_str)
  ft = R_ut_format(u, ascii = TRUE)
  ft = strsplit(x = ft, split = " @ ")[[1]][1]
  ft = strsplit(x = ft, split = " ")[[1]]
  ft = ft[length(ft)]
  new = units::as_units(ft)
  units::set_units(x, new, mode = "standard")
}
x <- set_units(25, "g/mol")
x %>% convert_to_base()
# 40 [1/kg.mol]
I suppose the solution will be simple, but I don't know the internal function that takes care of these problems. Any idea?
Perhaps something like this:
library(units)
convert_to_base <- function(x) {
  canonicalize <- function(s) {
    s |>
      R_ut_parse() |>
      R_ut_format(TRUE, TRUE, TRUE) |>
      gsub(" ", " * ", x = _)
  }
  u <- units(x)
  u <- sprintf(
    "( %s ) / ( %s )",
    canonicalize(u$numerator),
    canonicalize(u$denominator)
  )
  # message(u)
  u <- as_units(str2lang(u))
  u <- u / as.numeric(u)
  # message(class(u))
  # str(unclass(u))
  units(x) <- u
  x
}
environment(convert_to_base) <- asNamespace("units")
x <- set_units(25, "g/mol")
convert_to_base(x)
#> 0.025 [kg/mol]
set_units(25, ug/mol) |> convert_to_base()
#> 2.5e-08 [kg/mol]
set_units(25, mg/mol) |> convert_to_base()
#> 2.5e-05 [kg/mol]
set_units(25, g/mol) |> convert_to_base()
#> 0.025 [kg/mol]
set_units(25, kg/mol) |> convert_to_base()
#> 25 [kg/mol]
Thank you very much @t-kalinowski for the suggestions. The code you reported has helped me a lot. Unfortunately, it is perhaps a bit more complicated because of the freedom the user has. Your code, for example, strictly requires that neither the numerator nor the denominator is character(0). I have been doing some tests these days and have implemented the following function, which I think covers all cases.
convert_to_base <- function(x, simplify = T, merge_num_den = F) {
  R_ut_parse <- utils::getFromNamespace("R_ut_parse", "units")
  R_ut_format <- utils::getFromNamespace("R_ut_format", "units")
  u_strBase <- function(u_str, spfy = T) {
    u_new <- u_str |>
      R_ut_parse() |>
      R_ut_format(names = F, definition = T, ascii = T)
    u_new <- strsplit(x = u_new, split = " @ ")[[1]][1]
    u_new <- strsplit(x = u_new, split = " ")[[1]]
    u_new <- u_new[length(u_new)]
    if (spfy) {
      u_new <- u_new |>
        R_ut_parse() |>
        R_ut_format(names = F, definition = F, ascii = T)
    }
    u_new <- u_new |>
      gsub(".", " ", fixed = T, x = _)
    return(u_new)
  }
  u <- base::units(x)
  u <- sapply(u, function(i) paste0(i, collapse = "*", recycle0 = T))
  u[u == ""] <- "1"
  u["numerator"] <- sprintf("(%s)", u["numerator"])
  u["denominator"] <- sprintf("(%s)", u["denominator"])
  if (merge_num_den) u <- paste(u, collapse = "/")
  u_base <- sapply(u, function(j) u_strBase(u_str = j, spfy = simplify))
  if (merge_num_den) {
    u_base <- sprintf("(%s)", u_base)
  } else {
    unitless <- (u_base == "1")
    u_base["numerator"] <- sprintf("(%s)", u_base["numerator"])
    u_base["denominator"] <- sprintf("(%s)-1", u_base["denominator"])
    u_base <- u_base[!unitless]
    u_base <- paste(u_base, collapse = " ")
  }
  units::set_units(x, u_base, mode = "standard", implicit_exponents = T)
}
The basic operation is as follows:
A units object is passed to the function. It internally captures its units and creates a vector u that distinguishes between the numerator and denominator. Then, the u_strBase function takes care of converting to base units. During my tests, I found that setting names = F in R_ut_format simplifies the output format so it can be easily reformatted later to apply the unit conversion with set_units. But the most important thing for converting to base units is setting definition = T. Furthermore, some splits of the R_ut_format output are required to identify the part of the string that refers only to the units. Once that part is captured, the reformatting is quite simple. With names = F, you only have to replace the multiplication, represented by ".", with a space. Then I concatenate the numerator and denominator units and call set_units with implicit_exponents = T.
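The final concatenation step can be tried on its own with plain strings. The sketch below is base R only; combine_num_den is a hypothetical name for the else-branch of the function above, and the inputs are hand-written stand-ins for what u_strBase would return.

```r
# Hypothetical helper mirroring the concatenation step: build an
# implicit-exponent string such as "(J) (kg)-1" from two unit strings.
combine_num_den <- function(num, den) {
  parts <- c(numerator = num, denominator = den)
  # A side equal to "1" carries no units and is dropped below.
  unitless <- parts == "1"
  parts["numerator"] <- sprintf("(%s)", parts["numerator"])
  parts["denominator"] <- sprintf("(%s)-1", parts["denominator"])
  paste(parts[!unitless], collapse = " ")
}

print(combine_num_den("J", "kg"))
#> [1] "(J) (kg)-1"
print(combine_num_den("kg m2 s-2", "1"))
#> [1] "(kg m2 s-2)"
```

The unitless check is done before the parentheses are added, so a bare "1" on either side never reaches the final string.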
Furthermore, I implemented some other functionality. convert_to_base takes two optional arguments: simplify and merge_num_den.
The option simplify enables a second call to R_ut_format. I found that by chaining two calls to R_ut_format, we can:
convert to base units (definition = T in R_ut_format).
simplify the resulting units (definition = F in R_ut_format).
That is why I have left simplify=T as default.
Concerning the second option, merge_num_den, it allows merging numerator and denominator before calling u_strBase. This is only useful in certain cases, such as converting kJ/s to W. This is how I first approached this function, but after some testing, I found that it is much more consistent to let R_ut_format apply separate simplifications to the numerator and the denominator. That is why I have left merge_num_den=F as default, since it is only useful in some cases and gives worse results in others. A clear example is the enthalpy (kJ/kg): applying the simplifications without distinguishing numerator from denominator, we get "Gy" (Gray: J/kg), which in my case makes little sense when talking about enthalpies.
@edzer @Enchufa2 and @t-kalinowski, I hope this will help you implement the convert_base function and include it in future package versions. I think the function I report is working consistently, but I am open to any suggestions for improvement.
Finally, here are some tests I have carried out on this function:
u <- "kJ/kg"
set_units(32, u, mode = "standard") |> convert_to_base()
#> 32000 [J/kg]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T)
#> 32000 [Gy]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = F, simplify = F)
#> 32000 [kg*m^2/kg/s^2]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T, simplify = F)
#> 32000 [m^2/s^2]
u <- "fahrenheit"
set_units(32, u, mode = "standard") |> convert_to_base()
#> 273.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T)
#> 273.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = F, simplify = F)
#> 273.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T, simplify = F)
#> 273.15 [K]
u <- "celsius"
set_units(32, u, mode = "standard") |> convert_to_base()
#> 305.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T)
#> 305.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = F, simplify = F)
#> 305.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T, simplify = F)
#> 305.15 [K]
u <- "degree_C"
set_units(32, u, mode = "standard") |> convert_to_base()
#> 305.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T)
#> 305.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = F, simplify = F)
#> 305.15 [K]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T, simplify = F)
#> 305.15 [K]
u <- "kJ/(kg*fahrenheit)"
set_units(32, u, mode = "standard") |> convert_to_base()
#> 57600 [J/K/kg]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T)
#> 57600 [m^2/K/s^2]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = F, simplify = F)
#> 57600 [kg*m^2/K/kg/s^2]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T, simplify = F)
#> 57600 [m^2/K/s^2]
u <- "J/s"
set_units(32, u, mode = "standard") |> convert_to_base()
#> 32 [J/s]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T)
#> 32 [W]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = F, simplify = F)
#> 32 [kg*m^2/s^3]
set_units(32, u, mode = "standard") |> convert_to_base(merge_num_den = T, simplify = F)
#> 32 [kg*m^2/s^3]
This definitely helps. Thanks all for the discussion and prototypes. I'll try to find some time to put things together. But this will be during the next half-term, because I'm a bit overloaded now.
Good afternoon. I'm starting to use the units library (I think it's a very useful tool), and I have a couple of questions about it. I'm writing a small script to evaluate the performance of refrigeration compressors. I have the following problem: I evaluate the compressor efficiency with the following expression:
compressor_efficiency<-(mref*Dhs)/Wcomp
Previous to this calculation I define the following variables:
When I evaluate the compressor efficiency:
compressor_efficiency<-(mref*Dhs)/Wcomp
The units are:
This unit is dimensionless. Is there any way to make the package interpret J/s as W internally? The result would be:
Another question: when I define a new variable, entropy:
entropy<-set_units(vector_data_entropy, 'J/(kg*K)')
The printed result is:
Is there an option to print it in the following form?
I think it is much clearer this way, separating numerator and denominator.
Finally, it is not very important but I would like to know if there is an option to print the units with parentheses:
Thanks in advance