Open bergsmat opened 4 years ago
I think they should. Thanks for the report!
I disagree: the notations either use ^
and /
, or none of them and use space as multiplication; it should be
> as_units('kg m-2')
1 [kg/m^2]
the kg/m2
is a mix of both, where the 2 is not interpreted as square.
So still there's a mismatch, because it is interpreted as a square in the numerator.
> as_units('m2')
1 [m^2]
And it should be too in the denominator because anyway udunits interprets m2
as a square:
> as_units('m^2') * as_units('kg/m2')
1 [kg*m^2/m2]
> set_units(as_units('m^2') * as_units('kg/m2'), kg)
1 [kg]
We can try to have the case as_units('kg/m2')
generate a helpful error.
But why if udunits supports this?
> units:::R_ut_format(units:::R_ut_parse("m^2"))
[1] "m²"
> units:::R_ut_format(units:::R_ut_parse("m2"))
[1] "m²"
> units:::R_ut_format(units:::R_ut_parse("kg/m^2"))
[1] "m⁻²·kg"
> units:::R_ut_format(units:::R_ut_parse("kg/m2"))
[1] "m⁻²·kg"
You have a point there. So we either should support it too, or not support it and generate an error. Maybe second option on the short term, first on the long-term?
I can see where this comes from: we're using R's expression evaluation mechanism for things involving /
or ^
, and parse ourselves if not. R's expression evaluation sees m2
as a symbol, different from m^2
.
The only thing that is "broken" is the internal representation ("m2" in the denominator instead of c("m", "m")), and therefore the formatting. But given that uduntis supports "m2" as well, the behaviour is correct: it's working just fine, as the examples above demonstrate. So, in my opinion, raising an error now breaks backwards compatibility: there may be scripts out there using the "m2" notation both in numerators and denominators (disclaimer: I always use this notation, I'm lazy enough to avoid the "^" as much as possible).
@t-kalinowski Maybe you can take a look at this. The problem is here, which git blame
attributes to you: :)
For as_units("m2")
, are_exponents_implicit
returns TRUE
, and thus m2
is converted to m^(2)
below. But for as_units("m2/kg")
, returns FALSE
, and m2
is not converted. I think we always want to perform those conversions, right?, but convert_implicit_to_explicit_exponents
doesn't want either *
or /
.
I'm not fully remembering the intricacies of this, maybe you can help me remember why this is a dumb idea:
can we just always insert a "^"
anywhere a numeric is adjacent to non numeric that's not an operator?
E.g., some kind of regex that does:
make_all_exponents_explicit <- function(x)
gsub("([^\\d^*/]+)\\s*\\^?\\s*([\\d+\\.]?\\d+)", "\\1^\\2", x)
sapply(c("m^2", "m2", "kg/m^2", "kg/m2"),
make_all_exponents_explicit)
#> m^2 m2 kg/m^2 kg/m2
#> "m^2" "m^2" "kg/m^2" "kg/m^2"
Created on 2021-02-20 by the reprex package (v1.0.0)
Still need to handle negative exponents somehow.
If you don't remember, we have a problem. :grimacing: :dancers:
This is problematic:
as_units("m-2/g")
#> Error: cannot convert 1/g into m
#> Did you try to supply a value in a context where a bare expression was expected?
as_units("m-2/g", implicit_exponents=TRUE)
#> Error in convert_implicit_to_explicit_exponents(x) :
#> If 'implicit_exponents = TRUE', strings cannot contain `*' or `/'
No way forward with the current implementation.
Good thing this project has nice suite of unit tests! :relieved:
It's coming back to me. There used to be two completely different code paths for implicit and explicit exponents. Over time those converged quite a bit, and perhaps it's time to converge them even further. How about something like this:
make_all_exponents_explicit <- function(x)
gsub("([^\\d^*-/]+)\\s*\\^?\\s*(-?[\\d+\\.]?\\d+)", "((\\1)^(\\2))", x)
sapply(c("m^2", "m2", "kg/m^2", "kg/m2", "m-2/g", "g/m-2"),
make_all_exponents_explicit)
#> m^2 m2 kg/m^2 kg/m2 m-2/g
#> "((m)^(2))" "((m)^(2))" "kg/((m)^(2))" "kg/((m)^(2))" "((m)^(-2))/g"
#> g/m-2
#> "g/((m)^(-2))"
and the idea being that every supplied unit string has make_all_exponents_explicit
called on it before it is parsed.
Created on 2021-02-20 by the reprex package (v1.0.0)
I'm happy to submit a PR for this, but it might be about 2-3 weeks before I'll have time to sit down and actually work on this.
It's coming back to me. There used to be two completely different code paths for implicit and explicit exponents. Over time those converged quite a bit, and perhaps it's time to converge them even further.
Convergence sounds good to me. :)
How about something like this:
make_all_exponents_explicit <- function(x) gsub("([^\\d^*-/]+)\\s*\\^?\\s*(-?[\\d+\\.]?\\d+)", "((\\1)^(\\2))", x) sapply(c("m^2", "m2", "kg/m^2", "kg/m2", "m-2/g", "g/m-2"), make_all_exponents_explicit) #> m^2 m2 kg/m^2 kg/m2 m-2/g #> "((m)^(2))" "((m)^(2))" "kg/((m)^(2))" "kg/((m)^(2))" "((m)^(-2))/g" #> g/m-2 #> "g/((m)^(-2))"
and the idea being that every supplied unit string has
make_all_exponents_explicit
called on it before it is parsed.
But I don't think that udunits2 is able to parse so many parentheses.
I'm happy to submit a PR for this, but it might be about 2-3 weeks before I'll have time to sit down and actually work on this.
That would be great, thanks. We are preparing a release due to CRAN errors, but this needs to be properly considered and tested, so no hurries.
I think it's R that does the first parsing of these parenthesis, not udunits (unless this is something that's changed recently)
Thanks for a great package.
I notice that both m^2 and m2 share a common internal representation when used in a numerator, but not for a denominator. Should behaviors match?