josephwright / siunitx

A comprehensive (SI) units package for LaTeX
LaTeX Project Public License v1.3c

Uncertainty using "\pm" is not showing a correct result. #667

Closed piiskop closed 1 year ago

piiskop commented 1 year ago

The input:

$\qty{ 0,0068 \pm 9e-04}{\m}$

The output:

0.0068(90000) × 10^{-4} m

The expected output:

0.0068(9) × 10^{-4} m

As you can see, the main part is 0.0068, and it is multiplied not by 10 to the power -4 but by 10 to the power 0. Only the 9 is multiplied by 10 to the power -4. The output wrongly applies the power -4 of 10 to the 0.0068 as well. The power of ten just before the unit applies to the main number, not to the uncertainty, as the uncertainty applies only to the last digit of the main number. Please see the ninth edition of The International System of Units, page 152.

The way I format the numbers is how R does it if I use the function signif:

print(mean_of_I)
## [1] 0,006847059
signif(x = mean_of_I, digits = 2)
## [1] 0,0068
print(delta_I)
## [1] 0,0008898586
signif(x = delta_I, digits = 1)
## [1] 9e-04

josephwright commented 1 year ago

The number format used by siunitx for input treats 12.3\pm3e4 as equal to (12.3\pm3)\times10^{4}, and always has. This is because any exponent part needs to apply to the whole number to allow convenient parsing of the value, and also as there is no need for more than one exponent part - allowing two would risk inconsistent input.
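A minimal sketch of this rule, assuming siunitx v3. The parenthesised "short form" in the second line is siunitx's compact uncertainty input; under the rule described above, both lines are expected to describe the same number, (12.3 ± 3.0) × 10^4.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% The trailing exponent belongs to the whole number, not just to the
% uncertainty: this is read as (12.3 +- 3) x 10^4.
\num{12.3 \pm 3 e4}

% Short form of the same input: the digits in parentheses are the
% uncertainty in the last digits of the value (30 -> 3.0 here).
\num{12.3(30)e4}
\end{document}
```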

piiskop commented 1 year ago

Please explain how your assumption is mathematically correct. If I do not use parentheses, then e4 applies only to 3 and not to 12.3. With parentheses, yes, it applies to both. siunitx should be compliant with current SI rules, as the name tells us. Please refer to a current SI rule that allows not only such confusion but a wrong interpretation. What would be your solution to my case, given the options in R?

josephwright commented 1 year ago

@piiskop The siunitx input format is designed to be easy to type and parse, and therefore doesn't use the same conventions as typeset mathematics. As I said, here the input 0,0068 \pm 9e-04 will be converted for typesetting to $(0.0068 \pm 9.0000)\times 10^{-04}$, i.e. the brackets are 'read in' by the package. (As you might notice, I personally favour the 'short form' input, where this is a bit clearer.) BIPM are concerned with presentation of quantities for exchange of scientific information, not representation of quantities in all forms, so there is no sense in which they tell us what 'should' happen here.

In terms of R output, I am not an R user so I would need to ask an expert. Note that in the main, my expectation is that values in siunitx are entered directly, other than in tables: if one is already doing programmatic manipulation of data, it would seem to me to be easier to do the entire job in R and return $0.0068 \pm 0.0009$ or similar to TeX.

piiskop commented 1 year ago

I do not agree that you call the package with the prefix si and you are not following SI completely. You are even not following the conventional math in favor of what? This is unclear to me. Is it a wrong math notation that you want to follow? I asked you to refer to the specific rule that allows you to represent an expression like that. You did not provide one. Please do so. I am using bookdown and I am not interested in doing additional formatting if there is siunitx. Otherwise, I could just avoid siunitx and do all the work without it. I just expect siunitx to follow actual real SI rules completely as BIPM does. Or feel free to rename your package if you want to go your distinct way! Currently, it is not only confusing but incorrect. Please also correct me if I am wrong on anything in this post.

josephwright commented 1 year ago

> I do not agree that you call the package with the prefix si and you are not following SI completely.

BIPM's expertise is primarily in units; there is very little in the brochure about numbers or typography. I took another look at the page you referred to initially, but I couldn't see any mention of exponent notation.

> You are even not following the conventional math in favor of what? This is unclear to me.

A short and convenient input syntax which can be parsed unambiguously by the code. As I've said, uncertainties and the main part of a number should be given with the same exponent, so brackets might be necessary for correct typesetting but are not a requirement for a parsable input with a clear rule set (which I believe is shown in the manual).

> I am using bookdown and I am not interested in doing additional formatting if there is siunitx. Otherwise, I could just avoid siunitx and do all the work without it. I just expect siunitx to follow actual real SI rules completely as BIPM does.

The primary purpose of siunitx is typesetting units correctly. Whilst I get a lot more feature requests for numbers (particularly tables), it is that which drives the naming.

josephwright commented 1 year ago

I think it's important to emphasise that the e notation is of course not 'proper' mathematical symbolism, it's a shorthand useful for e.g. computer-generated output.

u-fischer commented 1 year ago

@piiskop you are confusing input and output. And you are assuming that the numbers are always entered explicitly and so that it is obvious to which digit the 9 should refer. But this doesn't need to be the case:

$\qty{  \fpeval{0.0183/3} \pm   9e-04}{\m}$

$\qty{  \fpeval{0.0183/2} \pm   9e-04}{\m}$

Do you really think it would be good if the uncertainty depended on internals of such a float calculation?
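The two lines above rely on the argument being expanded before siunitx parses it. On that assumption, one possible way to pass an R-style `9e-04` through unchanged is to expand the uncertainty with `\fpeval` as well, so that the parser sees `0.0009` with no exponent part. This is an untested sketch and not something proposed in the thread. It assumes `\fpeval` is available (via the `xfp` package or a recent LaTeX kernel) and that it prints `9e-04` as a plain decimal.

```latex
\documentclass{article}
\usepackage{xfp}     % provides \fpeval on older kernels
\usepackage{siunitx}
\begin{document}
% Assumption: \fpeval{9e-04} expands to 0.0009 before parsing, so the
% value and the uncertainty are given at the same magnitude and no
% exponent is left over to be applied to the whole number.
\qty{0.0068 \pm \fpeval{9e-04}}{\m}
\end{document}
```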

piiskop commented 1 year ago

Please see the example in [Evaluation of measurement data — Guide to the expression of uncertainty in measurement](https://www.bipm.org/documents/20126/2071204/JCGM_100_2008_E.pdf/cb0ef43f-baa5-11cf-3f85-4dcd86f77bd6?version=1.10&t=1659082531978&download=true) on page 24:

EXAMPLE A calibration certificate states that the resistance of a standard resistor R_S of nominal value ten ohms is 10,000 742 Ω ± 129 µΩ at 23 °C and that “the quoted uncertainty of 129 µΩ defines an interval having a level of confidence of 99 percent”. The standard uncertainty of the resistor may be taken as u(R_S) = (129 µΩ)/2,58 = 50 µΩ, which corresponds to a relative standard uncertainty u(R_S)/R_S of 5,0 × 10^{-6} (see 5.1.6). The estimated variance is u^2(R_S) = (50 µΩ)^2 = 2,5 × 10^{-9} Ω^2.

SI is not only about units, it is about values that contain units.

e-notation or not, it does not matter. The problem is that the units can be different for the main part and for the uncertainty.

In the examples: The first one means 0.0061(9) m but siunitx interprets it as 0,0061(90000) m which is so wrong. The mean is 0,0061 m but the measurement uncertainty is 9 m. In this case, the mean could be just 0 as it makes no sense to have the ten thousandths if the uncertainty is in units.

u-fischer commented 1 year ago

> The first one means 0.0061(9) m but siunitx interprets it as 0,0061(90000) m which is so wrong.

What is so difficult to understand about there being a difference between input and output? If you want this output

you have to give the uncertainty in the input format expected by siunitx:

\documentclass{article}
\usepackage{siunitx}

\begin{document}
$\qty{  0,0068 \pm  0.0009e-04}{\m}$
\end{document}

You may not like the input format but this is not relevant. The SI manual doesn't have any say about the siunitx input format, it doesn't know the LaTeX commands \qty and \pm and braces around the arguments anyway.

piiskop commented 1 year ago

You want me to render the input wrong. The measurement uncertainty is not 0.0009e-4 m. It is 0.0009 m. Forcing such manipulations makes using TeX even more complicated than it already is, as this is not intuitive and is mathematically incorrect anyway. And it is not SI.

Rmano commented 1 year ago

This is an input convention of the e<n> thing in siunitx (you can like or dislike it, but it's coherent in the package):

\documentclass{article}
\usepackage{siunitx}
\begin{document}
$\qty{0.0068\pm0.0009  e4}{\m}$

$\complexqty{0.0068-j0.0009  e4}{\ohm}$
\end{document}

I use spacing to show the effect in the input, too (which is less misleading).

jasperhabicht commented 1 year ago

Maybe, it is possible to add an option to the package which, if set to true, would change the way the input is parsed in such cases (probably like some kind of pre-parser, but I don't know well enough how the package works internally to be able to say whether this is really possible though).

However, since the package has been around for some while now and I think a lot of people use it, I don't think it is a good idea to radically change the way it parses the input, as it would probably lead to a lot of users having to change their code. So, whether the input syntax adheres to some standard or not, it should be kept at least for backwards compatibility. (And for the same reason, it is not really feasible to rename the package.)

As this package is open source, everybody is free to add their additions to the package code. So, in my opinion this should be seen as a feature request.

josephwright commented 1 year ago

> Please see the example in [Evaluation of measurement data — Guide to the expression of uncertainty in measurement](https://www.bipm.org/documents/20126/2071204/JCGM_100_2008_E.pdf/cb0ef43f-baa5-11cf-3f85-4dcd86f77bd6?version=1.10&t=1659082531978&download=true) on page 24:

I'm aware of the document you link to, but my focus to date has been on sections 7.2.2 and 7.2.4, which are about representation of uncertainty rather than linked to the calculation/meaning in a deeper sense. Those two sections are not entirely unambiguous. However, I note both use the form (<value> ± <uncert>) <unit>, and from a brief reading the only example with two different magnitudes used is the one you link to.

> e-notation or not, it does not matter. The problem is that the units can be different for the main part and for the uncertainty.

As you might tell, I feel that expression in a form with different magnitudes for the main and uncertainty parts is potentially misleading. However, there are other presentations I find equally troubling but support as they are attested in the professionally-typeset literature. As such, I am happy to see this as a feature request for the numerical output, but I'd want to see at least a couple of independent examples from the literature to support it (that's a general thing: I always ask it if it's a feature request that comes in without me knowing of examples).

josephwright commented 1 year ago

> You want me to render the input wrong. The measurement uncertainty is not 0.0009e-4 m. It is 0.0009 m.

As already noted, the model used in parsing by siunitx is that values will be given in a normalised input form in which the uncertainty is expressed at the same magnitude as the main value. I have very occasionally had feature requests for an alternative format, but they have typically been 'weak' requests (i.e. more misunderstanding the siunitx model than being desperate to use an alternative syntax). There would be a performance penalty for an alternative form (as I'd also need to allow for multiple uncertainties and normalise them after parsing), but it is doable - indeed alternative input parsing is on the 'to do' list. What is important to emphasise is that this would be orthogonal to any feature request for a change to the output: the entire reason for parsing values is that the package can manipulate values.
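For the value in this issue, a sketch of what normalised input could look like, assuming siunitx v3 and assuming the intended quantity is 0.0068 m with an uncertainty of 0.0009 m (as stated earlier in the thread). All three lines are meant to describe that same quantity.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% Uncertainty written at the same magnitude as the main value,
% with no exponent part at all.
\qty{0.0068 \pm 0.0009}{\m}

% Short form: the 9 in parentheses applies to the last digit of 0.0068.
\qty{0.0068(9)}{\m}

% If an exponent is wanted, it scales value and uncertainty together:
% (6.8 +- 0.9) x 10^-3.
\qty{6.8(9)e-3}{\m}
\end{document}
```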

Rmano commented 1 year ago

@josephwright as I commented in the chat, the only time I was surprised with the input parsing was trying to do something like \complexqty{2e2 -j2e4}{\ohm} (but this is an error, so no problem). But that was just once, so I do not know if the usage frequency is worth the hassle...

(and in that case, \complexqty{0.2 -j20}{\kohm} is arguably much better).

piiskop commented 1 year ago

> As such, I am happy to see this as a feature request for the numerical output, but I'd want to see at least a couple of independent examples from the literature to support it (that's a general thing: I always ask it if it's a feature request that comes in without me knowing of examples).

There can only be one ruleset and that is created by BIPM. Anything else is not SI.

josephwright commented 1 year ago

I looked back and the input question came up in #229 and was closed as a dupe of #208. I think that still stands.