python / cpython

decimal.to_eng_string() does not implement engineering notation in all cases. #70411

Closed 423de8fc-0a99-4714-a0e6-1eb832ed64f6 closed 8 years ago

423de8fc-0a99-4714-a0e6-1eb832ed64f6 commented 8 years ago
BPO 26223
Nosy @tim-one, @rhettinger, @facundobatista, @mdickinson, @ericvsmith, @ezio-melotti, @skrah, @ztane
Files
  • [engineering_notation.pdf](https://bugs.python.org/file41730/engineering_notation.pdf "Uploaded as "application/pdf" at 2016-01-27.19:15:12 by serge.stroobandt"): Proper definition of engineering notation
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = 'https://github.com/rhettinger'
    closed_at =
    created_at =
    labels = ['extension-modules', 'type-feature']
    title = 'decimal.to_eng_string() does not implement engineering notation in all cases.'
    updated_at =
    user = 'https://bugs.python.org/sergestroobandt'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'serge.stroobandt'
    assignee = 'rhettinger'
    closed = True
    closed_date =
    closer = 'rhettinger'
    components = ['Extension Modules']
    creation =
    creator = 'serge.stroobandt'
    dependencies = []
    files = ['41730']
    hgrepos = []
    issue_num = 26223
    keywords = []
    message_count = 18.0
    messages = ['259047', '259054', '259058', '259059', '259061', '259062', '259107', '259118', '259271', '259772', '272415', '272417', '272428', '272431', '272562', '272563', '272565', '282487']
    nosy_count = 10.0
    nosy_names = ['tim.peters', 'rhettinger', 'facundobatista', 'mark.dickinson', 'eric.smith', 'ezio.melotti', 'skrah', 'Keith.Brafford', 'ztane', 'serge.stroobandt']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue26223'
    versions = ['Python 3.6']
    ```

    423de8fc-0a99-4714-a0e6-1eb832ed64f6 commented 8 years ago

    In https://docs.python.org/2/library/string.html#formatstrings the proprietary (IBM) specification "Decimal Arithmetic Specification" http://www.gobosoft.com/eiffel/gobo/math/decimal/daconvs.html is incorrectly being heralded as "the" specification for engineering notation.

    However, upon reading this IBM specification carefully, one will note that the specification itself actually admits to not applying engineering notation in the case of infinite numbers.

    An emphasized version of the exact quote accompanied with a discussion can be found here: http://stackoverflow.com/a/17974598/2192488

    Correct behaviour for decimal.to_eng_string() would be to equally employ engineering notation in the case of infinite numbers.

    I suggest renaming the current behaviour to decimal.to_ibm_string().

    References: http://www.augustatech.edu/math/molik/notation.pdf https://en.wikipedia.org/wiki/Engineering_notation https://en.wikipedia.org/wiki/General_Conference_on_Weights_and_Measures http://www.bipm.org/en/CGPM/db/11/11/

    PS: I hold an MSc in Electronic Engineering.

    423de8fc-0a99-4714-a0e6-1eb832ed64f6 commented 8 years ago

    An emphasized version of the exact quote is here now: http://stackoverflow.com/a/35045233/2192488

    rhettinger commented 8 years ago

    The decimal module strives to exactly follow the spec even when our sensibilities suggest otherwise. Perhaps we can add a note to the current docs describing the situation. The API itself is a settled issue; that ship sailed a very long time ago (the problem with a standard library is that it becomes a standard that people rely on, and it is hard to change after the fact without causing issues for users).

    ericvsmith commented 8 years ago

    Agreed. And, since any API change would just be a 3.6+ change, this would increase the difficulty of moving between 2.7 and 3.x. Which is not something we want.

    423de8fc-0a99-4714-a0e6-1eb832ed64f6 commented 8 years ago

    @rhettinger I completely agree with not creating a backward incompatibility at this point in time.

    The real issue is that decimal.to_eng_string() was written to an (unfortunately chosen) proprietary specification which does not entirely correspond to engineering notation.

    A quick web search shows that a lot of people are in search of a *true* engineering notation implementation. In the philosophy of "batteries included", it is a pity this useful and very common notation is currently missing in Python.

    I would therefore suggest adding a decimal.to_true_eng_string() with the true engineering notation.

    Hence, this bug could be reclassified as a suggestion for enhancement.

    5531d0d8-2a9c-46ba-8b8b-ef76132a492c commented 8 years ago

    The spec was the only reasonable choice at the time decimal.py was written. Incidentally, it is basically IEEE-754-2008 with arbitrary precision extensions; this isn't surprising since Mike Cowlishaw was on the IEEE committee and wrote the spec at the same time.

    There are very few decNumber-specific remainders in the spec -- this may be one of those (though I did not bother to look up if the IEEE standard specifies formatting at all).

    mdickinson commented 8 years ago

    I also agree that we shouldn't change the current code. As Raymond says, it may be worth a doc change.

    Serge: I was confused by your initial report. If I understand the StackOverflow question correctly, this isn't about the output for infinite numbers (e.g., Decimal('inf') and Decimal('-inf')), and I'm not sure what that would mean. Rather, it's about the output for small finite numbers, where an exponent wouldn't be used in the normal scientific notation. So some people would (understandably) rather see:

    >>> Decimal('123456').to_eng_string()
    '123.456e3'
    >>> Decimal('0.02').to_eng_string()
    '20e-3'

    than the current

    >>> Decimal('123456').to_eng_string()
    '123456'
    >>> Decimal('0.02').to_eng_string()
    '0.02'

    for example. Is that what you meant?

    Stefan: IEEE 754 does cover formatting (in section 5.12, "Details of conversion between floating-point data and external character sequences"), but has nothing to say about engineering formats.
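
For illustration, the "always use an exponent" output shown in the first pair of examples above could be produced by a small helper along the following lines. This is only a sketch under stated assumptions: `true_eng_string` is a hypothetical name, not part of the decimal module, and the lowercase `e` with no plus sign simply mirrors the examples in the comment. It works on the digit tuple directly so that no context rounding is applied.

```python
from decimal import Decimal


def true_eng_string(value):
    """Hypothetical helper (not in the decimal module): always print an
    exponent that is a multiple of 3, with 1 to 3 digits before the point."""
    d = Decimal(value)
    if not d.is_finite():
        # Infinities and NaNs have no exponent to adjust; keep str() output.
        return str(d)

    sign, digits, exp = d.as_tuple()
    prefix = "-" if sign else ""
    if not any(digits):
        # Assumption: every zero is rendered as '0e0'.
        return prefix + "0e0"

    s = "".join(map(str, digits))
    adjusted = exp + len(s) - 1          # exponent of the leading digit
    eng_exp = 3 * (adjusted // 3)        # floor to a multiple of 3
    int_len = adjusted - eng_exp + 1     # 1, 2 or 3 integer digits
    s = s.ljust(int_len, "0")            # pad, e.g. '2' -> '20' for 0.02

    int_part, frac_part = s[:int_len], s[int_len:]
    mantissa = int_part + ("." + frac_part if frac_part else "")
    return f"{prefix}{mantissa}e{eng_exp}"


print(true_eng_string("123456"))  # 123.456e3
print(true_eng_string("0.02"))    # 20e-3
```

Passing strings (or Decimal instances) rather than floats avoids picking up binary floating-point expansion digits.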

    423de8fc-0a99-4714-a0e6-1eb832ed64f6 commented 8 years ago

    Mark: Don't shoot the messenger! I literally quoted the implemented proprietary specification. However, I do agree that the term "numbers (or bases) with an infinite decimal representation" would be more appropriate in this context.

    Also, improving documentation is good, but having a new function with the desired *true* engineering notation would be even better! Admittedly, this was my ultimate objective for filing this enhancement bug.

    Thanks for commenting on StackExchange.

    423de8fc-0a99-4714-a0e6-1eb832ed64f6 commented 8 years ago

    Related issue: https://bugs.python.org/issue8060

    rhettinger commented 8 years ago

    I'm disinclined to make a new method and instead prefer to go the route of having a formatting option. For the most part, we want the methods to be limited to those in the spec (the API is already huge). The other reason is that output formatting options are what the __format__() method was intended to address.

    If you want to go further, it would be reasonable to bring https://bugs.python.org/issue8060 back to life. The final entry in that tracker item recommended moving the discussion to python ideas or into a PEP (like was done for the thousands separator format option). There are many viewpoints to consider before jumping to codify one particular approach into the standard library.

    ff59cd45-ebe3-4b3e-9696-65dc59a38b8c commented 8 years ago

    Indeed engineering notation is now utterly broken, the engineering notation is not printed for pretty much *any* engineering numbers *at all* in 3.6. Engineering numbers mean numbers that could be met in an *engineering* context, not cosmological!

    5531d0d8-2a9c-46ba-8b8b-ef76132a492c commented 8 years ago

    @Antti Please think before you write and stop making unfounded allegations.

    ff59cd45-ebe3-4b3e-9696-65dc59a38b8c commented 8 years ago

    Ok, after reading the "spec" it seems that the engineering exponent is indeed printed for positive exponents *if* the precision of the number is less than the digits of the exponent, which I didn't realize that I should be testing.

    However the *precision* of decimals is meaningless anyhow. Add a very precisely measured '0e0' to any number and the sum also has exponent of 0, and is thus never displayed in exponential notation.
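
The behaviour described in this comment can be reproduced with the existing `to_eng_string()` method. The short demonstration below uses only the standard decimal module with the default context; the particular values are chosen for illustration.

```python
from decimal import Decimal

# Exponent > 0: the spec's engineering format is used.
big = Decimal("1.23E+7")
print(big.to_eng_string())                   # 12.3E+6

# Same value written out in full: the exponent is 0, so plain digits are printed.
print(Decimal("12300000").to_eng_string())   # 12300000

# Adding an exact zero lowers the exponent of the sum to 0,
# so the exponential form disappears.
total = big + Decimal("0")
print(total.to_eng_string())                 # 12300000

# Small values switch to the exponential form only once the
# leading digit sits below the 1E-6 position.
print(Decimal("0.000001").to_eng_string())   # 0.000001
print(Decimal("1E-7").to_eng_string())       # 100E-9
```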

    5531d0d8-2a9c-46ba-8b8b-ef76132a492c commented 8 years ago

    On Thu, Aug 11, 2016 at 09:17:10AM +0000, Antti Haapala wrote:

    > However the *precision* of decimals is meaningless anyhow. Add a very precisely measured '0e0' to any number and the sum also has exponent of 0, and is thus never displayed in exponential notation.

    It is not meaningless and actually one of the most important features of decimal:

    >>> x = Decimal("3.6")
    >>> y = Decimal("0.0000000000000000000000") # number "measured" with ridiculous precision
    >>> x.to_eng_string()
    '3.6'
    >>> (x + y).to_eng_string()
    '3.6000000000000000000000'
    
    >>> x = Decimal("3.6")
    >>> y = Decimal("0e-7") # perhaps more realistic
    >>> (x + y).to_eng_string()
    '3.6000000'

    If you have confidence in your measurement, you have to let decimal know by actually spelling it out.

    423de8fc-0a99-4714-a0e6-1eb832ed64f6 commented 8 years ago

    What most engineers would like to see implemented in Python is a new engineering notation identical to the one implemented in the omnipresent HP calculators.

    Quoting from the HP-15C Owner's Handbook: "In engineering notation, the first significant digit is always present in the display. The number you key in after f ENG specifies the number of additional digits to which you want to round the display."

    Source + examples, see page 59: http://www.hp.com/ctg/Manual/c03030589.pdf
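
As a rough sketch of the rule quoted above (first significant digit plus a chosen number of additional digits), the following hypothetical helper rounds a value and then prints it with an exponent that is a multiple of 3. The name `hp_eng_string`, the `extra_digits` parameter, the `e` exponent marker and the use of ROUND_HALF_UP are all assumptions made for illustration; none of them comes from the decimal module or from the handbook.

```python
from decimal import Context, Decimal, ROUND_HALF_UP


def hp_eng_string(value, extra_digits=2):
    """Hypothetical helper: round to 1 + extra_digits significant digits,
    then print with an exponent that is a multiple of 3."""
    d = Decimal(value)
    if not d.is_finite():
        return str(d)

    sig = extra_digits + 1
    # Assumption: half-up rounding; the context applies it on conversion.
    ctx = Context(prec=sig, rounding=ROUND_HALF_UP)
    sign, digits, exp = ctx.create_decimal(d).as_tuple()
    prefix = "-" if sign else ""
    if not any(digits):
        # Assumption: zero keeps the requested number of additional digits.
        zeros = "0" * extra_digits
        return prefix + "0" + ("." + zeros if zeros else "") + "e0"

    s = "".join(map(str, digits))
    adjusted = exp + len(s) - 1            # exponent of the leading digit
    eng_exp = 3 * (adjusted // 3)          # floor to a multiple of 3
    int_len = adjusted - eng_exp + 1       # 1, 2 or 3 integer digits
    s = s.ljust(max(sig, int_len), "0")    # pad to the requested width

    int_part, frac_part = s[:int_len], s[int_len:]
    mantissa = int_part + ("." + frac_part if frac_part else "")
    return f"{prefix}{mantissa}e{eng_exp}"


print(hp_eng_string("12345", 2))   # 12.3e3
print(hp_eng_string("0.02", 2))    # 20.0e-3
print(hp_eng_string("123456", 0))  # 100e3
```

The exponent is computed after rounding, so a value such as 999.9 with two additional digits carries over cleanly to `1.00e3`.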

    Most of the time, engineers are not after high precision. Ball park figures are good enough in a world where everything is built to a specified tolerance. For example, most electronic resistors feature 5% tolerance. Safety factors take care of the rest and assure a building will not collapse.

    This should not be that difficult to implement? I promise, every six months an engineer will stop by here asking for this. Instead of nagging, this could already have been implemented one way or the other. The large demand for this feature really warrants it. Thanks!

    5531d0d8-2a9c-46ba-8b8b-ef76132a492c commented 8 years ago

    On Fri, Aug 12, 2016 at 08:38:52PM +0000, Serge Stroobandt wrote:

    > This should not be that difficult to implement? I promise, every six months an engineer will stop by here asking for this. Instead of nagging, this could already have been implemented one way or the other. The large demand for this feature really warrants it. Thanks!

    To whom should I send the invoice?

    52c115ac-1d3b-482c-9271-bf59494076c0 commented 8 years ago

    Serge, I wrote this a while back, before I learned you aren't supposed to subclass built-in types. Is this the type of effect you're looking for?

    https://gist.github.com/kbrafford/da39e06d18b6df2a07777eecb4493699

    Here's an example using it: https://gist.github.com/kbrafford/e0115e796890fcefb4f0c35248bd05f1

    423de8fc-0a99-4714-a0e6-1eb832ed64f6 commented 7 years ago

    Dear Keith, that is exactly how it should be! (I cross-checked with an HP calculator to make sure.)