Move Demo/classes/Rat.py to Lib/fractions.py and fix it up.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

BPO	1682
Nosy	@gvanrossum, @rhettinger, @facundobatista, @mdickinson, @ncoghlan, @wm75
Files	rational.patch rational.patch: V2, ported to 2.6 rational.patch: V3, still minimal rational_tweaks.patch: Some tweaks on top of the original implementation lehmer_gcd.py: Revised Lehmer gcd algorithm lehmer_gcd_2.py: Benchmark gcd implementations lehmer_gcd.patch fraction_inline_arith.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields: ```python assignee = None closed_at = created_at = labels = ['type-bug', 'library'] title = 'Move Demo/classes/Rat.py to Lib/fractions.py and fix it up.' updated_at = user = 'https://bugs.python.org/jyasskin' ``` bugs.python.org fields: ```python activity = actor = 'wolma' assignee = 'jyasskin' closed = True closed_date = closer = 'mark.dickinson' components = ['Library (Lib)'] creation = creator = 'jyasskin' dependencies = [] files = ['9021', '9114', '9152', '9201', '9464', '9485', '9486', '9570'] hgrepos = [] issue_num = 1682 keywords = ['patch'] message_count = 92.0 messages = ['58949', '59424', '59425', '59582', '59583', '59608', '59614', '59615', '59633', '59635', '59636', '59639', '59640', '59645', '59650', '59651', '59656', '59657', '59659', '59665', '59747', '59755', '59772', '59773', '59870', '59884', '59886', '59958', '59982', '60080', '60081', '60131', '60132', '60133', '60137', '61737', '61995', '62001', '62005', '62010', '62011', '62013', '62028', '62029', '62099', '62189', '62192', '62218', '62222', '62227', '62236', '62249', '62251', '62252', '62253', '62254', '62256', '62257', '62258', '62271', '62305', '62311', '62318', '62320', '62324', '62335', '62337', '62340', '62380', '62401', '62403', '62406', '62547', '62550', '62564', '62566', '62567', '62568', '62671', '62672', '63087', '63126', '68725', '68732', '68757', '68822', '68823', '68843', '68851', '69020', '69027', '69069'] nosy_count = 7.0 nosy_names = ['gvanrossum', 'rhettinger', 'facundobatista', 'mark.dickinson', 'ncoghlan', 'jyasskin', 'wolma'] pr_nums = [] priority = 'normal' resolution = 'fixed' stage = None status = 'closed' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue1682' versions = ['Python 2.6', 'Python 3.0'] ```

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

This is written against the 3.0 branch, but before submitting it, I need to backport numbers.py and submit it against 2.6 instead. I'm posting it now to get a head start on comments.

The numbers.py part is necessary for this to work but will be a separate patch.

mdickinson commented 16 years ago

Some random comments: take with a large pinch of salt

(1) In __init__ I think you want something like:

  self._numerator = int(numerator)//g

instead of

  self._numerator = int(numerator / g)

and similarly for the denominator. At the moment I get, for example:

>>> Rational(10**23)
rational.Rational(99999999999999991611392,1)

(2) What's the reason for repr of a Rational being "rational.Rational(...)", while repr of a Decimal is just "Decimal(...)"? I'm not suggesting that either is wrong; just wondering whether there's some sort of guiding principle that would suggest one or the other.

(3) My first thought on looking at the _gcd function was: "Shouldn't there be an abs() somewhere in there"; since the gcd of two (possibly negative) integers is often (usually?) defined as the largest *nonnegative* common divisor. But on closer examination it's clear that the result of _gcd(a, b) has the same sign as that of b (unless b == 0). So _gcd very cleverly chooses exactly the right sign so that the denominator after rescaling is positive. I'm not sure whether this is a happy accident or clever design, but either way it probably deserves a comment.

(4) the __ceil__ operation could be spelt more neatly as

return -(-a.numerator // a.denominator)

...but perhaps that just amounts to obfuscation :)

(5) It looks as though two-argument round just truncates when the second argument is negative. Presmably this isn't what's wanted here?

>>> round(Rational(26), -1)  # expecting Rational(30, 1)
rational.Rational(20,1)

(6) The behaviour of the equality test is questionable when floats are involved. For example:

>>> 10**23 == float(10**23)  # expect False
False
>>> Rational(10**23) == float(10**23)  # I'd also expect False here
True

Ideally, if x is a Rational and y is a float then I'd suggest that x == y should return True only when the numeric values really are equal. Coding this could be quite tricky, though. One option would be to convert the float to an (exactly equal) Rational first-- -there's code to do this in the version of Rational.py in the sandbox.

(7) For purely selfish reasons, I for one will be very happy if/when this makes it into 2.6/3.0: I use Python a lot for teaching (geometry, number theory, linear algebra, ...); it's natural to work with rational numbers in this context; and it'll be much easier to tell an interested student to just download Python than to tell them they also need gmp and gmpy, or some other third party package, just to try out the code examples I've given them.

mdickinson commented 16 years ago

Two more questions:

(1) should a Rational instance be hashable? (2) Should "Rational(5,2) == Decimal('2.5')" evaluate to True or False?

If Rationals are hashable and 2.5 == Rational(5, 2) == Decimal('2.5') then the rule that objects that compare equal should have equal hash is going to make life *very* interesting...

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

Thanks again for the excellent comments.

__init__: good catch.

repr(Rational): The rule for repr is "eval(repr(object)) == object". Unfortunately, that doesn't decide between the two formats, since both assume some particular import statements. I picked the one more likely to be unique, and I assume Decimal picked the shorter one. I can go either way.

_gcd's sign: It's a happy accident for me. Possibly Sjoerd Mullender designed it that way. I've added a comment and a test.

__ceil__: I like that implementation better.

2-argument round: Fixed and tested.

equality: Very good point. I've stolen the sandbox code and added Rational.from_float() using it. I think I also need to make this change to the comparisons.

hashing: oops, yes these should be hashable. Decimal cheats by comparing != to even floats that it's equal to, so I'm going to assume that they also want Rational(5,2) != Decimal('2.5').

The new patch is against 2.6.

rhettinger commented 16 years ago

FWIW, I'm -1 on moving this module to the standard library. There has been basically *zero* demand for something like this. Because rational implementations seem to be more fun to write than to use, I think there are some competing implementations.

gvanrossum commented 16 years ago

Raymond, do you *always* have to be such a grinch?

The mere existance of competing implementations proves that there's a need. Hopefully having one in the stdlib will inspire others to contribute improvements rather than starting from scratch.

rhettinger commented 16 years ago

Not sure I'm the grinch on this one. I thought PEPs on this were rejected long ago and no one complained afterwards. After years of watching newsgroup discussions and feature requests, I do not recall a single request or seeing a single problem that was better solved with a rational package. On python-help and the tutorial newsgroup, I've never seen a question about the existing module in the Tools directory or even a question on the topic.

I think rational modules are ubiquitous for the same reason as Sudoku solvers -- they are cute, an easy exercise, and fun to write. There is some variation in how each chooses to make compromises so the denominators do not get unusably large. Also, there is some variation in sophistication of the GCD/LCD algorithm.

I had thought the standards for inclusion in the standard library had risen quite a bit (best-in-class category killers and whatnot). ISTM, this is aspiring cruft -- I cannot remember encountering a real business or engineering problem that needed rational arithmetic (the decimal module seems to meet the rare need for high precision arithmetic).

All that being said, maybe the module with serve some sort of educational purpose or will serve to show-off the numeric tower abstract base classes.

facundobatista commented 16 years ago

The PEP-239 is Rejected (http://www.python.org/dev/peps/pep-0239/).

If a Rational data type would be included in the stdlib, my recommendation is that first that PEP would be reopened and pushed until get accepted.

Also, note the kind of questions Mark is asking here: the analysis and answer of those questions belongs to a PEP, not to a patch. We risk to not be very rational (1/2 wink) in these decisions.

I'd close this patch as Rejected (as for the PEP), at least until the PEP gets accepted (if ever).

gvanrossum commented 16 years ago

The rejection of PEP-239 was years ago. PEP-3141 was accepted; it includes adding a rational type to the stdlib, and Jeffrey is doing this with my encouragement.

The motivation is to have at least one example of each type in our modest numeric tower in the stdlib. I'm sure there will be plenty of uses for rational.py, e.g. in education. Of course, if someone has a better candidate or a proposed improvement, speak up now! It looks like Mark's suggestions can be treated as code review; I don't think we need a new PEP.

Note that the mere existence of PEP-239 again points out that the demand is not zero.

rhettinger commented 16 years ago

If this goes in, I have some recommendations:

Follow the decimal module's lead in assiduously avoiding float->rational conversions. Something like Rat.from_float(1.1) is likely to produce unexpected results (like locking in an inexact input and having an unexpectedly large denominator).
Likewise, follow decimal's lead in avoiding all automatic coercisions from floats: Rational(3,10) + 0.3 for example. The two don't mix.
Consider following decimal's lead on having a context that can limit the maximum size of a denominator. There are various strategies for handling a limit overflow including raising an exception or finding the nearest rational upto the max denominator (there are some excellent though complex published algorithms for this) or rounding the nearest fixed-base (very easy). I'll dig out my HP calculator manuals at some point -- they had worked-out a number of challenges with fractional arithmetic (including their own version of an Inexact tag).
Consider adding Decimal.from_rational and Rational.from_decimal. I believe these are both easy and can be done losslessly.
Automatic coercions to/from Decimal need to respect the active decimal context. For example the result of Rational(3,11) + Decimal('3.1415926') is context dependent and may not be commutative.
When in doubt, keep the API minimal so we don't lock-in design mistakes.
Test the API by taking a few numerical algorithms and seeing how well they work with rational inputs (for starters, try http://docs.python.org/lib/decimal-recipes.html ).
If you do put in a method that accepts floats, make sure that it can accept arguments to control the rational approximation. Ideally, you would get something something like this Rational.approximate(math.pi, 6) --> 355/113 that could produce the smallest rationalal approximation to a given level of accuracy.

facundobatista commented 16 years ago

2008/1/9, Raymond Hettinger said:

Consider adding Decimal.from_rational and Rational.from_decimal. I believe these are both easy and can be done losslessly.

If it's lossless, why not just allow Decimal(Rational(...)) and Rational(Decimal(...))?

rhettinger commented 16 years ago

If it's lossless, why not just allow Decimal(Rational(...)) and Rational(Decimal(...))?

Those conversions are fine.

It's the float\<-->rational conversions that are problematic. There are exact float to rational conversions using math.frexp() but I don't think the results tend to match what users expect (since 1.1 isn't exactly represented). Also, there may be overflow issues with trying to go from rationals to floats when the denomintor is very large.

I think the history of rational modules is that simplistic implementations get built and then the writers find that exploding denominators limit their usefulness for anything other than trivial problems. The solution is to limit denominators but that involves less trivial algorithms and a more sophisticated API.

gvanrossum commented 16 years ago

On Jan 9, 2008 4:29 PM, Raymond Hettinger \report@bugs.python.org\ wrote:

I think the history of rational modules is that simplistic implementations get built and then the writers find that exploding denominators limit their usefulness for anything other than trivial problems. The solution is to limit denominators but that involves less trivial algorithms and a more sophisticated API.

A "rational" type that limits denominators (presumably by rounding) isn't needed -- we alreay have Decimal. The proposed rational type is meant for "pure" mathematical uses, just like Python's long ints.

Regarding float->rational, I propose to refuse Rational(1.1) for the same reasons as Decimal(1.1) is refused, but to add a separate constructor (a class method?) that converts a float to a rational exactly (as long as it isn't nan or inf), large denominators be damned. This can help people who are interested in taking floating point numbers apart.

float(Rational()) should be fine.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

Follow the decimal module's lead in assiduously avoiding float->rational conversions. Something like Rat.from_float(1.1) is likely to produce unexpected results (like locking in an inexact input and having an unexpectedly large denominator).

I agree that it's not usually what people ought to use, and I don't think it's an essential part of the API. It might be a useful starting point for the approximation methods though. .trim() and .approximate() in http://svn.python.org/view/sandbox/trunk/rational/Rational.py?rev=40988&view=auto and Haskell's approxRational (http://www.haskell.org/ghc/docs/latest/html/libraries/base/src/Data-Ratio.html) start from rationals instead of floats. On the other hand, it might make sense to provide explicit methods to approximate from floats instead of asking people to execute the two-phase process. I'm happy to give it a different name or drop it entirely.

Likewise, follow decimal's lead in avoiding all automatic coercisions from floats: Rational(3,10) + 0.3 for example. The two don't mix.

I'm reluctant to remove the fallback to float, unless the consensus is that it's always a bad idea to support mixed-mode arithmetic (which is possible: I was surprised by the loss of precision of "10**23/1" while writing this). Part of the purpose of this class is to demonstrate how to implement cross-type operations. Note that it is an automatic coercion _to_ floats, so it doesn't look like you've gotten magic extra precision.

Consider following decimal's lead on having a context that can limit the maximum size of a denominator. There are various strategies for handling a limit overflow including raising an exception or finding the nearest rational upto the max denominator (there are some excellent though complex published algorithms for this) or rounding the nearest fixed-base (very easy). I'll dig out my HP calculator manuals at some point -- they had worked-out a number of challenges with fractional arithmetic (including their own version of an Inexact tag).

Good idea, but I'd like to punt that to a later revision if you don't mind. If we do punt, that'll force the default context to be "infinite precision" but won't prevent us from adding explicit contexts. Do you see any problems with that?

Consider adding Decimal.from_rational and Rational.from_decimal. I believe these are both easy and can be done losslessly.

Decimal.from_rational(Rat(1, 3)) wouldn't be lossless, but Rational.from_decimal is easier than from_float. Then Decimal.from_rational() could rely on just numbers.Rational, so it would be independent of this module. Is that a method you'd want on Decimal anyway? The question becomes whether we want the rational to import decimal to implement the typecheck, or just assume that .as_tuple() does the right thing. These are just as optional as .from_float() though, so we can also leave them for future consideration.

Automatic coercions to/from Decimal need to respect the active decimal context. For example the result of Rational(3,11) + Decimal('3.1415926') is context dependent and may not be commutative.

Since I don't have any tests for that, I don't know whether it works. I suspect it currently returns a float! :) What do you want it to do? Unfortunately, giving it any special behavior reduces the value of the class as an example of falling back to floats, but that shouldn't necessarily stop us from making it do the right thing.

When in doubt, keep the API minimal so we don't lock-in design mistakes.

Absolutely!

Test the API by taking a few numerical algorithms and seeing how well they work with rational inputs (for starters, try http://docs.python.org/lib/decimal-recipes.html ).

Good idea. I'll add some of those to the test suite.

If you do put in a method that accepts floats, make sure that it can accept arguments to control the rational approximation. Ideally, you would get something something like this Rational.approximate(math.pi, 6) --> 355/113 that could produce the smallest rationalal approximation to a given level of accuracy.

Right. My initial plan was to use Rational.from_float(math.pi).simplest_fraction_within(Rational(1, 10**6)) but I'm not set on that, and, because there are several choices for the approximation method, I'm skeptical whether it should go into the initial revision at all.

rhettinger commented 16 years ago

Decimal.from_rational(Rat(1, 3)) wouldn't be lossless

It should be implemented as Decimal(1)/Decimal(3) and then let the context handle any inexactness.

Rational.from_decimal is easier than from_float.

Yes. Just use Decimal.as_tuple() to get the integer components.

Then Decimal.from_rational() could rely on just numbers. Rational, so it would be independent of this module. Is that a method you'd want on Decimal anyway?

Instead, just expand the decimal constructor to accept a rational input.

Regarding float->rational, I propose to refuse Rational(1.1) for the same reasons as Decimal(1.1) is refused,

+1

but to add a separate constructor (a class method?) that converts a float to a rational exactly (as long as it isn't nan or inf), large denominators be damned. This can help people who are interested in taking floating point numbers apart.

Here's how to pick the integer components out of a float:

mant, exp = math.frexp(x) mant, exp = int(mant * 2.0 ** 53), exp-53

> * Likewise, follow decimal's lead in avoiding all > automatic coercions from floats:
> Rational(3,10) + 0.3 for example. The two don't mix.

I'm reluctant to remove the fallback to float,

You will need some plan to scale-down long integers than exceed the range of floats (right shifting the numerator and denominator until they fit).

rhettinger commented 16 years ago

One other thought, the Decimal and Rational constructors can be made to talk to each other via a magic method so that neither has to import the other (somewhat like we do now with __int and __float).

mdickinson commented 16 years ago

Allowing automatic coercion to and from Decimal sounds dangerous and complicated to me. Mightn't it be safer just to disallow this (at least for now)?

I think something like trim() (find the closest rational approximation with denominator bounded by a given integer) would be useful to have as a Rational method, but probably only for explicit use.

I'm still worried about equality tests: is it possible to give a good reason why Decimal("2.5") == Rational(5, 2) should return False, while Rational(5, 2) == 2.5 returns True. Can someone articulate some workable principle that dictates when an int/float/complex/Rational/Decimal instance compares equal to some other int/float/complex/Rational/Decimal instance of possibly different type but the same numerical value?

gvanrossum commented 16 years ago

All in all, Decimal is the odd man out -- it may never become a full member of the Real ABC. The built-in types complex, float and int (and long in 2.x) all support mixed-mode arithmetic, and this is one of the assumptions of the numeric tower -- and of course of mathematics. The new Rational type can be designed to fit in this system pretty easily, because it has no pre-existing constraints; but the Decimal type defies coercion, and is perhaps best left alone. It's already breaking the commonly understood properties of equality, e.g. 1.0 == 1 == Decimal("1") != 1.0.

--Guido

On Jan 9, 2008 7:51 PM, Mark Dickinson \report@bugs.python.org\ wrote:

Mark Dickinson added the comment:

Allowing automatic coercion to and from Decimal sounds dangerous and complicated to me. Mightn't it be safer just to disallow this (at least for now)?

I think something like trim() (find the closest rational approximation with denominator bounded by a given integer) would be useful to have as a Rational method, but probably only for explicit use.

I'm still worried about equality tests: is it possible to give a good reason why Decimal("2.5") == Rational(5, 2) should return False, while Rational(5, 2) == 2.5 returns True. Can someone articulate some workable principle that dictates when an int/float/complex/Rational/Decimal instance compares equal to some other int/float/complex/Rational/Decimal instance of possibly different type but the same numerical value?

Tracker \report@bugs.python.org\ \http://bugs.python.org/issue1682\

rhettinger commented 16 years ago

Would it be reasonable then to not have any of the numeric tower stuff put in the decimal module -- basically leave it alone (no __ceil__, etc)?

gvanrossum commented 16 years ago

Would it be reasonable then to not have any of the numeric tower stuff put in the decimal module -- basically leave it alone (no __ceil__, etc)?

If that's the preference of the decimal developers, sure.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

If the consensus is that Decimal should not implement Real, I'll reply to that thread and withdraw the patch.

Raymond, do you want me to add Decimal.__init__(Rational) in this patch or another issue/thread?

I don't understand the comment about scaling down long integers. It's already the case that float(Rational(10**23, 10**24 + 7))==0.1.

Mark, I agree that .trim() and/or .approximate() (which I think is the same as Haskell's approxRational) would be really useful. Do you have particular reasons to pick .trim? Are those the best names for the concepts? I'd also really like to be able to link from their docstrings to a proof that their implementations are correct. Does anyone know of one?

Finally, Decimal("2.5") != Rational(5, 2) because Decimal("2.5") != 2.5 (so it'd make equality even more intransitive) and hash(Decimal("2.5")) != hash(2.5) so we couldn't follow the rule about equal objects implying equal hash codes, which is probably more serious. I don't have a principled explanation for Decimal's behavior, but given that it's fixed, I think the behavior of non-integral Rationals is determined too. On the other hand, I currently have a bug where Rational(3,1) != Decimal("3") where the hash code could be consistent. I'll fix that by the next patch.

rhettinger commented 16 years ago

If the consensus is that Decimal should not implement Real, I'll reply to that thread and withdraw the patch.

Thanks. That would be nice.

Raymond, do you want me to add Decimal.__init__(Rational) in this patch

How about focusing on the rational module and when you've done, I'll adapt the Decimal constructor to accept a rational input.

I don't understand the comment about scaling down long integers.

My understanding is that you're going to let numerators and denominators grow arbitrarily large. When they get over several hundred digits each, you will have to scale the down before converting to a float. For example when numerator=long('2'*400+'7') and denominator=long('3'*400+'1'), the long-->float conversion will overflow, so it is necessary to first scale-down the two before computing the ratio: scale=325; float_ratio=float(numerator>>scale)/float(denominator>>scale)

mdickinson commented 16 years ago

More comments, questions, and suggestions on the latest patch:

In _binary_float_to_ratio, the indentation needs adjusting. Also 'top = 0L' could be replaced with 'top = 0', unless you need this code to work with Python 2.3 and earlier, in which case top needs to be a long so that \<\< behaves correctly. Otherwise, this looks good.
Rational() should work, and it should be possible to initialize from a string. I'd suggest that Rational(' 1729 '), Rational('-3/4') and (' +7/18 \n') should be acceptable: i.e. leading and trailing whitespace, and an optional - or + sign should be permitted. But space between the optional sign and the numerator, or on either side of the / sign, should be disallowed. Not sure whether the numerator and denominator should be allowed to have leading zeros or not---probably yes, for consistency with int().
I don't think representing +/-inf and nan as Rationals (1/0, -1/0, 0/0) is a good idea---special values should be kept out of the Rational type, else it won't be an implementation of the Rationals any more---it'll be something else.
hash(n) doesn't agree with hash(Rational(n)) for large integers (I think you already mentioned this above).
Equality still doesn't work for complex numbers:

>>> from rational import *
>>> Rational(10**23) == complex(10**23)  # expect False here
True
>>> Rational(10**23) == float(10**23)
False
>>> float(10**23) == complex(10**23)
True

Why the parentheses around the str() of a Rational?
How about having hash of a Rational (in the case that it's not equal to an integer or a float) be given by hash((self.numerator, self.denominator))? That is, let the tuple hash take care of avoiding lots of hash collisions.

mdickinson commented 16 years ago

About .trim and .approximate:

it sounds like these are different, but quite closely related, methods: one takes a positive integer and returns the best approximation with denominator bounded by that integer; the other returns the 'smallest' rational in a given interval centered at the original rational. I guess we probably don't need both of these, but I can't give any good reason for preferring one over the other.

I don't have anything to offer about names, either.

I can try to find out whether the algorithms are published anywhere on the web---certainly, neither algorithm should be particularly hard to implement and prove the correctness of; they both essentially rely on computing the continued fraction development of the given rational.
Almost any not-too-basic elementary number theory text should contain proofs of the relevant results about continued fractions.

Am willing to help out with implementing either of these, if that's at all useful.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

_binary_float_to_ratio: Oops, fixed.

Rational() now equals 0, but I'd like to postpone Rational('3/2') until there's a demonstrated need. I don't think it's as common a use as int('3'), and there's more than one possible format, so some real world experience will help (that is, quite possibly not in 2.6/3.0). I'm also postponing Rational(instance_of_numbers_Rational).

+/-inf and nan are gone, and hash is fixed, at least until the next bug. :) Good idea about using tuple.

Parentheses in str() help with reading things like "%s**2"%Rational(3,2), which would otherwise format as "3/2**2". I don't feel strongly about this.

Equality and the comparisons now work for complex, but their implementations make me uncomfortable. In particular, two instances of different Real types may compare unequal to the nearest float, but equal to each other and have similar but inconsistent problems with \<=. I can trade off between false ==s and false !=s, but I don't see a way to make everything correct. We could do better by making the intermediate representation Rational instead of float, but comparisons are inherently doomed to run up against the fact that equality is uncomputable on the computable reals, so it's probably not worthwhile to spend too much time on this.

I've added a test that float(Rational(long('2'*400+'7'), long('3'*400+'1'))) returns 2.0/3. This works without any explicit scaling on my part because long.__truediv__ already handles it. If there's something else I'm doing wrong around here, I'd appreciate a failing test case.

The open issues I know of are:

Is it a good idea to have both numbers.Rational and rational.Rational? Should this class have a different name?
trim and approximate: Let's postpone them to a separate patch (I do think at least one is worth including in 2.6+3.0). So that you don't waste time on them, we already have implementations in the sandbox and (probably) a good-enough explanation at http://en.wikipedia.org/wiki/Continued_fraction#Best_rational_approximations. Thanks for the offer to help out with them. :)
Should Rational.from_float() exist and with the current name? If there's any disagreement, I propose to rename it to Rational._from_float() to discourage use until there's more consensus.
Rational.from_decimal(): punted to a future patch. I favor this for 2.6+3.0.
Rational('3/2') (see above)

I think this is close enough to correct to submit and fix up the remaining problems in subsequent patches. If you agree, I'll do so.

mdickinson commented 16 years ago

The latest version of rational.py looks good to me---nice work! (I haven't looked properly at numbers.py or test_rational.py, yet. I do plan to, eventually.)

I do think it's important to be able to create Rational instances from strings: e.g., for easy reading from and writing to files. But maybe I'm alone in this opinion. You say there's more than one possible format---what other formats were you considering?

And since you pointed it out, I think Rational(Rational(a, b)) should work too.

There's also the not-entirely-insignificant matter of documentation :)

Other than that, I don't see why this shouldn't go in.

Other comments:

I have a weak preference for no parentheses on the str() of a Rational, but it's no big deal either way.

I agree that equality and comparisons are messy. This seems almost inevitable: one obvious cause is that the existing int \<-> float comparisons already break the `numeric tower' model (push both operands to the highest common type before operating). So I'm not sure there can be an easy and elegant solution here :(

I like the name Rational for this class. Maybe change the name of numbers.Rational instead?

Postponing trim, approximate, from_decimal sounds fine to me.

Finally: the very first line of rational.py is, I think, no longer accurate. Please add your name so everyone knows who to blame/credit/assign bug reports to :)

mdickinson commented 16 years ago

Just had a quick look at numbers.py. Only two comments:

I think there's a typo in the docstring for Inexact: one of those == should be a !=
Not that it really matters now, but note that at least for Decimal, x-y is not the same as x+(-y) (I'm looking at Complex.__sub__), and +x is not a no-op (Real.real, Real.conjugate). In both cases signs of zeros can be problematic:

>>> x = Decimal('-0')
>>> y = Decimal('0')
>>> x-y
Decimal("-0")
>>> x+(-y)
Decimal("0")
>>> x
Decimal("-0")
>>> +x
Decimal("0")

Of course the first point wouldn't matter anyway since Decimal already implements __sub__;
the second means that if Decimal were to join Real, something would need to be done to avoid Decimal("-0").real becoming Decimal("0").

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

Thanks! I've added some minimal documentation and construction from other Rationals. The other formats for string parsing center around where whitespace is allowed, and whether you can put parens around the whole fraction. Parens would, of course, go away if str() no longer produces them. So they're not significantly different. I guess my objection to construction from strings is mostly that I'm ambivalent, and especially for a core library, that means no. If there's support from another core developer or two, I'd have no objections.

Inexact is saying that one thing could be ==3 and the other ==0, so I think it's correct.

Negative zeros are a problem. :) I think the default implementations are still fine, but you're right that classes like Decimal will need to think about it, and the numbers module should mention the issue. It's not related to the Rational implementation though, so in another patch.

I've submitted this round as r59974. Onto the next patch! :)

mdickinson commented 16 years ago

Inexact is saying that one thing could be ==3 and the other ==0, so I think it's correct.

You're right, of course :)

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

Here's a patch that adds construction from strings (Guido favored them) and .fromdecimal(), changes \_init to __new to enforce immutability, and removes "rational." from repr and the parens from str. I don't expect this to be contentious, so I'll commit it tomorrow unless I hear objections.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

After this come the two approximation methods. Both are implemented using the continued fraction representation of the number: http://en.wikipedia.org/wiki/Continued_fraction#Best_rational_approximations. The first, currently named "trim", takes the closest rational whose denominator is less than a given number. This seems useful for computations in which you want to sacrifice some accuracy for speed. At one point in this discussion, Guido suggested that Decimal removed the need for a method like this, and this type isn't optimized for speed anyway.

The other, currently named "approximate", returns the "simplest" rational within a certain distance of the real value. This seems useful for converting from float and displaying results to users, where we prefer readability over accuracy ("yes, I'll take '1/3' even though it's farther away than '1234/3690'.").

We could provide 0, 1, or both of them, or an accessor for the continued fraction representation of the number, which might help with third-party implementations of both. I've never actually used either of these, so I can't say which is actually more useful. It's probably a good question to send to the full python-dev list. Even if we decide against including them in the class, we might put the implementations into the docs or the test as a demonstration.

mdickinson commented 16 years ago

(About the latest patch): this all looks good to me.

The comment that "Decimal provides no other public way to detect nan and infinity." is not true (though it once was). Decimal has public methods is_nan and is_infinite, added as part of updating to the most recent specification. (Yes, it also has private methods _isnan and _isinfinity, dating from long ago; I'm working on a patch that gets rid of the duplication.)

(About the approximation methods): I agree that these aren't a necessary part of a Rational module---just something that might be nice to have around. So my vote would be for adding either 0 or 1 of these; adding two such similar methods with similar use-cases just seems like a cause of possible confusion to me. I'd also vote against a method for providing the convergents of the continued-fraction, but that's just me. See what python- dev says!

One interesting use-case for approximate is to recover a simple rational from a float, in a case where the float was rational to begin with, but lost a little accuracy in conversion; approximate works well here because you generally have some idea how close the float is to the rational.

rhettinger commented 16 years ago

Were the new methods part of the spec update? If so that's great. If not, we need to take them out. We want zero API creep that isn't mandated by the spec (no playing fast and loose with this module).

mdickinson commented 16 years ago

Raymond:

Were the new methods part of the spec update? If so that's great.

Yes. See http://www2.hursley.ibm.com/decimal/damisc.html

If not, we need to take them out. We want zero API creep that isn't mandated by the spec (no playing fast and loose with this module).

Understood.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

I coulda sworn I looked for is_nan and is_infinite. Oh well, switched to using .is_finite, which is even better and checked in as r60068. Thanks for the pointer.

mdickinson commented 16 years ago

I noticed that the ability to type Rational("2.3") has been added (and I think this is a very useful addition). One minor nit: would it make sense to have Rational("2.") and Rational(".3") also work? These strings work for float() and Decimal(), and 2. and .3 work as float literals, so it would seem to make sense to allow this for Rational as well.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

Whoops, sorry for taking a while to answer. +0 on adding support for '2.' and '.3', given that they're allowed for float and Decimal. Not +1 because they'll make the regular expression more complicated, and they're not exactly necessary, but if you want to add them, go ahead.

mdickinson commented 16 years ago

About Rational('3.') and Rational('.2'):

It's not so much to do with consistency with float and Decimal; more to do with the fact that some people like to write floats these ways (which presumably is why float and Decimal allow such input).

It occurs to me that if you also allow things like Rational('1e+100') then conversions from Decimal to Rational could simply be spelled

Rational(str(decimal_instance))

eliminating the need for the special Decimal -> Rational conversion method.

Anyway, I'll change the regex to allow '3.' and '.3'.

mdickinson commented 16 years ago

Regex changed in r60530.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

I think '1e+100' would constitute feature creep at this point, but thanks for the suggestion, and the improvement in the readability of the regex!

rhettinger commented 16 years ago

The rational constructor should accept decimals directly.

Rational(Decimal('3.1')) does not suffer the same representation error problems as Rational(float('3.1')). The conversion from decimal is lossless and does not depend on the decimal context. There is no need for a separate constructor.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

I think Rational should handle all floating types as consistently as possible, whatever their radix or precision. It's unlikely that people will convert from them often anyway, especially from Decimal, so the shorter conversion from Decimal doesn't seem to outweigh the extra complexity in the constructor's behavior. If I turn out to be wrong about this, we can revisit it in 3.1.

gvanrossum commented 16 years ago

FWIW, if Rational(Decimal(...)) is to be accepted, then Decimal(Rational(...)) should also be accepted, and arguably mixed binary operations as well (Rational(...) + Decimal(...) etc.).

rhettinger commented 16 years ago

I would rather drop it than see that mess.

FWIW, there is a difference. Rational(Decimal(...)) takes place without reference to a decimal.Context and is always lossless.

In contrast, Decimal(Rational(...)) is context sensitive (the division is subject to rounding and precision limits) and is typically lossy as would be the case with Decimal(Rational(1, 3)) which like most rationals cannot be exactly represented in Decimal.

facundobatista commented 16 years ago

I'm +0 to make Decimal(Rational(x,y)) available.

I'd make it something like:

Decimal(R) == Decimal(R.numerator) / Decimal(R.denominator)

Note, as Raymond says, that this division implies the use of the context. So the following will NOT be true:

R == Rational(Decimal(R))

mdickinson commented 16 years ago

from Guido:

I have one minor nit on the current rational.py code: all internal accesses to the numerator and denominator should use self._numerator and self._denominator -- going through the properties makes it *much* slower. Remember that Python function/method calls are slow, and never optimized away. :-)

This isn't quite as simple as doing s/.numerator/._numerator, since the current code makes use of the fact that the int and long types also implement .numerator and .denominator.

Can we follow the approach that Decimal takes: convert subclasses of int and long to Rational before operating? At first sight it seems possible that this might actually slow down code that does a lot of mixed-mode int/long + Rational arithmetic, but I think this is unlikely.
I'll implement this unless there are objections.

I'm also wondering what the policy should be on return types: if a and b are instances of a subclass of Rational, should a+b have return type Rational, or return type equal to that of a and b? Current behaviour of various builtin types and Decimal suggests that a Rational should be returned.

gvanrossum commented 16 years ago

> I have one minor nit on the current rational.py code: all internal > accesses to the numerator and denominator should use self._numerator > and self._denominator -- going through the properties makes it *much* > slower. Remember that Python function/method calls are slow, and never > optimized away. :-)

This isn't quite as simple as doing s/.numerator/._numerator, since the current code makes use of the fact that the int and long types also implement .numerator and .denominator.

Well, but *self.numerator* certainly doesn't need to worry about self being an int or long. :-)

Can we follow the approach that Decimal takes: convert subclasses of int and long to Rational before operating? At first sight it seems possible that this might actually slow down code that does a lot of mixed-mode int/long + Rational arithmetic, but I think this is unlikely. I'll implement this unless there are objections.

It had never occurred to me to do it otherwise. ;-)

I'm also wondering what the policy should be on return types: if a and b are instances of a subclass of Rational, should a+b have return type Rational, or return type equal to that of a and b? Current behaviour of various builtin types and Decimal suggests that a Rational should be returned.

Correct -- the thing is that you can't know the signature of the subclass's constructor so you can't very well blindly call that.

5969680d-a065-4b5f-9706-8216692d995a commented 16 years ago

I figured I'd time the difference before we change the code:

$ ./python.exe -m timeit -s 'import rational; r=rational.Rational(3, 1)'
'r.numerator'
1000000 loops, best of 3: 0.696 usec per loop
$ ./python.exe -m timeit -s 'import rational; r=rational.Rational(3, 1)'
'r._numerator'
10000000 loops, best of 3: 0.155 usec per loop
$ ./python.exe -m timeit -s '(3).numerator'
10000000 loops, best of 3: 0.0324 usec per loop
$ ./python.exe -m timeit -s '(3L).numerator'
10000000 loops, best of 3: 0.0356 usec per loop
$ ./python.exe -m timeit -s 'from rational import Rational' 'Rational(3,
1)'  
10000 loops, best of 3: 80.4 usec per loop
$ ./python.exe -m timeit -s 'from rational import Rational;
r=Rational(3, 1)' 'isinstance(r, Rational)'
100000 loops, best of 3: 10.6 usec per loop

So every time we change .numerator to ._numerator we save half a microsecond. If we construct a new Rational from the result, that's totally drowned out by the Rational() call. Even checking whether we're looking at a Rational costs 20 times as much as the difference, although that can probably be optimized. I think this means that we shouldn't bother changing the arithmetic methods since it makes the code more complicated, but changing unary methods, especially ones that don't return Rationals, can't hurt.

mdickinson commented 16 years ago

Guido:

Correct -- the thing is that you can't know the signature of the subclass's constructor so you can't very well blindly call that.

Jeffrey, is it okay for me to change the three class methods (from_float, from_decimal, from_continued_fraction) to static methods, and call Rational instead of cls as the constructor?

gvanrossum commented 16 years ago

Jeffrey:

Yay for measurements!

I was going to say that __add__ is inefficient because it makes so many function calls, but it turns out that, as you say, the cost of constructing a new Rational instance drowns everything else. On my 2.8GHz iMac, Rational(2,3) costs \~80 usec. This goes down to 50 usec if I make it inherit from object -- the ABC machinery costs 30 usecs! If I then also comment out all the typechecking of numerator and denominator and go straight into the gcd(), the constructor costs go down to 6 (six!) usecs. Beyond that it's slim pickings; replacing super() with object or inlining gcd wins perhaps half an usec. But once the constructor is down to 5-6 usec, the half usec for going through the constructor (times 6 for 6 constructor calls in _add()!) might be a significant gain to also try and inline the common case of the binary operators.

In the mean time I have two recommendations if you want to make the constructor faster without losing functionality: (a) remove the direct inheritance from RationalAbc (using virtual inheritance instead); (b) special-case the snot out of the common path in the constructor (called with two ints).

An alternative might be to have a private class or static method to construct a Rational from two ints that is used internally; it could use object.__new__ instead of super() so as to save the ABC overhead. But I'm not sure if this will always work.

Unrelated issue: I just realized for the first time that we have two classes named 'Rational': the ABC and the concrete implementation. That's going to be awfully confusing. Perhaps it's not too late to rename rational.py/Rational to fractions.py/Fraction?

python / cpython

Move Demo/classes/Rat.py to Lib/fractions.py and fix it up. #46023