arnavs closed this issue 6 years ago
I believe this was discussed in the original PR as well, where I think Antoine asked why set it to zero. I'm guessing it was to have the default do the simplest thing? Either way, it's just a matter of properly documenting it, in my mind.
Yeah, I think we discussed this there as well. The thing, though, is that most users won't dig into the docs... they'll just want to use the `fixedpoint()` function, and get dismayed that it's 2x or 4x as slow as the naive iteration they'd write.
If it's not too much trouble (and it seems that we're on the same page/most people use `m > 0` in their own work?), then I think this would be a good patch.
Do you have some examples of such timings I can play with?
@arnavs I feel like we'll never get to performance parity with something as simple as the `iterate!` in the test file, but check out the changes in the `m0special` branch if you want to get rid of some of the overhead for very simple problems.
I think there are two things here.
First, the default setting for `m` should almost always be greater than 0. I think this should probably apply to `solve` as well as `fixedpoint`... We want it fast out of the box, and this gain is free! People told me that `m` was usually set around 10, if I recall, so going with something like 7 may be appropriately conservative. This is all Arnav was trying to do, so that the people using it (e.g. the QuantEcon lectures) see that it dominates simple iteration out of the box.
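To make the out-of-the-box comparison concrete, here is a rough sketch of both methods: plain fixed-point iteration versus a textbook Anderson step with a small memory. This is illustrative Python/NumPy, not NLsolve's implementation; the function names are made up, and the Anderson update is the standard least-squares formulation:

```python
import numpy as np

def picard(g, x0, tol=1e-10, maxiter=1000):
    """Naive fixed-point (Picard) iteration: repeat x <- g(x)."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    for k in range(1, maxiter + 1):
        xn = np.atleast_1d(g(x))
        if np.linalg.norm(xn - x) < tol:
            return xn, k
        x = xn
    return x, maxiter

def anderson(g, x0, m=5, tol=1e-10, maxiter=1000):
    """Textbook Anderson acceleration with memory m; m = 0 reduces to Picard."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    X, G = [], []  # histories of iterates and of g(iterate)
    for k in range(1, maxiter + 1):
        gx = np.atleast_1d(g(x))
        f = gx - x  # residual
        if np.linalg.norm(f) < tol:
            return gx, k
        X.append(x)
        G.append(gx)
        mk = min(m, len(X) - 1)  # effective memory this step
        if mk == 0:
            x = gx  # plain iteration step (this is all m = 0 ever does)
        else:
            # Least-squares combination of the last mk+1 residuals.
            dF = np.column_stack(
                [(G[-j] - X[-j]) - (G[-j - 1] - X[-j - 1]) for j in range(mk, 0, -1)]
            )
            dG = np.column_stack([G[-j] - G[-j - 1] for j in range(mk, 0, -1)])
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - dG @ gamma
    return x, maxiter
```

On `g(x) = cos(x)`, for example, the plain iteration needs dozens of steps at a 1e-10 tolerance, while Anderson with even `m = 1` converges in a handful, which is the sense in which a small positive default `m` dominates naive iteration.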
The second issue is people who might compare their handwritten iteration to the `m=0` case, where I found a performance degradation of about 2x to 4x.
I think that should be handled separately, but it is lower priority (since I see few reasons to ever use `m=0` in practice).
I think it's fine to use the standard `m` for `solve`; I wasn't the one who argued for 0 in the first place :)
> about 2 to 4
2x is fine, 4x is annoying; that's why I made the changes in the `m0special` branch.
@jlperla But I should add that I'm talking about small toy problems of 1 or 2 dimensions. If you have any fixed-point problems with actual computational costs in there (say, high-dimensional linear algebra), I'd be very surprised and sad to see a factor-of-4 difference. That should be reported.
No, that 2-4x slowdown for the `m=0` case was when comparing against solving a Bellman equation with maybe 50-60 dimensions when I tried it last.
This shouldn't be too surprising. When `m=0`, it is exactly doing a nested iteration of the function, but it keeps around a huge amount of the overhead and infrastructure you require for the `m > 0` case: the caching, etc. I bet if you wrote an `m=0`-specific case which didn't use that baggage, then the differences would disappear... but it isn't the highest priority, since we wouldn't really want people to use `m=0`.
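For reference, the kind of baggage-free `m=0` loop being described here is tiny. A sketch (illustrative Python/NumPy with hypothetical names, not the `m0special` code): one preallocated buffer, buffer swapping instead of copying, and none of the Anderson history or least-squares machinery:

```python
import numpy as np

def iterate_m0(g, x, tol=1e-10, maxiter=1000):
    """Bare m = 0 loop: one preallocated scratch buffer and no Anderson
    history, caching, or least-squares machinery. `g(out, x)` must write
    g(x) into `out` in place. Note: this mutates the input array."""
    out = np.empty_like(x)
    for k in range(1, maxiter + 1):
        g(out, x)  # evaluate into the scratch buffer, no allocation
        if np.linalg.norm(out - x) < tol:
            return out, k
        x, out = out, x  # swap buffers instead of copying
    return x, maxiter
```

A loop like this does exactly what a handwritten nested iteration does, which is why a specialized `m=0` path should close most of the observed gap.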
@pkofod Just to confirm, is there a reason not to merge this PR and the other one (with the `!inplace`)? It would be good to reduce baggage on `m=0` Anderson acceleration for sure, but I'm confused as to whether or not you think these are acceptable changes?
> I bet if you wrote an `m=0`-specific case which didn't use that baggage, then the differences would disappear... but it isn't the highest priority, since we wouldn't really want people to use `m=0`
Did you check my `m0special` branch that I mentioned a few times in this thread?
> Just to confirm, is there a reason not to merge this PR and the other one (with the `!inplace`)?
Not beyond the usual grace period for PRs, where people are allowed to give comments that you and I might not have thought of.
@pkofod OK, thanks. To be clear, I think special logic for `m=0` is definitely a good thing, but I just wasn't sure why these were still open.
In any case, let's leave this open so people can weigh in (if they choose), and then hopefully wrap these up soon.
> but I just wasn't sure why these were still open.
Well, one PR was opened a day ago and one was opened two hours ago. Keep in mind that these are breaking changes, and neither PR has any updates to `README.md` or any tests. I could merge them, but at the risk of having to do these things myself. I appreciate the effort, enthusiasm, and advertisement value, but the PRs have to be somewhat full-bodied.
I know internet communication can be difficult, so some might read the paragraph above as me being annoyed - I'm not! I just can't tag a breaking version of the package without proper adjustments to the "docs" (the README) and the tests. One-liners correlate with code debt whether we like it or not.
@pkofod Just to confirm, are these "breaking" in the sense that tests are failing? Because that's not something I observed.
You're correct that the `README` should be updated.
> Just to confirm, are these "breaking" in the sense that tests are failing? Because that's not something I observed.
No, but a change that changes the behavior of someone's program is a breaking change in the sense that it requires a major version bump according to semver. I don't think these changes can be categorized as bug fixes, so they're breaking.
But let me clarify: I don't mind at all to do a major version bump for these changes.
@pkofod OK, understood. I'll flesh out these PRs and hopefully we can move forward soon.
It would be good to set the default memory parameter to something where this has an advantage over naive Picard/Richardson-style iteration. I played around with the damping factor a bit, but couldn't achieve any strict improvements.
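For reference, the damping experiment described here corresponds to the relaxed (damped Picard/Richardson) iteration `x <- (1 - beta) * x + beta * g(x)`. A sketch, in illustrative Python/NumPy with hypothetical names, not the package's code:

```python
import numpy as np

def damped_picard(g, x0, beta=1.0, tol=1e-10, maxiter=1000):
    """Damped Picard/Richardson iteration: x <- (1 - beta)*x + beta*g(x).
    beta = 1 recovers plain fixed-point iteration; the fixed points of the
    damped map are the same as those of g itself."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    for k in range(1, maxiter + 1):
        xn = (1.0 - beta) * x + beta * np.atleast_1d(g(x))
        if np.linalg.norm(xn - x) < tol:
            return xn, k
        x = xn
    return x, maxiter
```

Damping mainly changes the linear convergence rate (and can rescue maps that oscillate or diverge under plain iteration), but it stays a first-order method, which is consistent with not finding strict improvements from the damping factor alone.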
cc: @jlperla