pgdr opened this issue 2 years ago
Hi! Thank you for your interest in the Algorithm Archive; as I see, this is your first contribution to our repo. While I recognize our stable marriage code is not the most efficient, I don't think it is a big problem just yet. Could you please provide a fair comparison between the two algorithms? This would mean:

- Checking that the results are the same in (nearly) all possible cases (as the current Python version should have the correct behaviour)
- Providing fair performance-comparing code, so we can use tools like `hyperfine` to compare the two versions easily.

If you have done all that, I think you can open a PR, in which we'll do a code review. As the code you provided in the issue is confusing (in my opinion), we'll definitely need to improve it before we can merge it in.
> Checking that the results are the same in (nearly) all possible cases (as the current Python version should have the correct behaviour)
The results are identical. They both implement the Gale–Shapley algorithm, and interestingly, any Gale–Shapley implementation produces the same unique (man-optimal) matching, regardless of execution order. I have confirmed that the produced matching is the same for both implementations.
> Providing fair performance-comparing code, so we can use tools like `hyperfine` to compare the two versions easily.
```
(e39) [~/.../python]$ hyperfine --warmup 3 "python sm.py 1000" "python stable_marriage.py 1000"
Benchmark 1: python sm.py 1000
  Time (mean ± σ):     745.5 ms ±   5.8 ms    [User: 721.9 ms, System: 23.2 ms]
  Range (min … max):   739.6 ms … 758.7 ms    10 runs

Benchmark 2: python stable_marriage.py 1000
  Time (mean ± σ):     45.998 s ±  1.857 s    [User: 45.985 s, System: 0.011 s]
  Range (min … max):   43.186 s … 49.517 s    10 runs

Summary
  'python sm.py 1000' ran
   61.70 ± 2.54 times faster than 'python stable_marriage.py 1000'
```
`sm.py` is the quadratic-time algorithm attached above, and 1000 means that N = 1000 in every run.
Both used the same seed in all ten runs. If necessary, I can try to get `hyperfine` to set a different seed in each run.
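hyperfine's parameter lists would be the natural tool for that, assuming both scripts accepted the seed as a second argument (they currently don't, so the invocation below is hypothetical):

```
hyperfine -L seed 1,2,3,4,5 "python sm.py 1000 {seed}" "python stable_marriage.py 1000 {seed}"
```

(Strictly speaking, this benchmarks each seed separately rather than changing the seed between runs of a single benchmark.)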
> The results are identical. They both implement the Gale–Shapley algorithm, and interestingly, any Gale–Shapley implementation produces the same unique (man-optimal) matching, regardless of execution order. I have confirmed that the produced matching is the same for both implementations.
OK, good. We won't have to check too much, then.
The thing I don't really understand is where you got a `stable_marriage.py` that accepts an argument, which we don't have in the AAA. If that's our code that you modified, you need to provide the modifications you made, because, right now, the code just can't accept more than 26 people due to the data representation we chose.
While in general I'd prefer fast code, in this case I'm not sure that's necessary, as the AAA has a mainly pedagogical value, although it should definitely talk about performance in a dedicated section.
EDIT: What I failed to mention is that we have to assume people don't have a background in programming or algorithms, so clarity is key, sometimes even at the cost of performance.
This looks good to me. To be honest, the stable marriage chapter is one of the few that need a major overhaul. There does not seem to be any standard implementation that spans the different languages, so the Python version is radically different from the Julia one (for example).
Right now, I find the current Python implementation a bit harder to read than the one you provided, but we might need to work on it to standardize it across the other languages as well.
> you need to provide the modifications you made
Here is the modification that allowed me to run it for an arbitrary number of pairs:
```diff
-def main():
+def main(num_pairs):
     # Set this to however many men and women you want, up to 26
-    num_pairs = 5
     # Create all Person objects
-    men = [Person(name) for name in ascii_uppercase[:num_pairs]]
-    women = [Person(name) for name in ascii_lowercase[:num_pairs]]
+    men = [Person(f"M{name}") for name in range(num_pairs)]
+    women = [Person(f"W{name}") for name in range(num_pairs)]
     # Set everyone's preferences
     for man in men:
@@ -136,4 +135,5 @@ class Person:
 if __name__ == '__main__':
-    main()
+    import sys
+    main(int(sys.argv[1]))
```
> the AAA has a mainly pedagogical value
I realize that, and I think that's a good thing. I just don't know whether the current SM implementation is optimal in that regard either. I think the OOP-based approach to a relatively simple algorithm makes it more difficult to see what's going on.
The stable marriage algorithm works as follows:

```
while some man is unmatched:
    he proposes to the best woman he hasn't proposed to yet
    she accepts if she is unmatched or this man is better than her current partner
```

I think that an algorithm in the AAA should be as close in spirit to that one as possible.
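For concreteness, here is a rough sketch of what a worst-case quadratic implementation in that spirit can look like (this is only an illustrative sketch, not the exact `sm.py` attached above; names like `rank`, `next_choice`, and `husband` are mine):

```python
def gale_shapley(men_pref, women_pref):
    """men_pref[m] and women_pref[w] are preference lists, best first."""
    n = len(men_pref)

    # Precompute rank[w][m] = position of man m on woman w's list, so that
    # "does w prefer m to her current partner?" is an O(1) lookup instead
    # of a linear search through her preference list.
    rank = [[0] * n for _ in range(n)]
    for w, prefs in enumerate(women_pref):
        for position, m in enumerate(prefs):
            rank[w][m] = position

    next_choice = [0] * n   # index of the next woman on each man's list
    husband = [None] * n    # husband[w] = current partner of woman w
    free_men = list(range(n))

    while free_men:
        m = free_men.pop()
        w = men_pref[m][next_choice[m]]  # best woman m hasn't proposed to
        next_choice[m] += 1
        if husband[w] is None:
            husband[w] = m               # she is unmatched: accept
        elif rank[w][m] < rank[w][husband[w]]:
            free_men.append(husband[w])  # she trades up, her ex is free
            husband[w] = m
        else:
            free_men.append(m)           # rejected, m stays free

    return husband


if __name__ == "__main__":
    # Tiny example with n = 3; indices stand in for names.
    men_pref = [[0, 1, 2], [1, 0, 2], [0, 2, 1]]
    women_pref = [[1, 0, 2], [0, 1, 2], [2, 1, 0]]
    print(gale_shapley(men_pref, women_pref))  # husband of each woman
```

Each man proposes to each woman at most once, so the loop performs at most n² proposals, and the precomputed rank table makes each acceptance check O(1). By the Gale–Shapley theorem, the resulting man-optimal matching is the same no matter in which order the free men are processed.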
A different issue is that more modern expositions of the stable matching problem use "hospitals" and "students" rather than "men" and "women". But that's a different discussion.
> the stable marriage chapter is one of the few that need a major overhaul
I could also take a shot at that while I'm at it (if I'm creating a PR anyway).
PPS: I don't think it belongs under decision problems (which typically have yes/no answers). The stable marriage problem always has a solution. Perhaps it would be better suited under a flows & matchings category?
Yeah, I agree with the points here. I'm also happy to change the terminology to stable matching instead of stable marriage (and to take down my YouTube video on the topic). At the time, I had not seen the stable matching terminology; I wish I had, because it caused a bit of a rift in the community and I didn't know how to fix it.
I think I would ideally like to revamp the entire chapter to be more in line with current standards.
@pgdr Yep, I'm okay with those modifications for benchmarking, although we don't typically take runtime input like that (because of an ongoing-ish effort to standardize inputs and outputs).
I also agree that OOP might not be the best approach, although it was the approach I went for when I wanted to improve it (it might have been misguided). It is also possible that output to stdout takes a long time, since you're writing 3000 lines (for N = 1000). A quick test without output gave me around 30 seconds with the current version (N = 1000), so about a third less than what you got, but still significant.
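One quick way to rule the terminal in or out without touching the code is to redirect stdout; hyperfine runs commands through a shell by default, so the redirection applies to the Python process:

```
hyperfine --warmup 3 "python stable_marriage.py 1000 > /dev/null"
```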
If both of you want a revamp of the chapter, we'll need to rewrite all the implementations anyway, so we might as well do a compromise solution (your code, but a bit less confusing).
After a "quick" profile, these two are the longest running functions:
```
$ python3 -m cProfile stable_marriage.py 1000
         250785690 function calls (250785602 primitive calls) in 955.653 seconds

   Ordered by: standard name

      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       [...]
   238405442  425.530    0.000  425.530    0.000 stable_marriage.py:104(partner)
       [...]
     1487000  484.683    0.000  909.195    0.001 stable_marriage.py:89(pick_preferred)
```
So the method call overhead seems to be part of the problem, since the time per call of those two functions is really low.
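As an aside, cProfile can also sort by total time directly, which puts these hot functions at the top instead of leaving them buried in the alphabetical listing:

```
python3 -m cProfile -s tottime stable_marriage.py 1000
```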
After a "quick" profile, these two are the longest running functions
My point is that the worst-case complexity of the current implementation is Θ(n³), so it's not really the method call overhead that's doing it; it's just the sheer number of operations. When N = 3000, we have something like 10^10 "operations" (3000³ = 2.7 × 10^10), and if each "operation" takes at least a couple of CPU cycles, then time starts to fly rather quickly. So I don't think there's any point in profiling the implementation.
Indeed, just looking at the number of function calls (for N = 3000):

Current implementation:

```
2833248468 function calls (2833248370 primitive calls) in 527.844 seconds
```

Alternative implementation:

```
62561677 function calls (62561636 primitive calls) in 12.986 seconds
```

That's roughly 2833 million vs. 63 million function calls.
> My point is that the worst-case complexity of the current implementation is Θ(n³), so it's not really the method call overhead that's doing it; it's just the sheer number of operations.
Yeah, that's what I meant but didn't articulate properly: the time cost is dominated by method-call (and profiling) overhead because of the sheer number of operations. It seems like the average-case complexity is close to Θ(n³) as well.
It's been a while since I last computed complexities, so I'll have to take some time to do that on my own.
Okay, I see where that comes from: a linear search (`pick_preferred`) inside a double loop... yep, obviously O(n³). With the current architecture, it will be basically impossible to go below O(n² log n), so we should not keep it unless it's far superior in terms of clarity to any improvement.
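For reference, the usual way to get down to O(n²) is to precompute ranks once, so that each comparison becomes an O(1) lookup. A sketch of the idea (illustrative names, not a drop-in patch for our current file):

```python
class Person:
    def __init__(self, name, preferences):
        self.name = name
        self.preferences = preferences  # list of names, best first
        # Precompute each candidate's position once; comparing two suitors
        # then costs one dict lookup instead of a linear scan per proposal.
        self.rank = {p: i for i, p in enumerate(preferences)}

    def prefers(self, a, b):
        """True if this person ranks candidate a above candidate b."""
        return self.rank[a] < self.rank[b]
```

With such a table in place, the acceptance check in the main loop is constant time, and the whole algorithm drops to O(n²).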
We'll see when we get around to rewriting the chapter. If @leios is comfortable letting you do this chapter rewrite, then please do take a few days to do it, and we'll review both the new code and the new chapter fairly, although that has historically been a very long process. Otherwise, @leios should (probably) get some AAA work done before Oct 15, so feel free to be part of the chapter review process and to contribute Python code when the chapter is merged.
Bug Report
Description
The Python implementation in the stable marriage chapter is worst-case Θ(n³), but it should be O(n²).
I have attached an alternative implementation that runs in worst-case quadratic time. I can make a PR if you want to replace the current algorithm.
Steps to Reproduce
Run the code with N = 3000.
Expected behavior
The program terminates within seconds.