bertdobbelaere / SorterHunter

An evolutionary approach to find small and low latency sorting networks
MIT License

Any Estimation of Runtime and Compute? #5

Closed: ZhongxiaYan closed this issue 2 years ago

ZhongxiaYan commented 2 years ago

Hey, very cool work! I'm curious about the runtime and compute you've been using to find these upper bounds for the different sizes. If you have timings and the number of CPUs used for any or all of them, it would be great if you could share! Thanks in advance!

bertdobbelaere commented 2 years ago

Hi, unfortunately I didn't keep track of running times for individual networks. I estimate the total running time for all my findings at about 4 months on my venerable Core i7 4770K (running 6 instances in parallel on average). The time it takes to find a network depends a lot on the network size, the size of the prefix and other parameters in the config file. A few results were obtained in seconds, others took weeks. I just ran a benchmark on an example to give you an idea: it takes about 7 seconds on average to find a 91-comparator network for n=20 (same machine, single thread, averaged over 10 runs), starting from a 2-layer fixed prefix. See the contents of the benchmark config below. A network of the same size was independently discovered by Hormoz Shahrzad et al. (see https://arxiv.org/pdf/1906.04050.pdf) and took months to converge, though of course it is not really fair to compare a dedicated program with a demonstration of a more general approach.

Start of benchmark config file

Ninputs=20
Symmetric=1
RestartRate=5000000
EscapeRate=100
MaxMutations = 1
WeigthRemovePair = 1             # Remove a random pair
WeigthSwapPairs = 1              # Swap two random pairs
WeigthReplacePair = 0            # Replace random pair with another pair
WeightCrossPairs = 1             # Cross two pairs at random positions
WeightSwapIntersectingPairs = 1  # Swap pairs in neighbouring layers sharing a connection
WeightReplaceHalfPair = 1        # Replace one of the two connections of a random pair
PrefixType = 1
FixedPrefix=(0,1),(2,3),(4,5),(6,7),(8,9),(10,11),(12,13),(14,15),(16,17),(18,19),(0,2),(1,3),(4,6),(5,7),(8,10),(9,11),(12,14),(13,15),(16,18),(17,19)

End of benchmark config file
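
To make the benchmark concrete, here is a small standalone sketch (my own illustration, not SorterHunter's code) of how a candidate comparator network for n=20 can be verified with the zero-one principle: a comparator network sorts all inputs if and only if it sorts every 0/1 vector. The prefix below is the 2-layer FixedPrefix from the config above; on its own it is only a partial network, so the check is expected to fail until the comparators found by the search are appended.

```cpp
// Hedged sketch, not SorterHunter code: verify a comparator network on n
// inputs with the zero-one principle (it sorts everything iff it sorts
// all 2^n binary vectors). Comparator (a,b) with a<b moves the minimum
// to position a and the maximum to position b.
#include <bitset>
#include <cstdint>
#include <cstdio>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;

bool sortsAllBinaryInputs(const std::vector<Pair>& net, int n)
{
    for (uint32_t v = 0; v < (1u << n); ++v) {
        uint32_t x = v;                       // bit i = value at position i
        for (const auto& [a, b] : net) {
            uint32_t lo = (x >> a) & 1u, hi = (x >> b) & 1u;
            if (lo > hi)                      // out of order: swap the two bits
                x ^= (1u << a) | (1u << b);
        }
        // A sorted 0/1 vector has all of its ones at the highest positions.
        uint32_t ones = static_cast<uint32_t>(std::bitset<32>(x).count());
        uint32_t sorted = ones ? ((1u << ones) - 1u) << (n - ones) : 0u;
        if (x != sorted) return false;
    }
    return true;
}

int main()
{
    // The 2-layer FixedPrefix from the benchmark config (n=20).
    std::vector<Pair> prefix = {
        {0,1},{2,3},{4,5},{6,7},{8,9},{10,11},{12,13},{14,15},{16,17},{18,19},
        {0,2},{1,3},{4,6},{5,7},{8,10},{9,11},{12,14},{13,15},{16,18},{17,19}};
    // Expected output: "no" for the prefix alone; appending the evolved
    // comparators from the search should turn this into "yes".
    std::printf("prefix alone sorts n=20: %s\n",
                sortsAllBinaryInputs(prefix, 20) ? "yes" : "no");
}
```

For n=20 this sweeps about a million 0/1 vectors, so a full ~91-comparator candidate can be checked in well under a second on a desktop CPU.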

ZhongxiaYan commented 2 years ago

Wow I see, that's really cool. Thanks for sharing these results :)

ZhongxiaYan commented 2 years ago

Hm, is there any chance you could share the prefixes (and, if possible, the other config settings) for the different values of n? I'm trying to run n=19 now; it seems to take much longer than n=20 and is still quite far from the upper bound. I presume some of this could be due to the lack of symmetry in the odd cases. But in general, what's a good way to determine which prefix (size and/or exact pairs) to use? Or do you typically use the greedyA approach (in which case you still need to pick a size)?

ZhongxiaYan commented 2 years ago

Ah I see, it looks like at least part of the reason is that I didn't set Symmetric=0 for n=19. It gets to 85 now with the FixedPrefix after making that change.
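
For reference, my understanding (an assumption on my part, not necessarily how the tool implements it) is that Symmetric=1 restricts the search to networks that are invariant under the index reflection i -> n-1-i, which is the usual notion of a symmetric sorting network; the 2-layer FixedPrefix in the benchmark config has exactly that property. A quick sketch of the check:

```cpp
// Hedged sketch, not SorterHunter code: test whether a comparator list is
// symmetric under the reflection i -> n-1-i. Comparators are stored with
// a < b, so the mirror of (a,b) is (n-1-b, n-1-a).
#include <algorithm>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;

bool isReflectionSymmetric(std::vector<Pair> net, int n)
{
    std::vector<Pair> mirrored;
    mirrored.reserve(net.size());
    for (const auto& [a, b] : net)
        mirrored.emplace_back(n - 1 - b, n - 1 - a);
    // Compare as multisets of comparators; this ignores where each
    // comparator sits in the sequence.
    std::sort(net.begin(), net.end());
    std::sort(mirrored.begin(), mirrored.end());
    return net == mirrored;
}
```

The 20-input prefix above passes this check (each of its two layers maps onto itself under the reflection).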

bertdobbelaere commented 2 years ago

Indeed, Symmetric=1 makes no sense for an odd number of inputs. With the most recent commit this switch is ignored for odd N. I added an additional example config as well. Unfortunately I didn't keep a database of all the configs used so far. A few empirical rules for good results:

ZhongxiaYan commented 2 years ago

These are great insights, wow! I really appreciate the visualizations, depth layers, and notes on your website too: https://bertdobbelaere.github.io/sorting_networks.html. Very nicely done! May I ask how long you've been working on this project (excluding the 4 months of compute time)? Edit: ah, I see your improvement history now; still curious how many hours it took to develop these ideas, though!

bertdobbelaere commented 2 years ago

Thanks very much! Years ago, I accidentally stumbled on the title of a paper mentioning sorting networks; I was curious what it was all about and started looking into the subject. I wrote a small Python program back then to test my ideas for heuristic optimization, and found an improvement for the n=20 case. I rewrote it in C++, found a few more networks and decided to share my results on GitHub. I return to the subject every now and then when I see a new paper, receive a question or have an idea for an improvement. Early this year, I updated the program and extended the list of networks (at the suggestion of Lord Control) to <=64 inputs. It's hard to estimate, but I think I've easily spent 200+ hours on this altogether.

ZhongxiaYan commented 2 years ago

Heh, I see. I'd say 200 hours is impressively little for how well your code performs. I read through it and found that you used quite a lot of tricks here and there. Definitely worthy of 200+ hours.