Open pbj4 opened 1 year ago
I did some testing on my FX-8350 (mostly "y" revision, but the "z" seems similar) and got the following timing results (with updated numbers from mine for comparison): N=12: 8s (9s) N=13: 58s (63s) N=14: 437s (546s)
Indeed it seems your methods are faster, and it seems the scaling is better as N increases. I also find it interesting this was possible while checking reflections at the same time. I do wonder how much of the speedup was due to improvement of the removal check versus other factors. I did notice revision "v" is 13.6s vs "p" at 23.29s (N=12), so perhaps that might indicate part of the speed difference - though I'm sure there's some other optimizations you're doing somewhere in there.
I'm not sure if I'm interpreting what you're doing correctly, but are you only propagating the "4D-rotation" unique polycubes and then checking for symmetry to back-calculate the "3D-rotation" unique polycubes? I could see why that would scale better to larger N values if that's the case, and I would expect it to be the main difference in speed. I was also thinking it might be interesting to do something similar with the "translation" unique polycubes, though I'm not sure how that could be done efficiently.
From what I can see, the big thing missing is the ability to read / write some sort of output file, or otherwise some way to feed a known set of starting polycubes to generate segments for distributed jobs. With that in place, I would expect N=19 or N=20 could be quite feasible.
@snowmanam2 Thank you for your feedback. Your interpretation is correct.
I've written the distributed system now and used it to run n = 17 in 8h37m, and it agrees with the 3d rotation count of 457409613979
as well as generating the 4d rotation count of 228779330204
.
@hidny seems interested in this version of the problem as well. I think there's a high probability my count is correct since the one derived from it agrees with the other implementations, but it would be good for someone else to confirm as well.
Unfortunately, I won't be able to confirm your number anytime soon. My java implementation is pretty slow compared to the Rust version of it and it would take me around 2 weeks of CPU time to get the number. I also have no experience with Rust. I might eventually experiment with the Rust code and get it that way, but that won't be any time soon.
@snowmanam2 Thank you for your feedback. Your interpretation is correct.
I've written the distributed system now and used it to run n = 17 in 8h37m, and it agrees with the 3d rotation count of
457409613979
as well as generating the 4d rotation count of228779330204
.@hidny seems interested in this version of the problem as well. I think there's a high probability my count is correct since the one derived from it agrees with the other implementations, but it would be good for someone else to confirm as well.
Hi all First time posting on this although I have been following it for a couple of months after I saw Mike's first video.
I have been doing some work on the processing side, not the coding or algorithms and wanted to know what is best to feedback/upload. I have migrated the RUST implementation to Windows as I have access to a few machines with that OS. Also optimised it on Mint. Definitely getting a performance boost. anywhere from 10% to 50% although I am dubious on the higher increase as that may be cached data. Running in a VM but 20 threads does work well. Servers on Intel are terrible though (or it may be MS VS crap compiler) Also good results in a VM on my PC. an oc'd AMD R7 5700X. And AMD multi-threading is better for these sort of calculations.
So wonder if you can get back to me @pbj4 to see if what I am working on can help?
Thanks
Tony
Some results...
Fresh compile on Intel server - Windows 11 VM Hyper-V on Server 2022
c:\Scripts\polycubes-single>%CD%\target\x86_64-pc-windows-msvc\release\polycubes 12 enumerating up to n = 12... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 total time: 3.3252505s performance: 6472882 r/s, 3265525 p/s
c:\Scripts\polycubes-single>%CD%\target\x86_64-pc-windows-msvc\release\polycubes 13 enumerating up to n = 13... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 total time: 23.9943311s performance: 6667683 r/s, 3349634 p/s
c:\Scripts\polycubes-single>%CD%\target\x86_64-pc-windows-msvc\release\polycubes 14 enumerating up to n = 14... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 n: 14, r: 1039496297, p: 520878101 total time: 193.2894437s performance: 6205630 r/s, 3110621 p/s
c:\Scripts\polycubes-single>%CD%\target\x86_64-pc-windows-msvc\release\polycubes 15 enumerating up to n = 15... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 n: 14, r: 1039496297, p: 520878101 n: 15, r: 7859514470, p: 3934285874 total time: 1538.2618343s performance: 5889112 r/s, 2948481 p/s
Update - heres my AMD - not on a VM - compiled exe: F:\VirtualMachines\ISOs>polycubes 14 enumerating up to n = 14... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 n: 14, r: 1039496297, p: 520878101 total time: 91.5285238s performance: 13105017 r/s, 6568994 p/s
And heres the "server" - compiled exe - not a VM -
C:\Scripts\Polycubes>polycubes 14 enumerating up to n = 14... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 n: 14, r: 1039496297, p: 520878101 total time: 191.1922358s performance: 6273700 r/s, 3144742 p/s
Updated the Linux optimisation - and also implemented another tweak - that definitely has a positive impact LinuxMint VM on my AMD 7 5700X
enumerating up to n = 12... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 total time: 1.473500887s performance: 14607358 r/s, 7369314 p/s
enumerating up to n = 13... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 total time: 11.122578489s performance: 14383949 r/s, 7226043 p/s
enumerating up to n = 14... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 n: 14, r: 1039496297, p: 520878101 total time: 88.986149132s performance: 13479433 r/s, 6756673 p/s
enumerating up to n = 15... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 n: 14, r: 1039496297, p: 520878101 n: 15, r: 7859514470, p: 3934285874 total time: 676.740578156s performance: 13386218 r/s, 6702030 p/s
a small improvement on the linux compile again. But the 16 calculation had an error earlier when I tried. Probably the overclocked memory or CPU at constant 100% skipped a cycle.
enumerating up to n = 15... results: n: 1, r: 1, p: 1 n: 2, r: 1, p: 1 n: 3, r: 2, p: 2 n: 4, r: 8, p: 7 n: 5, r: 29, p: 23 n: 6, r: 166, p: 112 n: 7, r: 1023, p: 607 n: 8, r: 6922, p: 3811 n: 9, r: 48311, p: 25413 n: 10, r: 346543, p: 178083 n: 11, r: 2522522, p: 1279537 n: 12, r: 18598427, p: 9371094 n: 13, r: 138462649, p: 69513546 n: 14, r: 1039496297, p: 520878101 n: 15, r: 7859514470, p: 3934285874 total time: 634.009750047s performance: 14288419 r/s, 7153732 p/s
Added my suggestions, optimisations and tutorials to @pbj4 update - https://github.com/pbj4/polycubes/issues/1
Let me know your thoughts please
Thanks
Tony
It's possible to count the 3d rotation polycubes when generating the 4d rotation polycubes using the following relationship:
I have a repo that implements this idea, #11, #49, and a streaming articulation point algorithm. The performance seems competitive, but I would appreciate if someone else could benchmark it as well. For n = 12 my laptop takes 5.6s to run mine, 7s to run #49 with ./polycube_generator 12 -a, 11.2s to run #27, and 39.6s to run the current rust version with cargo run -r -- enumerate -m hashless 12 and a n = 11 cache file.