dzamlo opened this issue 1 year ago
That's really cool!
Do you think it's possible to parallelize this solution? From a hunch it feels like it should be quite parallelizable given that it's all math-based and stuff :D
Also: what does it use nightly for? I think using lazy_static would make it compile just fine on stable. If it could be a bit more idiomatic, that would also be nice; right now it just looks like a C project in a Rust jacket :P
> Do you think it's possible to parallelize this solution?
I think it's quite easy, at the cost of some cloning.
> what does it use nightly for?
For LazyLock, which could be replaced by lazy_static, and for some things in the const fns. Those could also be replaced by lazy_static, but the performance impact should be measured.
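For reference, the stable-Rust lazy-initialisation pattern looks roughly like this. The thread suggests the lazy_static crate; std::sync::OnceLock (stable since Rust 1.70) gives the same behaviour without an external dependency. The static shown is hypothetical, not from the repo:

```rust
use std::sync::OnceLock;

// Hypothetical lookup table; the real statics in polycubes2 differ.
fn rotation_table() -> &'static Vec<u64> {
    static TABLE: OnceLock<Vec<u64>> = OnceLock::new();
    // Built once on first access, then shared by every caller.
    TABLE.get_or_init(|| (0..24).map(|i| 1u64 << i).collect())
}

fn main() {
    assert_eq!(rotation_table().len(), 24);
    assert_eq!(rotation_table()[3], 8);
}
```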
> If it could be a bit more idiomatic that would also be nice, now it just looks like a C-project in a rust jacket
I agree. I ported the code from the Java version and I tried to be idiomatic, but it's hard when you port code you don't fully understand. Also, code in a const fn has to be non-idiomatic to compile (no for loops, for example).
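To illustrate the const fn point: for loops desugar to Iterator trait calls, which aren't usable in const context on stable, so const code ends up hand-rolled with while loops. A made-up example, not from the repo:

```rust
// A const fn can't use `for`, so we index manually with `while`.
const fn popcount_table() -> [u8; 16] {
    let mut table = [0u8; 16];
    let mut i = 0;
    while i < 16 {
        table[i] = (i as u32).count_ones() as u8;
        i += 1;
    }
    table
}

// Evaluated entirely at compile time.
const POPCOUNT: [u8; 16] = popcount_table();

fn main() {
    assert_eq!(POPCOUNT[0], 0);
    assert_eq!(POPCOUNT[7], 3);
    assert_eq!(POPCOUNT[15], 4);
}
```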
I would gladly make improvements to the code base, but I'm waiting for my new laptop as my current one just died.
Hi @dzamlo and @datdenkikniet,
I feel I should add my two cents. It is parallelizable, and I've already implemented something that divides the problem into multiple batches that can be run separately. I've already got it solving the N=17 case, and if my 9-year-old laptop doesn't need a restart or get so hot that I'm compelled to turn it off, it will take around 2 to 3 weeks to complete.
Unfortunately, the code that makes it parallelizable is a little hard to understand, and I'll need to write a README and add features to make it more understandable (I might even add a GUI, but I'm not sure).
I'll try to find time to get it done. I'm a bit busy and I only have around 1-3 hours of time to work on it every day. Here's the current location of my code: https://github.com/hidny/MikeTsCubeCode/tree/main
By the way, I feel like you nailed it when you said: "it just looks like a C-project in a rust jacket :P". I was thinking that I should port my code into C because my code felt like C code in a Java jacket :P.
As a heads up, it seems like @NailLegProcessorDivide has taken up the challenge of this approach as well in #28 :D
Very cool! I'm going to need to see if I can find time to take a look at how this compares to my current implementation.
My program finally finished going through all the solutions for N=17, and it found 457409613979 solutions (If you want it formatted, it's: 457,409,613,979) https://github.com/hidny/MikeTsCubeCode/blob/main/src/OutputReader/GetNumSolutionsFromCubeCountOutputForNCubes.java
Unfortunately, the README doesn't really explain how/why the algorithm works in great detail. I still have to work on the arguments/documentation for why the algorithm works. I also think there's a better way to get the # of solutions. In the next month or two, I'm hoping to clean up the docs for this algorithm, and maybe have a better/faster algorithm.
I think it'd be quite cool to get the Rust version by @dzamlo into this tree somehow :) It really is amazingly fast.
Also, I've created a PR that makes it even faster! https://gitlab.com/dzamlo/polycubes2/-/merge_requests/1
Wow. I just started looking at @dzamlo's Rust version and agree it's extremely fast. Would love to see it integrated into this repo somehow. It's ~3x faster than the current DFS solution in this repo, and around 4.5x with @datdenkikniet's PR, at least when running N=13.
I did some more optimisations on top of @datdenkikniet's patch: reusing already-allocated structures (a small win), and swapping the Vec3D<bool>s for a separate Vec3dBool that stores a fixed array of 64x64 u64s to pack the bools. This minimises the memory footprint while also minimising the cost of computing array offsets, since the multiplies become shifts.
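A rough sketch of what that packing could look like (the actual Vec3dBool in the MR may differ in layout and naming): one u64 word per (y, z) pair, with x selecting a bit inside the word, so with power-of-two dimensions every index computation is a shift or mask rather than a multiply.

```rust
/// Bit-packed 64x64x64 boolean grid: one u64 word per (y, z) pair,
/// with the x coordinate selecting a bit inside the word.
struct Vec3dBool {
    words: Box<[[u64; 64]; 64]>,
}

impl Vec3dBool {
    fn new() -> Self {
        Vec3dBool { words: Box::new([[0u64; 64]; 64]) }
    }

    fn set(&mut self, x: usize, y: usize, z: usize) {
        // Power-of-two dimensions: addressing is shifts/masks, not multiplies.
        self.words[y][z] |= 1u64 << x;
    }

    fn get(&self, x: usize, y: usize, z: usize) -> bool {
        (self.words[y][z] >> x) & 1 != 0
    }
}

fn main() {
    let mut grid = Vec3dBool::new();
    grid.set(5, 10, 20);
    assert!(grid.get(5, 10, 20));
    assert!(!grid.get(6, 10, 20));
}
```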
This took the runtime for N=13 down further, from 22s to 17.5s on my machine. I'm trying to run N=17, which I think should finish on my machine within a day, to see if I broke anything at larger sizes.
Nice! Any chance you could upload your patch somewhere @NailLegProcessorDivide? Would love to check it out!
Also: you're running an R9 7950X, right?
An R9 7900X, so "only" 12c/24t. I'll see if I can set up a GitLab account to push to. The code is a bit nasty, and I don't like that it hardcodes the structure size, but making it variable while keeping the speed might be hard. Also, making the Vec3dBool "wrap" on itself would, I think, be safe, but even ANDing every coordinate with 0x1f made the code ~2x slower.
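For context, "ANDing every coordinate" means wrapping accesses along these lines (0x1f masks into a 32-wide grid); the function name and layout here are illustrative, and the report above is that the three extra ANDs per lookup in the hot loop cost roughly 2x:

```rust
// Wrapping access into a 32x32x32 bit-packed grid: each coordinate is
// masked with 0x1f so out-of-range indices wrap around instead of
// panicking. The three ANDs per lookup are the reported slowdown.
fn get_wrapped(words: &[[u64; 32]; 32], x: usize, y: usize, z: usize) -> bool {
    (words[y & 0x1f][z & 0x1f] >> (x & 0x1f)) & 1 != 0
}

fn main() {
    let mut words = [[0u64; 32]; 32];
    words[1][2] = 1u64 << 3;
    assert!(get_wrapped(&words, 3, 1, 2));
    // x = 35 wraps to 3, y = 33 wraps to 1, z = 34 wraps to 2.
    assert!(get_wrapped(&words, 35, 33, 34));
}
```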
https://gitlab.com/dzamlo/polycubes2/-/merge_requests/2 (hopefully that link works)
I forgot to mention there was quite a bit of performance to gain from unsafely accessing the Vec3dBool as well. Making it 32x32x32 was a fair bit faster, but that's too small for N>14.
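The unsafe gain presumably comes from skipping bounds checks, roughly as below (a sketch only; the caller must guarantee the coordinates are in range, and the real code may differ):

```rust
// Bounds-check-free lookup into a 32x32x32 bit-packed grid.
// Safety: the caller must ensure x, y, z are all < 32.
unsafe fn get_unchecked_bit(words: &[[u64; 32]; 32], x: usize, y: usize, z: usize) -> bool {
    (*words.get_unchecked(y).get_unchecked(z) >> x) & 1 != 0
}

fn main() {
    let mut words = [[0u64; 32]; 32];
    words[4][5] = 1u64 << 6;
    // Safe here because all coordinates are within the 32-cube.
    unsafe {
        assert!(get_unchecked_bit(&words, 6, 4, 5));
        assert!(!get_unchecked_bit(&words, 7, 4, 5));
    }
}
```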
number of polycubes for n=17: 457409613979, 74301822 ms
real 1238m21.828s
user 28524m59.824s
sys 9m41.177s
Finally: N=17 in a bit over 20.5 hours, getting the same number as @hidny. A small amount of time could be saved with better work distribution across threads, and any form of feedback beyond 100% CPU utilisation for 19.5 hours would be nice to have.
I'm chipping away at getting it to compile on stable #3, and even more(!!!) performance by @NailLegProcessorDivide #2.
@dzamlo I would also like to inquire if you plan to slap an explicit license on the source so that it is clearer if/how we may incorporate it here in the future. Currently it's "unlicensed" which, AFAIK, means that all rights on the code are reserved to the original author (you).
It's a derivative work of the code from @hidny, so I cannot relicense my code unless @hidny's code has a permissive license that allows derivative works.
@hidny : could you relicense your work with a permissive licence like MIT?
I added the MIT license to the repo: https://github.com/hidny/MikeTsCubeCode/blob/main/LICENSE
I've made a fairly naive networked solution at https://gitlab.com/dzamlo/polycubes2/-/merge_requests/4. @dzamlo, if you could update your repo to MIT or another permissive licence as well, it would be nice to move the fastest solution into this main repo, assuming you are happy for that to happen.
I've updated the license. Sorry for the delay.
Thanks! no worries about the delay.
@hidny has come up with a solution where memory usage isn't the limitation. It also appears faster. You can find my implementation in https://gitlab.com/dzamlo/polycubes2
Unfortunately, I cannot make a pull request because my laptop died.