digraphs / Digraphs

The GAP package Digraphs
https://digraphs.github.io/Digraphs
Other
29 stars 42 forks source link

Investigate reducing the memory usage of the C module #433

Open wilfwilson opened 3 years ago

wilfwilson commented 3 years ago

...particularly in the Schreier-Sims implementation.

@ChrisJefferson reported in https://github.com/semigroups/Semigroups/issues/634#issuecomment-790661632 that when using the homomorphism finder for the first time, Digraphs allocates 1.25 GB of memory, and that moreover, this memory is kept around to be used for any subsequent runs. This is not really a problem in 64-bit, but it presents a potential problem in 32-bit mode.

This is my attempt at counting the big bits of memory that are assigned (thanks to Chris for helping):

This is over 1.1 GB already.

Allocating over 770 MB of memory for a stabiliser chain of a group on at most 512 points is a bit extravagant, especially since, I gather, the groups we'll come across in practice will probably be quite small, and on a much smaller number of points. A stabiliser chain could be computed and stored with a lot less memory, although it's unclear what the performance tradeoff would be. (Hence: investigate!).

We should at least decide whether we're happy with this situation, especially for 32-bit system. Note that the memory requirement is roughly cubic in MAXVERTS, so even a small reduction in MAXVERTS on 32-bit systems would lead to a large decrease in memory use. See #432.

james-d-mitchell commented 3 years ago

I'm happy for this to be investigated, and support anything sensible that preserves the performance of the homomorphism finding code and possibly reduces the memory requirement. I'm not personally all that worried about allocating a GB of memory, but I can see why some might be worried, and that this might be one cause of the failing CI for the Semigroups package. The key reason we have things arranged like this is for performance, previously (before you wrote the Schreier-Sims in C) we used GAP stabiliser chains and the code spent 98% of its time using those GAP stabiliser chains. The intention was also to avoid repeatedly malloc and freeing memory for the homomorphisms code, so that we can compute many homomorphisms of relatively small graphs without the time penalty for malloc and free. The approach used is probably rather simple minded, I'd be happy to consider another approach!

wilfwilson commented 3 years ago

I'm happy for us to continue to prioritise time performance, and not make any tradeoffs in that regard. As you say, and as Chris reassured me, there is no need to worry about the amount that we're allocating, in 64-bit.

Yes, it's a concern in 32-bit mode... but do we care? Perhaps this is a good opportunity to reflect on the 32-bit support of Digraphs. Not to abandon it now, while everything works pretty well, but perhaps to decide how much (if any) effort we want to spend maintaining it; and to decide how and when we might drop this support. (Perhaps we should document that 64-bit is recommended).

I'll think about ways of reducing the Schreier-Sims memory usage without reducing performance. Obviously I'll need to be able to check that any changes don't affect performance: do you feel that the tests (standard/extreme) are extensive enough that their timings can be used to decide whether performance has been changed, one way the another?

james-d-mitchell commented 3 years ago

I'll think about ways of reducing the Schreier-Sims memory usage without reducing performance. Obviously I'll need to be able to check that any changes don't affect performance: do you feel that the tests (standard/extreme) are extensive enough that their timings can be used to decide whether performance has been changed, one way the another?

Yes I think this should be noticeable in the tests, and I'm happy to see what you come up with. I think one major saving could be found by using an upper triangular array (if you see what I mean) rather than a square array to store orbits, transversals, etc in the Schreier-Sims implementation. This would complicate the arithmetic to access entries, but would cut the space required by approx. 50%. Unless I'm misremembering how this works....

james-d-mitchell commented 3 years ago

Yes, it's a concern in 32-bit mode... but do we care?

Probably not much, but at the same time, we could fairly easily use 128 for MAXVERTS on a 32-bit system, and then we'd more or less resolve this, no? Again, maybe this will fail horribly, but there's no reason in principle that it should!

wilfwilson commented 3 years ago

I think one major saving could be found by using an upper triangular array (if you see what I mean) rather than a square array to store [..] transversals

Yes this is where I'd start, so the transversals would together be about 256 MB instead of 512 MB.

james-d-mitchell commented 2 weeks ago

This will be resolved when #626 is merged!