Macaulay2 / M2

The primary source code repository for Macaulay2, a system for computing in commutative algebra, algebraic geometry and related fields.
https://macaulay2.com

Memory intensive module #3027

Open mahrud opened 9 months ago

mahrud commented 9 months ago

I have a module which was computed very painstakingly, so immediately afterwards I stored the module externally in a file that's about 95MB. The presentation is a 2055x15981 matrix over a ring in 16 variables over GF(4).

However, when trying to load the module again with M = value get "M.m2", M2 seems to require more memory than the system has available! (At least 6GB, but it doesn't seem to be stopping)

I tried to Ctrl-C out of loading and this was the backtrace:

-* stack trace, pid: 534459
 0# stack_trace(std::ostream&, bool) at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/bin/main.cpp:123
 1# interrupt_handler at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/bin/main.cpp:282
 2# 0x00007F8523C5FBB0 in /lib64/libc.so.6
 3# binding_lookup_1 at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/binding.d:433
 4# lookup at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/binding.d:435
 5# binding_bind at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/binding.d:725
 6# binding_bind at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/binding.d:724
...
9065# binding_bind at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/binding.d:724
9066# binding_localBind at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/binding.d:815
9067# readeval3(parse_TokenFile_struct*, char, parse_DictionaryClosure_struct*, char, char, char) at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/interp.dd:271
9068# readeval(parse_TokenFile_struct*, char, char) at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/interp.dd:284
9069# value(tagged_union*) at /tmp/macaulay2-20231103-41331-9ughdm/M2-release-1.22/M2/Macaulay2/d/interp.dd:509

(I deleted roughly 9000 repeated lines in the middle.) What gives? Is this a bug, or is there a better way to load a module?

@mikestillman have you ever run into this sort of issue?

mahrud commented 9 months ago

Sidenote: perhaps a function zget for reading a gzipped file (and a corresponding function for writing gzipped data) would be useful for compressing the text-based data that some packages store as large examples.
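As a rough sketch of what such a pair of functions might look like, the following relies on Macaulay2 treating a file name that begins with "!" as a shell command. The names zget and zput are hypothetical (neither exists in Macaulay2), and the sketch assumes a gzip executable is on the PATH:

```macaulay2
-- Hypothetical helpers: neither zget nor zput exists in Macaulay2 itself.
-- Assumes a `gzip` executable on the PATH and a file name without spaces
-- or shell metacharacters.

-- write a string to a gzipped file by piping it through gzip
zput = (filename, str) -> (
    f := openOut("!gzip -c > " | filename);
    f << str << close;
    )

-- read a gzipped file back as a string by piping it through gzip -d
zget = filename -> get("!gzip -dc " | filename)
```

With these, zput("M.m2.gz", toExternalString M) would write the compressed text and value zget "M.m2.gz" would read it back. This only saves disk space; the string still has to be parsed by value, so it would not be expected to reduce the memory used when loading.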

mahrud commented 9 months ago

I was able to load the module by parsing the presentation matrix row by row:

elapsedTime M = coker matrix(value \ select("{[^{}]*}", get "mat.m2"));

However, 9GB of storage and 3min later, my system has no memory left to do anything else with it! Is there a less memory-intensive way to construct a matrix?
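To make the row-by-row parse above concrete, here is a toy sketch on a 2x2 matrix. The ring R, the string s, and the names rows and m are made up for illustration; the regular expression is the one from the one-liner above, which matches each innermost {...} group, i.e. one row of the matrix at a time:

```macaulay2
-- Toy illustration only: the ring and the matrix string are made up.
R = ZZ/2[x,y];
s = "matrix {{x, y}, {x+y, x*y}}";

-- select(regex, string) returns the list of matching substrings;
-- [^{}]* forbids nested braces, so each match is a single row like "{x, y}"
rows = select("{[^{}]*}", s);

-- evaluate each row string separately, then assemble the matrix
m = matrix(value \ rows);
```

Each call to value then only has to parse one short row rather than one deeply nested expression for the whole file.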

mikestillman commented 9 months ago

@mahrud There is a branch (mikestillman M2, branch read-msolve) that has an engine function for quickly reading msolve-format matrix files (and Martin Helmer's package for interfacing with msolve). The idea is for it to be able to read in matrices from M2 (i.e. not just ones created by msolve). It somewhat works, but has no real error checking yet. It currently only works over finite prime fields, and maybe only for one-row matrices. Your example sounds like it would be a good benchmark case for reading in. Can you share that matrix with me?

mikestillman commented 9 months ago

@mahrud How sparse is your matrix? The code in that branch (read-msolve) actually doesn't handle sparse matrices either. It seems like we should make that function handle sparse matrices, with entries in GF(q), and also ZZ/QQ. We should also have an easy function to write the matrix to disk in the same format.

mahrud commented 9 months ago

I'll email it to you.

mahrud commented 1 month ago

Mike, I suspect the problem doesn't have to do with the actual reading of the file, but rather turning it into a matrix. Do you know why loading a large matrix from memory would require more memory than when it was defined?