gmdsi / GMDSI_notebooks

python-based predictive groundwater modeling workflow examples
GNU General Public License v3.0

SVD in Tutorial 2.4.2 crashes kernel due to memory usage #95

Open JackElsey opened 9 months ago

JackElsey commented 9 months ago

I wasn't able to perform the SVD in Tutorial 2.4.2 without upgrading the RAM in my laptop from 8 GB to 16 GB. Prior to the upgrade, every time I ran the line s = la.qhalfx.s the kernel would crash. After the upgrade, it takes about a minute to complete, during which it uses 100% of all my CPU cores and peaks in memory usage at around 11 GB.

My laptop is a Framework 13 with an Intel i5-1240P processor running Ubuntu 22.04.3 LTS.

jtwhite79 commented 9 months ago

Under the hood it's just standard linalg.svd, and I think that's a pretty small matrix... What size is that la.qhalfx matrix?

JackElsey commented 9 months ago

output from print(f'number of observations: {pst.nnz_obs} \nnumber of parameters: {pst.npar_adj}') is:

number of observations: 72 
number of parameters: 245

and the output from print(la.jco.shape) is:

(21248, 245)

The matrix la.qhalfx is also 21248 by 245. Is it performing SVD on a matrix with 21k rows? I don't see anything in the documentation for pyemu.ErrVar about restricting the rows for SVD to nonzero-weighted observations. Perhaps that can be configured some other way?

The rest of the notebook runs without any issues.

Here's what system resource requirements look like for the SVD:

[image: system resource usage during the SVD]
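A rough estimate helps explain why a 21248 by 245 matrix can be expensive to decompose. The sketch below is a back-of-the-envelope calculation, not a measurement of pyemu itself. It assumes float64 storage and that the SVD is requested with NumPy's default `full_matrices=True`, which returns a square U with one row and column per matrix row. The helper `svd_output_bytes` is hypothetical and counts only the output arrays, so it gives a lower bound that ignores LAPACK workspace and temporary copies.

```python
# Back-of-the-envelope memory estimate for an SVD of a tall matrix.
# Assumptions: float64 storage; numpy.linalg.svd-style outputs U, s, Vt.

BYTES_PER_FLOAT64 = 8


def svd_output_bytes(n_rows, n_cols, full_matrices=True):
    """Bytes needed to hold U, s and Vt (workspace is not included)."""
    k = min(n_rows, n_cols)
    u_cols = n_rows if full_matrices else k    # U is n_rows x u_cols
    vt_rows = n_cols if full_matrices else k   # Vt is vt_rows x n_cols
    n_values = n_rows * u_cols + k + vt_rows * n_cols
    return n_values * BYTES_PER_FLOAT64


n_rows, n_cols = 21248, 245  # shape reported for la.jco and la.qhalfx

full_gb = svd_output_bytes(n_rows, n_cols, full_matrices=True) / 1e9
thin_gb = svd_output_bytes(n_rows, n_cols, full_matrices=False) / 1e9
gram_gb = svd_output_bytes(n_cols, n_cols, full_matrices=True) / 1e9

print(f"full SVD outputs, 21248 x 245: {full_gb:.2f} GB")
print(f"thin SVD outputs, 21248 x 245: {thin_gb:.3f} GB")
print(f"full SVD outputs, 245 x 245:   {gram_gb:.4f} GB")
```

Under these assumptions the square U alone is several gigabytes, while a reduced SVD or a 245 by 245 matrix needs only a small fraction of that.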

jtwhite79 commented 9 months ago

What's the shape of la.qhalfx?

JackElsey commented 9 months ago

The matrix la.qhalfx is also 21248 by 245.

jtwhite79 commented 9 months ago

Ok. I was hoping that qhalfx was nnz_obs X nadj_par. I just checked and we can swap to la.xtqx.s, and the results for the singular spectrum and truncation are the same. Let's plan on making this change...
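The relationship between the two spectra can be checked with plain NumPy. This is a minimal sketch on small random stand-in data, not the tutorial's matrices and not pyemu's code. It assumes a diagonal Q^(1/2) built from observation weights, so that X^T Q X equals (Q^(1/2) X)^T (Q^(1/2) X). The names `qhalfx` and `xtqx` simply mirror the attributes mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-ins for the 21248 observations and 245 parameters.
n_obs, n_par = 2000, 25
X = rng.standard_normal((n_obs, n_par))   # stand-in Jacobian
w = rng.uniform(0.5, 2.0, n_obs)          # stand-in observation weights
w[rng.random(n_obs) < 0.9] = 0.0          # most observations are zero-weighted

# Assumed diagonal Q^(1/2) = diag(w), applied row by row.
qhalfx = w[:, None] * X                   # tall matrix, n_obs x n_par
xtqx = qhalfx.T @ qhalfx                  # small matrix, n_par x n_par

# Singular values only; no U or Vt is formed.
s_tall = np.linalg.svd(qhalfx, compute_uv=False)
s_gram = np.linalg.svd(xtqx, compute_uv=False)

print("shapes:", qhalfx.shape, xtqx.shape)
print("s_gram equals s_tall squared:", np.allclose(s_gram, s_tall**2))
```

In this sketch the singular values of the small matrix are the squares of those of the tall one, so both have the same ordering and the same number of nonzero values, and only the n_par by n_par matrix has to be decomposed.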