CEED / Laghos

High-order Lagrangian Hydrodynamics Miniapp
http://ceed.exascaleproject.org/miniapps
BSD 2-Clause "Simplified" License

Rom randomized svd #123

Closed kevinhkhuynh closed 3 years ago

kevinhkhuynh commented 3 years ago

Integrate PR 52 in libROM with LaghosROM

kevinhkhuynh commented 3 years ago

srun -n 1 -p pdebug laghos -m data/cube01_hex.mesh -pt 211 -tf 0.1 -offline -writesol -romsvds

Total time: 1.27e+02 sec

srun -n 1 -p pdebug laghos -m data/cube01_hex.mesh -pt 211 -tf 0.1 -online -rdimx 6 -rdimv 46 -rdime 15 -sfacx 6 -sfacv 20 -sface 2 -soldiff

0: run/Sol_Position Rel. DIFF norm 0.0000748932
0: run/Sol_Velocity Rel. DIFF norm 0.0032269713
0: run/Sol_Energy Rel. DIFF norm 0.0030694990
Total time: 1.29e+02 sec

Let's compare randomized SVD to the static SVD above using the same number of dimensions.

srun -n 1 -p pdebug laghos -m data/cube01_hex.mesh -pt 211 -tf 0.1 -offline -writesol -romsvdrm

Total time: 1.39e+02 sec

We didn't reduce the dimension, so this should take longer: the randomization adds extra computation while the input to the SVD stays the same size.

srun -n 1 -p pdebug laghos -m data/cube01_hex.mesh -pt 211 -tf 0.1 -online -rdimx 6 -rdimv 46 -rdime 15 -sfacx 6 -sfacv 20 -sface 2 -soldiff

0: run/Sol_Position Rel. DIFF norm 0.0000748932
0: run/Sol_Velocity Rel. DIFF norm 0.0032269713
0: run/Sol_Energy Rel. DIFF norm 0.0030694990
Total time: 1.31e+02 sec

This is identical to StaticSVD. This makes sense since we didn't reduce the dimension.

The parallel run gives the same L2 relative difference norms as the serial run for this problem, along with the same singular values, basis, and right basis.

There are 242 samples. Let's try using a randomized subspace_dim of 50.

srun -n 1 -p pdebug laghos -m data/cube01_hex.mesh -pt 211 -tf 0.1 -offline -writesol -romsvdrm --rmsubspace_dim 50

Total time: 1.25e+02 sec

There doesn’t seem to be much speed-up: computing Q*A (a k x DOF matrix times a DOF x num_samples matrix) is slow in libROM, even though the input to the SVD is much smaller (k x num_samples).

srun -n 1 -p pdebug laghos -m data/cube01_hex.mesh -pt 211 -tf 0.1 -online -rdimx 6 -rdimv 40 -rdime 15 -sfacx 6 -sfacv 20 -sface 2 -soldiff

0: run/Sol_Position Rel. DIFF norm 0.0000738725
0: run/Sol_Velocity Rel. DIFF norm 0.0032556758
0: run/Sol_Energy Rel. DIFF norm 0.0030305856
Total time: 1.26e+02 sec
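
For readers unfamiliar with the randomized approach, the NumPy sketch below shows one standard randomized range-finder SVD, in which the `Q.T @ A` product plays the role of the Q*A projection discussed above. It is an illustration under assumptions, not libROM's implementation: the function name `randomized_svd`, the Gaussian test matrix, and the synthetic low-rank snapshot matrix are all hypothetical. Only the sample count (242) and the subspace dimension (50) are taken from the run above.

```python
import numpy as np


def randomized_svd(A, k, seed=0):
    """Approximate rank-k SVD of A (n_dof x n_samples) via random projection.

    Hypothetical helper for illustration; not libROM's API.
    """
    rng = np.random.default_rng(seed)
    n_dof, n_samples = A.shape
    # Gaussian test matrix (n_samples x k) used to sample the range of A.
    omega = rng.standard_normal((n_samples, k))
    # Orthonormal basis Q (n_dof x k) for the sampled range.
    Q, _ = np.linalg.qr(A @ omega)
    # Projected snapshots B (k x n_samples): the "Q*A" product in the thread.
    B = Q.T @ A
    # SVD of the small matrix instead of the full snapshot matrix.
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    # Lift the left singular vectors back to the full space (n_dof x k).
    return Q @ U_small, s, Vt


# Synthetic stand-in for a snapshot matrix: 2000 DOFs, 242 samples, rank 10.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 10)) @ rng.standard_normal((10, 242))

U, s, Vt = randomized_svd(A, k=50)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(U.shape, s.shape, Vt.shape, rel_err)
```

The projection step costs on the order of k x DOF x num_samples operations, so if that dense product is slow, the saving from the smaller SVD can be masked in the total run time.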

chldkdtn commented 3 years ago

@kevinhkhuynh I assume your "Total time" includes the time to simulate the Laghos problem plus the time to do the SVD, right? If so, could you measure only the SVD time for staticSVD and for randomizedSVD and report those? Also, "--rmsubspace_dim 50" specifies the k value in the randomized SVD, right? Since we have three different bases to compute (for position, velocity, and energy), we could set a different k value for each field, e.g., k=6 for position, k=40 for velocity, and k=15 for energy.

kevinhkhuynh commented 3 years ago

Yes, you are right.

Static

Elapsed time for sampling in the offline phase: 5.63e+00 sec

This includes the time both to take the samples and to construct the basis. Let me add another timer for the basis construction alone.

Randomized

srun -n 1 -p pdebug laghos -m data/cube01_hex.mesh -pt 211 -tf 0.1 -offline -writesol -romsvdrm -randdimx 6 -randdimv 46 -randdime 15

Elapsed time for sampling in the offline phase: 1.84e+00 sec

Online

0: run/Sol_Position Rel. DIFF norm 0.0004200262
0: run/Sol_Velocity Rel. DIFF norm 0.0230247076
0: run/Sol_Energy Rel. DIFF norm 0.0129698347
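
To put the accuracy cost in numbers, the short script below parses the `Rel. DIFF norm` lines from the static run (-rdimx 6 -rdimv 46 -rdime 15) and from the randomized run with per-field dimensions, then prints the error ratio for each field. The helper `parse_norms` is hypothetical, and the comparison assumes both online runs used the same reduced dimensions, which the thread does not state explicitly.

```python
import re

# Output copied from the static SVD online run earlier in the thread.
STATIC_LOG = """\
0: run/Sol_Position Rel. DIFF norm 0.0000748932
0: run/Sol_Velocity Rel. DIFF norm 0.0032269713
0: run/Sol_Energy Rel. DIFF norm 0.0030694990
"""

# Output copied from the randomized SVD run with per-field dimensions.
RANDOMIZED_LOG = """\
0: run/Sol_Position Rel. DIFF norm 0.0004200262
0: run/Sol_Velocity Rel. DIFF norm 0.0230247076
0: run/Sol_Energy Rel. DIFF norm 0.0129698347
"""

# Matches e.g. "0: run/Sol_Energy Rel. DIFF norm 0.0030694990".
NORM_LINE = re.compile(r"run/Sol_(\w+) Rel\. DIFF norm ([0-9.eE+-]+)")


def parse_norms(log):
    """Return {field: relative difference norm} parsed from -soldiff output."""
    return {field: float(value) for field, value in NORM_LINE.findall(log)}


static = parse_norms(STATIC_LOG)
randomized = parse_norms(RANDOMIZED_LOG)

# A ratio above 1 means the randomized basis gave a larger error for that field.
ratios = {field: randomized[field] / static[field] for field in static}
for field, ratio in ratios.items():
    print(f"{field}: {ratio:.2f}x")
```

With these numbers the randomized basis comes out roughly 4 to 7 times less accurate per field, which is the trade-off to weigh against the shorter offline time.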

kevinhkhuynh commented 3 years ago

@chldkdtn Same options as above. tf = 0.1

Static

Elapsed time for basis construction in the offline phase: 5.19e+00 sec

Randomized (-randdimx 6 -randdimv 46 -randdime 15)

Elapsed time for basis construction in the offline phase: 2.36e+00 sec

Speedup (staticSVD time / randomizedSVD time) = 2.19
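
The speedup figure is simply the ratio of the two reported timings. The snippet below redoes that arithmetic for the basis-construction timers and for the earlier sampling-phase timers. The dictionary and function names are illustrative only, and because the inputs are the rounded values printed in the thread, the last digit may differ slightly from the 2.19 quoted above.

```python
# Timings in seconds, copied from the thread (tf = 0.1).
sampling = {"static": 5.63, "randomized": 1.84}            # sampling + basis
basis_construction = {"static": 5.19, "randomized": 2.36}  # basis only


def speedup(timings):
    """Static time divided by randomized time."""
    return timings["static"] / timings["randomized"]


print(f"sampling phase speedup:     {speedup(sampling):.2f}")
print(f"basis construction speedup: {speedup(basis_construction):.2f}")
```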