alihaydaroglu / suite3d

Fast, accurate, volumetric cell detection. Developed for Light Beads Microscopy, usable for other volumetric 2P. In development

Memory usage keeps going up during motion correction #16

Closed oterocoronel closed 1 year ago

oterocoronel commented 1 year ago

I am trying to process an LBM recording with 15 planes, so I changed a few things in the data-loading code, but I don't think that should affect this.

This is from the log:

[2023-07-18 19:25:22][00] Loading Batch 34 of 58
[2023-07-18 19:25:22][01]    Batch 33 IO thread joined
[2023-07-18 19:25:22][03]          After IO thread joinTotal Used: 390.919 GB, Virtual Available: 074.914 GB, Virtual Used: 176.491 GB, Swap Used: 214.428 GB
[2023-07-18 19:25:22][01]    Subtracting min vals to enfore positivity
[2023-07-18 19:25:24][03]          After Sharr creation:Total Used: 393.037 GB, Virtual Available: 072.797 GB, Virtual Used: 178.609 GB, Swap Used: 214.428 GB
[2023-07-18 19:25:24][01]    Launching IO thread for next batch
[2023-07-18 19:25:24][20]                                                                [Thread] Loading batch 34 

[2023-07-18 19:25:24][03]          After IO thread launch:Total Used: 393.037 GB, Virtual Available: 072.797 GB, Virtual Used: 178.609 GB, Swap Used: 214.428 GB
[2023-07-18 19:25:24][02]       Loading /home/freiwald/Data/analysis_2pRAM/Dali/20230620d/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW_00001_00033.tif
[2023-07-18 19:25:26][01]    Registering Batch 33
[2023-07-18 19:25:26][03]          Before Reg:         Total Used: 393.428 GB, Virtual Available: 072.406 GB, Virtual Used: 179.000 GB, Swap Used: 214.428 GB
[2023-07-18 19:25:26][01]    
[2023-07-18 19:25:26][02]       Registering plane 0
[2023-07-18 19:25:30][02]       Registering plane 1
[2023-07-18 19:25:46][01]    Loaded 1 files, total 2.12 GB
[2023-07-18 19:25:46][20]                                                                [Thread] Loaded batch 34 

[2023-07-18 19:25:46][20]                                                                [Thread] Thread for batch 34 ready to join 

[2023-07-18 19:25:46][02]       Registering plane 2
[2023-07-18 19:26:04][02]       Registering plane 3
[2023-07-18 19:26:24][02]       Registering plane 4
[2023-07-18 19:26:39][02]       Registering plane 5
[2023-07-18 19:27:02][02]       Registering plane 6
[2023-07-18 19:27:17][02]       Registering plane 7
[2023-07-18 19:27:20][02]       Registering plane 8
[2023-07-18 19:27:23][02]       Registering plane 9
[2023-07-18 19:27:26][02]       Registering plane 10
[2023-07-18 19:27:29][02]       Registering plane 11
[2023-07-18 19:27:32][02]       Registering plane 12
[2023-07-18 19:27:34][02]       Registering plane 13
[2023-07-18 19:27:37][02]       Registering plane 14
[2023-07-18 19:27:40][02]       Saving registered file of shape (15, 100, 868, 876) to /home/freiwald/Data/analysis_2pRAM/Dali/20230620d/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW/s3d-Coconut-Demo/registered_data/reg_data0033.npy
[2023-07-18 19:27:49][03]          After reg:          Total Used: 402.238 GB, Virtual Available: 082.514 GB, Virtual Used: 168.892 GB, Swap Used: 233.346 GB
[2023-07-18 19:27:49][03]          After close + unlink shmem:Total Used: 402.242 GB, Virtual Available: 082.510 GB, Virtual Used: 168.896 GB, Swap Used: 233.346 GB
[2023-07-18 19:27:49][02]       Garbage collected 2005 items
[2023-07-18 19:27:49][03]          After gc collect:   Total Used: 402.377 GB, Virtual Available: 082.375 GB, Virtual Used: 169.031 GB, Swap Used: 233.346 GB
[2023-07-18 19:27:49][03]          Start Batch:        Total Used: 402.377 GB, Virtual Available: 082.375 GB, Virtual Used: 169.031 GB, Swap Used: 233.346 GB
[2023-07-18 19:27:49][00] Loading Batch 35 of 58
[2023-07-18 19:27:49][01]    Batch 34 IO thread joined
[2023-07-18 19:27:49][03]          After IO thread joinTotal Used: 402.377 GB, Virtual Available: 082.375 GB, Virtual Used: 169.031 GB, Swap Used: 233.346 GB
[2023-07-18 19:27:49][01]    Subtracting min vals to enfore positivity
[2023-07-18 19:27:51][03]          After Sharr creation:Total Used: 404.509 GB, Virtual Available: 080.243 GB, Virtual Used: 171.163 GB, Swap Used: 233.346 GB
[2023-07-18 19:27:51][01]    Launching IO thread for next batch
[2023-07-18 19:27:51][20]                                                                [Thread] Loading batch 35 

[2023-07-18 19:27:51][03]          After IO thread launch:Total Used: 404.509 GB, Virtual Available: 080.243 GB, Virtual Used: 171.163 GB, Swap Used: 233.346 GB
[2023-07-18 19:27:51][02]       Loading /home/freiwald/Data/analysis_2pRAM/Dali/20230620d/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW_00001_00034.tif
[2023-07-18 19:27:53][01]    Registering Batch 34
[2023-07-18 19:27:53][03]          Before Reg:         Total Used: 404.889 GB, Virtual Available: 079.863 GB, Virtual Used: 171.543 GB, Swap Used: 233.346 GB
[2023-07-18 19:27:53][01]    
[2023-07-18 19:27:53][02]       Registering plane 0
[2023-07-18 19:27:57][02]       Registering plane 1
[2023-07-18 19:28:02][02]       Registering plane 2
[2023-07-18 19:28:04][01]    Loaded 1 files, total 2.12 GB
[2023-07-18 19:28:04][20]                                                                [Thread] Loaded batch 35 

[2023-07-18 19:28:04][20]                                                                [Thread] Thread for batch 35 ready to join 

[2023-07-18 19:28:26][02]       Registering plane 3
[2023-07-18 19:28:33][02]       Registering plane 4
[2023-07-18 19:28:36][02]       Registering plane 5
[2023-07-18 19:28:39][02]       Registering plane 6
[2023-07-18 19:28:42][02]       Registering plane 7
[2023-07-18 19:28:45][02]       Registering plane 8
[2023-07-18 19:28:48][02]       Registering plane 9
[2023-07-18 19:28:50][02]       Registering plane 10
[2023-07-18 19:28:53][02]       Registering plane 11
[2023-07-18 19:28:56][02]       Registering plane 12
[2023-07-18 19:28:59][02]       Registering plane 13
[2023-07-18 19:29:01][02]       Registering plane 14
[2023-07-18 19:29:04][02]       Saving registered file of shape (15, 100, 868, 876) to /home/freiwald/Data/analysis_2pRAM/Dali/20230620d/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW/s3d-Coconut-Demo/registered_data/reg_data0034.npy
[2023-07-18 19:29:12][03]          After reg:          Total Used: 409.086 GB, Virtual Available: 075.711 GB, Virtual Used: 175.695 GB, Swap Used: 233.391 GB
[2023-07-18 19:29:12][03]          After close + unlink shmem:Total Used: 409.086 GB, Virtual Available: 075.711 GB, Virtual Used: 175.695 GB, Swap Used: 233.391 GB
[2023-07-18 19:29:12][02]       Garbage collected 1972 items
[2023-07-18 19:29:12][03]          After gc collect:   Total Used: 409.225 GB, Virtual Available: 075.572 GB, Virtual Used: 175.834 GB, Swap Used: 233.391 GB
[2023-07-18 19:29:12][03]          Start Batch:        Total Used: 409.225 GB, Virtual Available: 075.572 GB, Virtual Used: 175.834 GB, Swap Used: 233.391 GB
[2023-07-18 19:29:12][00] Loading Batch 36 of 58
[2023-07-18 19:29:12][01]    Batch 35 IO thread joined
[2023-07-18 19:29:12][03]          After IO thread joinTotal Used: 409.225 GB, Virtual Available: 075.572 GB, Virtual Used: 175.834 GB, Swap Used: 233.391 GB
[2023-07-18 19:29:12][01]    Subtracting min vals to enfore positivity
[2023-07-18 19:29:14][03]          After Sharr creation:Total Used: 411.358 GB, Virtual Available: 073.439 GB, Virtual Used: 177.967 GB, Swap Used: 233.391 GB
[2023-07-18 19:29:14][01]    Launching IO thread for next batch
[2023-07-18 19:29:14][20]                                                                [Thread] Loading batch 36 

[2023-07-18 19:29:14][03]          After IO thread launch:Total Used: 411.358 GB, Virtual Available: 073.439 GB, Virtual Used: 177.967 GB, Swap Used: 233.391 GB
[2023-07-18 19:29:14][02]       Loading /home/freiwald/Data/analysis_2pRAM/Dali/20230620d/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW_00001_00035.tif
[2023-07-18 19:29:15][01]    Registering Batch 35
[2023-07-18 19:29:15][03]          Before Reg:         Total Used: 411.740 GB, Virtual Available: 073.057 GB, Virtual Used: 178.349 GB, Swap Used: 233.391 GB
[2023-07-18 19:29:15][01]    
[2023-07-18 19:29:15][02]       Registering plane 0
[2023-07-18 19:29:50][02]       Registering plane 1
[2023-07-18 19:29:51][01]    Loaded 1 files, total 2.12 GB
[2023-07-18 19:29:51][20]                                                                [Thread] Loaded batch 36 

[2023-07-18 19:29:51][20]                                                                [Thread] Thread for batch 36 ready to join 

[2023-07-18 19:30:11][02]       Registering plane 2
[2023-07-18 19:30:35][02]       Registering plane 3
[2023-07-18 19:30:59][02]       Registering plane 4
[2023-07-18 19:31:42][02]       Registering plane 5
[2023-07-18 19:31:49][02]       Registering plane 6
[2023-07-18 19:31:54][02]       Registering plane 7
[2023-07-18 19:31:57][02]       Registering plane 8
[2023-07-18 19:32:03][02]       Registering plane 9
[2023-07-18 19:32:06][02]       Registering plane 10
[2023-07-18 19:32:10][02]       Registering plane 11
[2023-07-18 19:32:13][02]       Registering plane 12
[2023-07-18 19:32:17][02]       Registering plane 13
[2023-07-18 19:32:20][02]       Registering plane 14
[2023-07-18 19:32:24][02]       Saving registered file of shape (15, 100, 868, 876) to /home/freiwald/Data/analysis_2pRAM/Dali/20230620d/154951tUTC_Max15_depth400um_fov2628x2600um_res3p00x3p00umpx_fr04p482Hz_pow299p9mW/s3d-Coconut-Demo/registered_data/reg_data0035.npy
[2023-07-18 19:32:32][03]          After reg:          Total Used: 417.592 GB, Virtual Available: 072.231 GB, Virtual Used: 179.174 GB, Swap Used: 238.418 GB
[2023-07-18 19:32:32][03]          After close + unlink shmem:Total Used: 417.592 GB, Virtual Available: 072.231 GB, Virtual Used: 179.174 GB, Swap Used: 238.418 GB
[2023-07-18 19:32:33][02]       Garbage collected 2005 items
[2023-07-18 19:32:33][03]          After gc collect:   Total Used: 417.732 GB, Virtual Available: 072.091 GB, Virtual Used: 179.314 GB, Swap Used: 238.418 GB
[2023-07-18 19:32:33][03]          Start Batch:        Total Used: 417.732 GB, Virtual Available: 072.091 GB, Virtual Used: 179.314 GB, Swap Used: 238.418 GB
[2023-07-18 19:32:33][00] Loading Batch 37 of 58

oterocoronel commented 1 year ago

Eventually, the run shown above crashed because it ran out of memory (OOM). However, after restarting the kernel, the issue went away and it is now using little RAM... Maybe it was just due to a buggy initialization? It's probably safe to close this issue.
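
To put a number on the growth in the log above, the "Total Used" figure on each "Start Batch" line can be compared across batches. The following is a minimal sketch that assumes only the log line format shown in this thread; the helper names `total_used_at` and `growth_per_batch` are made up for illustration and are not part of suite3d.

```python
import re

# Matches memory-report lines such as
# "[2023-07-18 19:27:49][03]   Start Batch:   Total Used: 402.377 GB, ..."
# The label is the text between the "[03]" level tag and "Total Used:".
LINE_RE = re.compile(
    r"\]\s*(?P<label>[^\[\]]*?):?\s*Total Used:\s*(?P<gb>\d+\.\d+) GB"
)


def total_used_at(log_text, label="Start Batch"):
    """Return the 'Total Used' values (in GB) of every report line with this label."""
    values = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group("label").strip() == label:
            values.append(float(match.group("gb")))
    return values


def growth_per_batch(values):
    """Return the difference between consecutive readings, rounded to 1 MB."""
    return [round(later - earlier, 3) for earlier, later in zip(values, values[1:])]


# The three "Start Batch" lines copied from the log in this issue.
sample = """\
[2023-07-18 19:27:49][03]          Start Batch:        Total Used: 402.377 GB, Virtual Available: 082.375 GB, Virtual Used: 169.031 GB, Swap Used: 233.346 GB
[2023-07-18 19:29:12][03]          Start Batch:        Total Used: 409.225 GB, Virtual Available: 075.572 GB, Virtual Used: 175.834 GB, Swap Used: 233.391 GB
[2023-07-18 19:32:33][03]          Start Batch:        Total Used: 417.732 GB, Virtual Available: 072.091 GB, Virtual Used: 179.314 GB, Swap Used: 238.418 GB
"""

readings = total_used_at(sample)
print(readings)
print(growth_per_batch(readings))
```

For these three lines the growth is roughly 6.8 GB and 8.5 GB per batch, several times the 2.12 GB that the log reports loading for each batch.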

alihaydaroglu commented 1 year ago

Thanks for sharing the log as well. This commit should have fixed this issue: https://github.com/alihaydaroglu/suite2p/commit/874df9fc88698ed0f61326369a72d2e58e315dfa

If you have already updated and are still running into it, let me know; if not, I'll close the issue for now.

alihaydaroglu commented 1 year ago

FYI, if your registration step fails, you can restart at a specific batch using the 'start_batch_idx' param on job.register, so you don't have to re-register all the batches that completed before the crash.
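
One way to work out where to restart is to look at which registered files already exist. This is only a sketch: it assumes the `reg_dataNNNN.npy` naming seen in the log above, and the helper `last_saved_batch_idx` is hypothetical, not part of suite3d. Whether `start_batch_idx` should be that index or the next one is not confirmed in this thread; it depends on how the parameter is indexed and on whether the last file was fully written before the crash.

```python
import re
from pathlib import Path

# File names in the log look like "reg_data0033.npy".
REG_FILE_RE = re.compile(r"reg_data(\d+)\.npy$")


def last_saved_batch_idx(reg_dir):
    """Return the highest index among reg_dataNNNN.npy files in reg_dir, or None."""
    indices = [
        int(match.group(1))
        for path in Path(reg_dir).iterdir()
        if (match := REG_FILE_RE.search(path.name))
    ]
    return max(indices) if indices else None


# Hypothetical usage (the keyword-argument form is assumed from the comment above):
# last_idx = last_saved_batch_idx(".../registered_data")
# job.register(start_batch_idx=last_idx)  # re-registers the last, possibly partial, batch
```

Restarting from the last saved index rather than the one after it redoes a single batch, which is the cautious choice if the crash may have interrupted a save.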

alihaydaroglu commented 1 year ago

I'm reopening this because it seems like you were already on the latest version, as confirmed by the print statements. It seems like the two lines shmem_mov.close(); shmem_mov.unlink() didn't work as intended in this case. The expected behaviour is that registration uses a constant amount of memory throughout; it shouldn't accumulate.
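
For reference, here is a minimal sketch of the close-then-unlink pattern using Python's standard `multiprocessing.shared_memory` module. It assumes `shmem_mov` is a `SharedMemory` object of this kind, which this thread does not confirm, and `process_batch` is a made-up stand-in rather than suite3d code.

```python
from multiprocessing import shared_memory


def process_batch(n_bytes):
    """Create a shared block, use it, then release it; returns (name, checksum)."""
    shm = shared_memory.SharedMemory(create=True, size=n_bytes)
    name = shm.name
    try:
        # Stand-in for the real work done on the shared buffer.
        shm.buf[:4] = b"\x01\x02\x03\x04"
        checksum = sum(bytes(shm.buf[:4]))
    finally:
        # close() releases this process's mapping of the block;
        # unlink() asks the OS to destroy the block itself.
        # Running both in "finally" frees the block even if the work above raises.
        shm.close()
        shm.unlink()
    return name, checksum


name, checksum = process_batch(1024)
print(name, checksum)
```

In this pattern, any object that still exports the buffer (for example a NumPy array built on `shm.buf`) has to be dropped before `close()`, otherwise `close()` raises `BufferError`; and a block that is closed but never unlinked can stay allocated after the process moves on. Either could make memory accumulate across batches, though nothing in this thread establishes that this is what happened here.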

It doesn't happen for me, and it seems not to have happened after re-running it, but please let me know if you encounter the issue again.

alihaydaroglu commented 1 year ago

The issue with RAM going up seems fixed, but other memory considerations are continued in #39, so I'm closing this one.