dyang37 closed this pull request 2 years ago.
If you are testing on a Purdue cluster, please use the following answers when prompted for the cluster options (and change yang1467 to your username on the cluster):
- Brown:
Config file path not provided. Please provide the correct answers to the following questions, so that we could set up the multinode computation:
Please enter the type of job queuing system in your cluster. One of 'SGE' (Sun Grid Engine) and 'SLURM'. SLURM
Please enter the number of physical cores in a node. [Default = 16] 12
Please enter the number of nodes for parallel computation. [Default = 1] 4
Please enter the maximum allowable walltime.This should be a string in the form D-HH:MM:SS. E.g., '0-01:00:00' for one hour. 0-01:00:00
Please enter the maximum memory per node. [Default = 16GB] E.g. '100MB' or '16GB'. If None, the scheduler will allocate a system-determined amount per node. 64GB
Please enter any additional arguments to pass to the job scheduling system. [Default = ""] Consult your local documentation or system administrator. -A bouman -N 1
Please enter a desired local directory for file spilling in parallel computation. [Default = "./"] Recommend to set it to a location of fast local storage like /scratch or $TMPDIR. /scratch/brown/yang1467/
Please enter a desired directory to store Dask's job scheduler logs. [Default = "./"] For each reserved node, there will be two different log files, error log and output log. Users can check those log files to find the information printed from the parallel functions. /scratch/brown/yang1467/
Cluster config file saved at ./configs/multinode/default.yaml. Press Enter to continue:
- Gilbreth:
Config file path not provided. Please provide the correct answers to the following questions, so that we could set up the multinode computation:
Please enter the type of job queuing system in your cluster. One of 'SGE' (Sun Grid Engine) and 'SLURM'. SLURM
Please enter the number of physical cores in a node. [Default = 16] 12
Please enter the number of nodes for parallel computation. [Default = 1] 4
Please enter the maximum allowable walltime.This should be a string in the form D-HH:MM:SS. E.g., '0-01:00:00' for one hour. 0-01:00:00
Please enter the maximum memory per node. [Default = 16GB] E.g. '100MB' or '16GB'. If None, the scheduler will allocate a system-determined amount per node. 128GB
Please enter any additional arguments to pass to the job scheduling system. [Default = ""] Consult your local documentation or system administrator. -A bouman -N 1 --gpus-per-node=1
Please enter a desired local directory for file spilling in parallel computation. [Default = "./"] Recommend to set it to a location of fast local storage like /scratch or $TMPDIR. /scratch/gilbreth/yang1467/
Please enter a desired directory to store Dask's job scheduler logs. [Default = "./"] For each reserved node, there will be two different log files, error log and output log. Users can check those log files to find the information printed from the parallel functions. /scratch/gilbreth/yang1467/
Cluster config file saved at ./configs/multinode/default.yaml. Press Enter to continue:
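For reference, here is a minimal sketch of roughly what these answers correspond to if you were to set up the Dask SLURM cluster by hand with dask_jobqueue (values mirror the Brown answers above; this is only an illustration, not mbircone's internal implementation, and older dask_jobqueue releases spell job_extra_directives as job_extra):

```python
# Rough dask_jobqueue equivalent of the Brown-cluster answers above.
# Illustrative sketch only, not mbircone's internal code.
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    cores=12,                                    # physical cores per node
    memory="64GB",                               # maximum memory per node
    walltime="0-01:00:00",                       # D-HH:MM:SS
    job_extra_directives=["-A bouman", "-N 1"],  # additional scheduler arguments
    local_directory="/scratch/brown/yang1467/",  # fast local storage for file spilling
    log_directory="/scratch/brown/yang1467/",    # per-node error/output logs
)
cluster.scale(jobs=4)  # request 4 nodes for parallel computation
client = Client(cluster)
```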
All demos work on the Brown cluster. Each MACE iteration takes about 1 minute to finish.
##############################
Initialized Model Successfully...
Multinode computation with Dask:
Performing MACE reconstruction ...
initializing MACE...
Computing qGGMRF reconstruction at all time points. This will be used as MACE initialization point.
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 293.876441 s
Done computing qGGMRF reconstruction. Elapsed time: 293.92 sec.
Save qGGMRF reconstruction to disk.
Begin MACE ADMM iterations:
Begin MACE iteration 0/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 46.020487 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.77 sec
Done MACE iteration 0/10. Elapsed time: 71.58 sec.
Begin MACE iteration 1/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 45.885041 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.38 sec
Done MACE iteration 1/10. Elapsed time: 72.08 sec.
Begin MACE iteration 2/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 46.611775 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.35 sec
Done MACE iteration 2/10. Elapsed time: 72.73 sec.
Begin MACE iteration 3/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 47.376159 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.29 sec
Done MACE iteration 3/10. Elapsed time: 73.86 sec.
Begin MACE iteration 4/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 46.363377 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.34 sec
Done MACE iteration 4/10. Elapsed time: 75.49 sec.
Begin MACE iteration 5/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 46.218158 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.31 sec
Done MACE iteration 5/10. Elapsed time: 75.91 sec.
Begin MACE iteration 6/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 46.088259 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.27 sec
Done MACE iteration 6/10. Elapsed time: 72.45 sec.
Begin MACE iteration 7/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 46.029578 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.38 sec
Done MACE iteration 7/10. Elapsed time: 73.17 sec.
Begin MACE iteration 8/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 33.570875 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.32 sec
Done MACE iteration 8/10. Elapsed time: 61.48 sec.
Begin MACE iteration 9/10:
Got 4 nodes out of 4 nodes in 0 s
Got 4 nodes, start parallel computation.
Parallel Elapsed time: 31.831774 s
Done forward model proximal map estimation.
Done denoising in all hyper-planes, elapsed time 20.38 sec
Done MACE iteration 9/10. Elapsed time: 61.44 sec.
Done MACE reconstruction!
Reconstruction shape = (8, 30, 121, 121)
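As a side note, the reported shape presumably corresponds to (time points, slices, rows, cols). A minimal sketch for inspecting one slice of such an array follows (the recon array below is a hypothetical placeholder, not the actual demo output):

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical placeholder with the shape reported above:
# (time points, slices, rows, cols) = (8, 30, 121, 121)
recon = np.zeros((8, 30, 121, 121), dtype=np.float32)

# Show the central slice at the first time point.
plt.imshow(recon[0, recon.shape[1] // 2], cmap="gray")
plt.title("MACE 4D recon: t = 0, central slice")
plt.colorbar()
plt.show()
```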
OK, I ran into a problem running demo/demo_mace3D_fast.py. I listed what went wrong below. It looks like it can't find skimage. This might be related to the fact that I had to modify demo/requirements.txt to scipy~=1.8; otherwise, the install scripts were not running properly.
(mbircone) bouman@Bouman-iMac-2020:~/Documents/GITHub/mbircone/demo (pr/mace4D)$ python demo_mace3D_fast.py
WARNING:tensorflow:From /usr/local/anaconda3/envs/mbircone/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:101: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Traceback (most recent call last):
File "demo_mace3D_fast.py", line 8, in <module>
import demo_utils, denoiser_utils
File "/Users/bouman/Documents/GITHub/mbircone/demo/denoiser_utils.py", line 7, in <module>
from skimage.filters import gaussian
ModuleNotFoundError: No module named 'skimage'
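If it helps with debugging, a quick sanity check (just a sketch) is to try the same import that denoiser_utils.py uses:

```python
# Verify that scikit-image (imported as `skimage`) is installed in the
# current environment; this is the same import used by demo/denoiser_utils.py.
try:
    from skimage.filters import gaussian  # noqa: F401
    print("scikit-image is available")
except ModuleNotFoundError:
    print("scikit-image is missing; try `pip install scikit-image`")
```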
Thank you for catching that! I think this is not because of the scipy version. I'll add scikit-image to requirements_demo.txt and test it with a clean install. Regards, Diyu
You should always run dev_scripts/clean_install_all.sh before testing the package. That way you know that you have the same environment as a new user would have.
Charlie
Added scikit-image to the demo requirements. Tested by running clean_install_all.sh and then demo_mace3D_fast.py. Everything works on my end.
It works, and the 3D images look great!
Fantastic!! I'll pour a glass of wine to celebrate getting MACE into master branch!
This is a pull request to merge mace3D and mace4D into master.
To test:
git checkout pr/mace4D
git pull
pip install -r requirements.txt
pip install .
cd demo
pip install -r requirements_demo.txt
rm -r ~/.cache/mbircone/
Both demos should take about 5-10 minutes on a cluster, and 15-20 minutes on a personal computer.
In addition, if you have access to the cluster and would like to run a higher-quality demo, you can run demo_mace3D.py and demo_mace4D.py (basically 3D/4D reconstructions with higher-resolution phantoms).