txusser / simpet

SimPET is a framework intended to set up and launch PET imaging Monte Carlo simulations in a simple way. It uses popular tools such as SimSET and STIR.

RAM problems with Siemens scanners #22

Closed txusser closed 5 months ago

txusser commented 1 year ago

Thank you very much for the advice. I have followed the above procedure and now it is working. I was able to generate images for GE Discovery ST and am currently running it on Siemens mMR.

I have noticed an issue only when "randoms" are enabled (add_randoms: 1 in the yaml config). I get a segmentation fault for Siemens scanners on both the main and develop branches, as shown in the log below (GE Discovery ST works fine, as far as I remember).

Is there any special system requirement for Siemens scanners? I have allocated 4 cores and 32 GB of RAM to the VM, which runs Ubuntu 22.04.01. I can open a separate issue if it helps with the troubleshooting. :)

(spdev) simpet@simpet-vm:~/simpet-devel/simpet/scripts$ python experiment.py
/home/simpet/miniconda3/envs/spdev/lib/python3.9/site-packages/hydra/_internal/defaults_list.py:251: UserWarning: In 'config_test': Defaults list is missing `_self_`. See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/default_composition_order for more information
  warnings.warn(msg, UserWarning)
Starting simulation for division_0
WARNING: add_randoms=1, so simulation is forced to realistic noise
Importance sampling is also being deactivated
All these means the simulation can take very long...
Starting simulation for division_1
WARNING: add_randoms=1, so simulation is forced to realistic noise
Importance sampling is also being deactivated
All these means the simulation can take very long...
Segmentation fault (core dumped)
Process Process-2:
Segmentation fault (core dumped)
Process Process-1:

Originally posted by @prabathbr in https://github.com/txusser/simpet/issues/16#issuecomment-1692976527

prabathbr commented 1 year ago

Thank you very much for opening this issue. I will run it again and provide logs ASAP. Sorry for the delay!

prabathbr commented 1 year ago
  1. Can you send the end of the SimSET logs? They are located in the results dir and named simset_s0 and simset_s1.
  2. Can you use htop, btop or similar to check memory usage while running and report back?
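
For a text report instead of an htop screenshot, the same numbers can be read from /proc/meminfo on Linux. The following is only a minimal sketch: the helper names parse_meminfo, memory_summary and watch are hypothetical and not part of SimPET, and it assumes the usual "Key:   value kB" layout of that file.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_summary(text):
    """Return total RAM, available RAM and free swap in GiB."""
    info = parse_meminfo(text)
    kb_per_gib = 1024 * 1024
    return {
        "ram_total_gib": info.get("MemTotal", 0) / kb_per_gib,
        "ram_available_gib": info.get("MemAvailable", 0) / kb_per_gib,
        "swap_free_gib": info.get("SwapFree", 0) / kb_per_gib,
    }


def watch(samples=3, interval_s=5.0, path="/proc/meminfo"):
    """Print a memory summary a few times while the simulation runs (Linux only)."""
    for _ in range(samples):
        with open(path) as handle:
            print(memory_summary(handle.read()))
        time.sleep(interval_s)
```

Calling watch() in a second terminal while experiment.py is running would show whether available RAM collapses just before the crash.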

I have tested a simple "Siemens_mCT" simulation with the provided test image, using the following test.yaml (output_dir: "test_image_smsmmr" is just a typing mistake ;) ). This time I was testing on a different PC, with a VM of 16 cores and 128 GB of RAM.

defaults:
 - scanner: siemens_mct

simulation_environment: 0
sim_type: "SimSET"
do_simulation: 1
do_reconstruction: 1
divisions: 2
model_type: "cylindrical"
patient_dirname: "test_image"
act_map: "act.hdr"
att_map: "att.hdr"
output_dir: "test_image_smsmmr"
center_slice: 0
total_dose: 0.1
simulation_time: 30
sampling_photons: 0
photons: 0
add_randoms: 1 
phglistmode: 0

detlistmode: 1
maximumIteration: 1

I got the error right at the start (attached error.txt). I could only find simset_s0.log, at division_0/simset_s0.log. It ends at ***************** Simulation Beginning *************, unlike a normal log, which goes on to show progress. It therefore looks like phg crashes right at the start. I have also attached an htop screenshot of the same config but with add_randoms: 0, just for comparison.
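
The check described above (does the log stop at the "Simulation Beginning" banner?) can be automated when there are many divisions. This is a rough sketch: the function log_status is hypothetical, and the only thing taken from the thread is the banner text itself.

```python
BANNER = "Simulation Beginning"  # banner text quoted in this thread


def log_status(log_text, banner=BANNER):
    """Classify a SimSET log as 'no banner', 'stopped at banner' or 'progressed'."""
    # Ignore blank lines so trailing newlines do not hide the banner.
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    banner_rows = [i for i, line in enumerate(lines) if banner in line]
    if not banner_rows:
        return "no banner"
    if banner_rows[-1] == len(lines) - 1:
        # Nothing was written after the banner: phg likely died right away.
        return "stopped at banner"
    return "progressed"


# Usage (the path is an assumption based on the layout mentioned above):
# with open("division_0/simset_s0.log") as handle:
#     print(log_status(handle.read()))
```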

(spdev) simpet@simpet-vm:~/simpet-devel/simpet/scripts$ ls /home/simpet/simpet-devel/simpet/Results/test_image_smsmmr/SimSET_Sim_siemens_biograph_mct
division_0  postprocessing.log

htop screenshot with add_randoms: 1: [image: htop]

htop screenshot with add_randoms: 0 (phg running two processes, as expected): [image: htop_norandoms]

txusser commented 1 year ago

@YerePhy can you try to replicate this? I don't see outputs of any of the common errors.

YerePhy commented 1 year ago

Hi,

@txusser I've run a simulation+recon with Siemens mCT without any problem.

@prabathbr according to your error.txt you are not using the develop branch, or at least you are not using the latest version (you're running python experiment.py instead of python scripts/experiment.py). Could you delete the repo and reinstall it from the develop branch (following the instructions given there)? After that, try to run the simulation again and let's see what happens.

One additional remark: we have not tested SimPET in a VM environment. It would be nice to run it on a PC with Ubuntu installed (or at least with Ubuntu installed on a hard-drive partition). However, I guess there are technical or time limitations to trying this, so let's see if the above approach solves the problem.

prabathbr commented 1 year ago

Hi @YerePhy ,

Thanks a lot for the information. This error only happened when I enabled randoms. I can run simulation + recon without an issue when randoms are disabled.

I just noticed that the develop branch has been updated since my initial testing, so I will try the new version. I believe it should work when I set add_randoms: 1 in the new config.

YerePhy commented 1 year ago

Sorry, I didn't notice add_randoms=1, let me try again with randoms enabled. I'll let you know if I get the same error.

YerePhy commented 1 year ago

Hi @prabathbr,

I've finally reproduced the issue with add_randoms: 1. One solution that has worked for me is to halve the number of radial and angular bins, that is num_td_bins: 156, num_aa_bins: 156. Obviously, the radial and angular resolutions of the sinograms will halve as well.

As far as I know, the error Segmentation fault (core dumped) seems to be related to memory usage. My main hypothesis is that we don't have enough RAM, either on my PC (128 GB) or on yours.
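
To see why halving both bin counts helps under this hypothesis, here is a back-of-the-envelope sketch of how a 3D sinogram histogram scales. It is not a SimSET formula: it assumes one num_aa_bins x num_td_bins sinogram per pair of axial bins and 4 bytes per bin, the value num_z_bins=50 is purely hypothetical, and real memory use also depends on other binning options. Only the ratio is meaningful.

```python
def histogram_bytes(num_td_bins, num_aa_bins, num_z_bins, bytes_per_bin=4):
    """Rough size of one 3D histogram under the assumptions stated above."""
    # Assumption: every (z1, z2) pair of axial bins holds its own sinogram.
    return num_z_bins * num_z_bins * num_aa_bins * num_td_bins * bytes_per_bin


# 312 is twice the halved value of 156 quoted above; num_z_bins is made up.
full = histogram_bytes(312, 312, num_z_bins=50)
halved = histogram_bytes(156, 156, num_z_bins=50)

print(full / halved)  # 4.0
```

Halving both the radial and the angular bins therefore cuts this part of the footprint by a factor of four, whatever the true number of axial bins is.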

I'll let you know if I find another solution.

prabathbr commented 1 year ago

Thanks a lot for the support. I am wondering whether we can run the phg divisions in sequence (instead of in parallel), or use a large swap space instead of physical memory. Is there an equation to calculate the required memory in that case?

YerePhy commented 1 year ago

Hi @prabathbr,

I guess you're talking about running the processes sequentially instead of in parallel. I've just tried changing the run method of SimSET_Simulation in simpet/src/simset/simset_sim.py to:

def run(self):
    processes = []

    for division in range(self.divisions):
        division_dir = join(self.output_dir, "division_" + str(division))
        os.makedirs(division_dir)
        p = Process(target=self.run_simset_simulation, args=(division_dir,))
        processes.append(p)
        p.start()
        # Wait for this division to finish before starting the next one,
        # so that only one phg process runs at a time.
        p.join()
        time.sleep(5)

    self.simulation_postprocessing()

With this change, the processes should be running sequentially. However, I'm getting the same error as before, even with 20 processes...
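
When running divisions one at a time, it can also help to record how each child ended, since a negative exit code on POSIX means the process was killed by a signal (for example SIGSEGV). The sketch below uses subprocess with stand-in commands rather than the SimPET classes; describe_exit and run_sequentially are hypothetical helpers, not SimPET API. The same negative-code convention applies to multiprocessing.Process.exitcode.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a child's return code into a short human-readable string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exit status {returncode}"


def run_sequentially(commands):
    """Run each command to completion before starting the next one."""
    results = []
    for command in commands:
        completed = subprocess.run(command)
        results.append(describe_exit(completed.returncode))
    return results


# Stand-in command; a real run would launch one simulation per division.
print(run_sequentially([[sys.executable, "-c", "pass"]]))  # ['ok']
```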

As far as I know, there is no such equation, sorry.

prabathbr commented 9 months ago

Hello @YerePhy,

Thank you very much for the testing and information. Looks like we don't have an alternative for the time being.

I am wondering about the meaning of the results generated by the simulation. I have had a look at the source but couldn't get a clear idea.

In particular, I am wondering about the sinograms before and after attenuation correction, both with scatter and randoms correction.

additive_sinogram.hdr
corr_randoms.hdr
my_sinogram.hdr
scatter.hdr
attenuationsino.hdr
corr_scatter.hdr
randoms.hdr
trues.hdr

txusser commented 5 months ago

I will close the issue as it seems that this is not a bug.