sergivalverde / nicMSlesions

Easy multiple sclerosis white matter lesion segmentation using convolutional deep neural networks.
GNU General Public License v3.0

MemoryError exception #4

Closed jstutters closed 5 years ago

jstutters commented 5 years ago

Hi,

We're attempting to use nicMSlesions on data comprising T1, FLAIR and T2 images, all 1x1x1mm isotropic. I'm not sure whether the image size is a contributing factor, but we're getting a MemoryError in base.py:load_test_patches (log below). There is some commented-out code that suggests load_test_patches could yield smaller data structures instead of one large one. Could that approach help?

Full log of a MemoryError session
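The idea raised above, yielding smaller data structures instead of one large array, can be sketched as a generator that builds patches one batch at a time. This is only an illustration of the general technique, not the commented-out code in base.py: the function name `iter_patch_batches`, its parameters, the zero padding at the borders, and the `(batch, modality, x, y, z)` layout are all assumptions made for the example.

```python
import numpy as np


def iter_patch_batches(volumes, centers, patch_side=11, batch_size=10000):
    """Yield float32 patch batches instead of one array holding every patch.

    Hypothetical helper: `volumes` is a list of 3D arrays (one per modality,
    same shape) and `centers` is an (N, 3) integer array of voxel coordinates.
    `patch_side` is assumed to be odd.
    """
    half = patch_side // 2
    # Zero-pad each volume so patches centered near the border stay in bounds.
    padded = [np.pad(v, half, mode="constant") for v in volumes]

    for start in range(0, len(centers), batch_size):
        chunk = centers[start:start + batch_size]
        # Only one batch is allocated at a time, so peak memory is bounded
        # by batch_size rather than by the total number of candidate voxels.
        batch = np.empty(
            (len(chunk), len(padded), patch_side, patch_side, patch_side),
            dtype=np.float32,
        )
        for i, (x, y, z) in enumerate(chunk):
            for m, vol in enumerate(padded):
                # After padding, original index x sits at x + half, so the
                # patch centered on x spans padded indices x .. x + patch_side - 1.
                batch[i, m] = vol[x:x + patch_side,
                                  y:y + patch_side,
                                  z:z + patch_side]
        yield batch
```

A caller would then run the network on each yielded batch and write the predictions back into the output volume before requesting the next batch, so the full set of patches never has to exist in RAM at once.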

sergivalverde commented 5 years ago

Hi @jstutters,

Thank you for the feedback.

In the example, it looks like there are four modalities instead of three. Is that correct? Also, can you confirm whether it's a GPU memory or a RAM problem?

jstutters commented 5 years ago

Hi @sergivalverde thanks for the quick response.

The problem occurs with both tensorflow and tensorflow-gpu, and the traceback indicates that the MemoryError is triggered by a call to the numpy stack function, so I'd surmise that it's a RAM problem. The system used to run nicMS has 32GB of RAM fitted (plus additional swap space).

Unfortunately, an error made during training meant that MOD3 and MOD4 contain identical data. We're currently retraining with 3 channels, which will presumably help with the memory usage. Nevertheless, I wouldn't expect this to exceed 32GB of RAM given that the input .nii.gz files are under 30MB in total.

sergivalverde commented 5 years ago

Hi again,

OK, this can definitely be a memory limitation problem. The model takes a set of hyper-intense voxels from the FLAIR image and builds 11^3 patches around their centers. If the number of hyper-intense voxels is large enough, we may be hitting the RAM limit of the machine.
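A rough calculation shows why 11^3 patches can exceed RAM even when the compressed inputs are small. The sketch below assumes float32 values (4 bytes each) and a purely hypothetical count of one million candidate voxels; the real count depends on the FLAIR image and is not given in this thread. The function name `patch_memory_bytes` is made up for the example.

```python
def patch_memory_bytes(n_voxels, patch_side=11, n_modalities=3, bytes_per_value=4):
    """Bytes needed to hold one cubic patch per candidate voxel and modality.

    Assumption: values are stored as float32 (4 bytes) unless overridden.
    """
    return n_voxels * patch_side ** 3 * n_modalities * bytes_per_value


# Hypothetical example: one million hyper-intense candidate voxels.
for n_modalities in (2, 3, 4):
    gib = patch_memory_bytes(1_000_000, n_modalities=n_modalities) / 2 ** 30
    print(f"{n_modalities} modalities: {gib:.1f} GiB")
```

Each candidate voxel costs 11^3 = 1331 values per modality, so memory grows linearly with both the number of candidates and the number of modalities. Because numpy's stack function returns a new array rather than a view, peak usage while stacking can be roughly double the final array size.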

Can you try to pre-load the baseline model and perform the inference just with FLAIR + T1w, checking the RAM load?

jstutters commented 5 years ago

> Can you try to pre-load the baseline model and perform the inference just with FLAIR + T1w, checking the RAM load?

Using the baseline model, memory usage climbs to 40GB. Inference does complete in that case, so my problem is partly related to using more modalities, but that level of memory usage still resulted in a lot of swapping to disk.

I've sent a pull request that may help: #5

sergivalverde commented 5 years ago

I will look at it as soon as possible.