josedolz / LiviaNET

This repository contains the code of LiviaNET, a 3D fully convolutional neural network that was employed in our work: "3D fully convolutional networks for subcortical segmentation in MRI: A large-scale study"
MIT License

A problem about NAN #20

Open Simon1zxb opened 6 years ago

Simon1zxb commented 6 years ago

Hi josedolz, sorry to bother you. I am trying to use your network for some experiments, but when I train my own model, the cost becomes NaN after one training step, and I don't know why. Have you run into this problem before? The network parameters follow the first architecture in your paper (kernel 7×7×7, input 27×27×27), and I modified the code to run on Python 3.

ytliang97 commented 4 years ago

Same problem here. I set the following in LiviaNet_Config.ini:

n_classes = 5
number Of Epochs = 10
number Of SubEpochs = 1000
imageTypes = 0
# custom dataset, no ROI
# other setting is the same


I think it is because imagesSamplesAll=0 in src/LiviaNet/startTraining.py, which causes numberBatches=0, so the model stops training.

https://github.com/josedolz/LiviaNET/blob/ac01462623e5a8a49f9f5ad172d4ec76681c6b80/src/LiviaNet/startTraining.py#L129-L162

I am still trying to figure out why getSamplesSubepoch() returns 0 samples: https://github.com/josedolz/LiviaNET/blob/ac01462623e5a8a49f9f5ad172d4ec76681c6b80/src/LiviaNet/startTraining.py#L129-L138
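
For anyone hitting the same thing, a guard like the one below might make the failure obvious instead of letting training continue silently with nothing to train on. This is only a sketch: `count_batches` is a hypothetical helper, not part of LiviaNET, and it assumes the number of batches is the integer division of the sample count by the batch size.

```python
def count_batches(num_samples, batch_size):
    """Return how many full batches fit in num_samples.

    Hypothetical helper (not from the LiviaNET code base). It raises an
    error instead of returning 0, so an empty sample list is caught
    before the training loop starts.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive, got %d" % batch_size)

    num_batches = num_samples // batch_size
    if num_batches == 0:
        raise ValueError(
            "No batches to train on: %d samples for batch size %d. "
            "Check the image paths and the sampling settings."
            % (num_samples, batch_size)
        )
    return num_batches
```

Calling it with the length of the sample list right after sampling, for example `count_batches(len(samples), batch_size)`, would stop the run with a readable message when the list is empty.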

josedolz commented 4 years ago

Hi, it has been a long time since I last used this code, but I recall that this happens when there are 0 samples to process (typically because of a wrong path to the image files). Check whether the files can be found and loaded, and if they can, find out why the number of samples is still 0.

Best,
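
The path check suggested above could be sketched roughly as follows. `list_image_files` is a hypothetical helper and the extension list is an assumption; neither comes from the LiviaNET code, so adjust both to match your data and the folders named in your config file.

```python
import os


def list_image_files(folder, extensions=(".nii", ".nii.gz", ".mat")):
    """Return the sorted image paths found in folder.

    Hypothetical helper (not from the LiviaNET code base). The default
    extensions are an assumption about the data format. Raises an error
    if the folder is missing or contains no matching files.
    """
    if not os.path.isdir(folder):
        raise FileNotFoundError("Image folder not found: %s" % folder)

    files = sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(extensions)  # endswith accepts a tuple
    )
    if not files:
        raise ValueError(
            "No files with extensions %s in %s" % (", ".join(extensions), folder)
        )
    return files
```

Running this on each training, ground-truth and ROI folder before starting training should show quickly whether a wrong path is the reason for the 0 samples.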