Semantic Segmentation for Aerial / Satellite Images with Convolutional Neural Networks including an unofficial implementation of Volodymyr Mnih's methods
I am currently trying to automate parts of this project and am running into difficulties during the training phase in CPU mode: it throws an IndexError and appears to hang the entire training. I am using a very small subset of the mass_buildings dataset, i.e. 8 training images and 2 validation images. The purpose is only to test, not to get accurate results at the moment. Below is the state of the installation and the steps I am using:
System:
uname -a
Linux user-VirtualBox 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Additionally, Boost 1.59.0 and OpenCV 3.0.0 have been built and installed from source, and both installs appear successful. The utils are also built successfully.
I have downloaded only a small subset of the mass_buildings dataset:
# ls -R ./data/mass_buildings/train/
./data/mass_buildings/train/:
map sat
./data/mass_buildings/train/map:
22678915_15.tif 22678930_15.tif 22678945_15.tif 22678960_15.tif
./data/mass_buildings/train/sat:
22678915_15.tiff 22678930_15.tiff 22678945_15.tiff 22678960_15.tiff
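Before training, I wanted to confirm that the map images actually contain class indices rather than raw 0-255 pixel values. A minimal check could look like the following (a synthetic array stands in for a loaded .tif here, since I am not certain what preprocessing the repo expects):

```python
import numpy as np

def out_of_range_labels(label, n_classes=3):
    """Return the unique values in a label map that fall outside [0, n_classes)."""
    vals = np.unique(label)
    return vals[(vals < 0) | (vals >= n_classes)].tolist()

# In practice `label` would come from a file under map/, e.g.
# cv2.imread("./data/mass_buildings/train/map/22678915_15.tif", 0).
# A synthetic array stands in for one here.
label = np.array([[0, 1, 255],
                  [2, 76, 0]], dtype=np.int32)
print(out_of_range_labels(label))  # [76, 255] -> such values would break a 3-class loss
```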
Below is the output obtained by running the shells/create_datasets.sh script, modified only to build the mass_buildings data:
The training script is then started. As you can see above, I am using only 8 images and a single epoch. I let the process run overnight and it never completed, hence my belief that it simply hangs. Running it under nohup does not complete either. When the process is forcefully stopped with Ctrl-C, I obtain the following message:
# cat nohup.out
Traceback (most recent call last):
File "./scripts/train.py", line 313, in <module>
model, optimizer = one_epoch(args, model, optimizer, epoch, True)
File "./scripts/train.py", line 265, in one_epoch
optimizer.update(model, x, t)
File "/usr/local/lib/python3.5/dist-packages/chainer/optimizer.py", line 377, in update
loss = lossfun(*args, **kwds)
File "./models/MnihCNN_multi.py", line 31, in __call__
self.loss = F.softmax_cross_entropy(h, t, normalize=False)
File "/usr/local/lib/python3.5/dist-packages/chainer/functions/loss/softmax_cross_entropy.py", line 152, in softmax_cross_entropy
return SoftmaxCrossEntropy(use_cudnn, normalize)(x, t)
File "/usr/local/lib/python3.5/dist-packages/chainer/function.py", line 105, in __call__
outputs = self.forward(in_data)
File "/usr/local/lib/python3.5/dist-packages/chainer/function.py", line 183, in forward
return self.forward_cpu(inputs)
File "/usr/local/lib/python3.5/dist-packages/chainer/functions/loss/softmax_cross_entropy.py", line 39, in forward_cpu
p = yd[six.moves.range(t.size), numpy.maximum(t.flat, 0)]
IndexError: index 76 is out of bounds for axis 1 with size 3
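For what it's worth, the failing line in softmax_cross_entropy.py is a fancy-indexing lookup of each pixel's score at its label index, so the error can be reproduced in isolation with plain NumPy. This sketch (the shapes are assumptions, not the actual batch dimensions) shows that any label value >= 3 triggers exactly this IndexError:

```python
import numpy as np

# yd: softmaxed scores, one row per pixel, 3 class columns (axis 1 has size 3)
yd = np.full((5, 3), 1.0 / 3.0, dtype=np.float32)
# t: labels -- 76 looks like a raw pixel value rather than a class index in {0, 1, 2}
t = np.array([0, 2, 1, 76, 0])

try:
    # equivalent of: p = yd[six.moves.range(t.size), numpy.maximum(t.flat, 0)]
    p = yd[np.arange(t.size), np.maximum(t, 0)]
    msg = None
except IndexError as err:
    msg = str(err)
    print(msg)  # index 76 is out of bounds for axis 1 with size 3
```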
This is the only component that fails at the moment. I have tested the prediction and evaluation phases using the pre-trained data, and both seem to complete successfully. Any assistance on how I could use the training script with custom datasets would be appreciated.
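In case it is relevant, my working hypothesis is that the map images need to be converted from raw 0-255 pixel values into class indices before they reach the loss. Here is the kind of remap I suspect may be missing (the threshold and the class layout are my assumptions, not this repo's actual preprocessing rule):

```python
import numpy as np

def to_class_indices(raw_map, thresholds=(128,)):
    """Bucket raw 0-255 pixel values into consecutive class indices.

    With the default single threshold this maps a building mask to {0, 1};
    the threshold value is an assumption, not the repo's real rule.
    """
    return np.digitize(raw_map, thresholds).astype(np.int32)

raw = np.array([[0, 255, 76],
                [200, 10, 128]], dtype=np.uint8)
print(to_class_indices(raw))  # every output value is now a valid class index
```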
Thank you