kuixu opened this issue 7 years ago
Hi @barrykui,
I updated the README to answer your question. Unfortunately, we can't release the data loader immediately, but if you do want to run the code now you can follow the instructions in the "Data" paragraph.
@SimJeg Thanks for your detailed description of the dataset loader, so that I can reproduce it.
Hi @barrykui,
Did you work out the dataset_loaders? If so, could you release them online?
Thanks!
Hello.
I am trying to apply your network to my data. README.md states that "X is the batch of input images (shape=(batch_size, n_classes, n_rows, n_cols), dtype=float32)". My question is: shouldn't it be shape=(batch_size, 3, n_rows, n_cols), dtype=float32, i.e. three color channels?
Thank you
Yes, you're right! Sorry for the typo.
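To make the expected format explicit, here is a minimal NumPy sketch of the shapes (the variable names are illustrative, not taken from the repository):

```python
import numpy as np

batch_size, n_rows, n_cols = 4, 224, 224

# X: RGB images, channels first, float32
X = np.zeros((batch_size, 3, n_rows, n_cols), dtype=np.float32)
# Y: one integer class label per pixel
Y = np.zeros((batch_size, n_rows, n_cols), dtype=np.int32)
```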
Hello,
I'm also trying to use this network, but I get some errors after implementing my own Iterator. The last few lines of the traceback are:
```
TypeError: Cannot convert Type TensorType(float64, 4D) (of Variable AbstractConv2d_gradInputs{border_mode='half', subsample=(1, 1), filter_flip=False, imshp=(None, 3, None, None), kshp=(48, 3, 3, 3)}.0) into Type TensorType(float32, 4D). You can try to manually convert AbstractConv2d_gradInputs{border_mode='half', subsample=(1, 1), filter_flip=False, imshp=(None, 3, None, None), kshp=(48, 3, 3, 3)}.0 into a TensorType(float32, 4D).
```
Has anyone met this before? Any clue would be greatly appreciated.
It seems you have to cast your data to float32 (by default it's float64).
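A minimal sketch of that cast, assuming the iterator yields plain NumPy arrays (names are illustrative):

```python
import numpy as np

# np.random.rand returns float64, which is what triggers the TypeError above
X = np.random.rand(4, 3, 224, 224)
X = X.astype(np.float32)  # cast before passing the batch to the compiled Theano function
```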
Thanks a lot! It works when I add 'floatX=float32' to THEANO_FLAGS.
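For reference, the same flag can also be set from Python, as long as it happens before Theano is imported (a sketch; the device name is illustrative):

```python
import os

# THEANO_FLAGS is read when theano is first imported, so set it beforehand
os.environ['THEANO_FLAGS'] = 'floatX=float32,device=gpu0'

import theano
print(theano.config.floatX)  # should print 'float32'
```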
Hi @SimJeg, I've run into another problem that I tried my best to solve but couldn't.
I am testing my own dataset on this network.
But it always returns the CNMEM_STATUS_OUT_OF_MEMORY message when executing `loss, I, U, acc = f(X, Y[:, None, :, :])` inside the `batch_loop` function.
It is not clear to me whether this error is caused by an invalid iterator, by large intermediate values, or by Theano limitations. Looking forward to your reply.
PS: My GPU is a single Titan X Maxwell (12 GB memory).
The .theanorc and error message are provided as follows:
```
[global]
floatX = float32
device = gpu2
optimizer = fast_compile
optimizer_including = fusion
allow_gc = True
print_active_device = True
optimizer_including = cudnn

[lib]
cnmem = 0.8

[dnn]
enabled = True
```
```
.....
Number of Convolutional layers : 103
Number of parameters : 9426191
Compilation starts at 2017-02-06 11:49:04
train compilation took 620.547 seconds
valid compilation took 152.227 seconds
Training starts at 2017-02-06 12:02:04
Traceback (most recent call last):
  File "train.py", line 307, in <module>
    initiate_training(cf)
  File "train.py", line 276, in initiate_training
    train(cf)
  File "train.py", line 198, in train
    history = batch_loop(train_iter, train_fn, epoch, 'train', history)
  File "train.py", line 89, in batch_loop
    loss, I, U, acc = f(X, Y[:, None, :, :])
  File "/home/qiao/anaconda2/envs/tensorflow/lib/python2.7/site-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/home/qiao/anaconda2/envs/tensorflow/lib/python2.7/site-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/home/qiao/anaconda2/envs/tensorflow/lib/python2.7/site-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
MemoryError: Error allocating 503316480 bytes of device memory (CNMEM_STATUS_OUT_OF_MEMORY).
Apply node that caused the error: GpuElemwise{mul,no_inplace}(CudaNdarrayConstant{[[[[ 0.5]]]]}, GpuElemwise{add,no_inplace}.0)
Toposort index: 5298
Inputs types: [CudaNdarrayType(float32, (True, True, True, True)), CudaNdarrayType(float32, 4D)]
Inputs shapes: [(1, 1, 1, 1), (5, 96, 512, 512)]
Inputs strides: [(0, 0, 0, 0), (25165824, 262144, 512, 1)]
Inputs values: [CudaNdarray([[[[ 0.5]]]]), 'not shown']
Outputs clients: [[GpuContiguous(GpuElemwise{mul,no_inplace}.0)]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
```
Hi, can you try to set cnmem=1 in your .theanorc? cnmem=0.8 means you only use 80% of your GPU's memory.
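That would correspond to changing the [lib] section of the .theanorc shown above:

```
[lib]
cnmem = 1
```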
Thanks for your suggestion, but the error remains even after I changed cnmem to 1. I looked up more docs and found two that might be relevant.
According to danlanchen (https://danlanchen.github.io/blog/2016/10/12/training-CNN-error-handling) and the official Theano FAQ (http://deeplearning.net/software/theano/faq.html), the error mainly occurs because of the large number of intermediate values generated and fragmented GPU memory.
Fortunately, the error disappears after decreasing both the network depth and the number of filters.
Hi, did you try to reduce the batch size / crop size? With (224, 224) crops you should be able to train with batch size 3; with no crops (None), with batch size 1.
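As a sketch, the corresponding entries in the experiment config would look something like this (the exact parameter names may differ from the repository's config files):

```python
# Hypothetical config excerpt based on the suggestion above
crop_size = (224, 224)   # spatial crop used during training; None = train on full images
batch_size = 3           # fits a 12 GB GPU with (224, 224) crops; use 1 with no crops
```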
@SimJeg The error indeed comes from the large size of my input images. It works well if I manually crop the raw images in `batch_loop` and then feed the cropped images to the network. I should have paid attention to my own training image size (╯□╰). Thanks again for your kind help.
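For completeness, a minimal sketch of such a manual crop (not the repository's code; random_crop is a hypothetical helper):

```python
import numpy as np

def random_crop(X, Y, size=(224, 224)):
    """Take the same random window from an image batch X (N, C, H, W)
    and its label maps Y (N, H, W)."""
    h, w = size
    top = np.random.randint(0, X.shape[2] - h + 1)
    left = np.random.randint(0, X.shape[3] - w + 1)
    return (X[:, :, top:top + h, left:left + w],
            Y[:, top:top + h, left:left + w])
```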
You're welcome. Please also note that there is a parameter for cropping in the config file, so you don't need to do it manually.
Looking forward to the dataset_loaders.