barisgecer opened this issue 8 years ago
Hi @barisgecer, when I run fcnTrain.m (GPU mode) it is very slow (around 1 Hz) and doesn't converge. What do you think the problem is? Please help me, thank you very much!

CUDADevice with properties:
Name: 'GeForce GTX TITAN X'
Index: 1
ComputeCapability: '5.2'
SupportsDouble: 1
DriverVersion: 7.5000
ToolkitVersion: 6.5000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 1.2885e+10
AvailableMemory: 1.2609e+10
MultiprocessorCount: 24
ClockRateKHz: 1076000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1

train: epoch 01: 1/565: 0.9 Hz accuracy: 0.677 0.048 0.032 objective: 3.044
train: epoch 01: 2/565: 1.0 Hz accuracy: 0.691 0.048 0.033 objective: 3.035
train: epoch 01: 3/565: 1.0 Hz accuracy: 0.698 0.048 0.033 objective: 2.994
Hi, short question: I have a dataset with inputs of arbitrary sizes. How should I modify my input samples so I can keep all of them in one matrix for GPU efficiency?
Long question: currently I am processing each sample individually, which is of course really inefficient. I should store them in a 4D matrix (where the 4th dimension indexes the images), but I can't because of their varying sizes.
Do you think I should sample same-sized inputs from the data? How should I do that?
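For example, something like this rough sketch is what I have in mind (assuming my images live in a cell array `images` of single-precision HxWx3 arrays, each at least `cropSz` pixels along both spatial dimensions; `cropSz = 224` is just a placeholder):

```matlab
% Rough sketch: random fixed-size crops stacked into one 4D batch.
cropSz = 224;                            % placeholder crop size
n = numel(images);
batch = zeros(cropSz, cropSz, 3, n, 'single');
for i = 1:n
  im = images{i};
  y = randi(size(im,1) - cropSz + 1);    % random top-left corner
  x = randi(size(im,2) - cropSz + 1);
  batch(:,:,:,i) = im(y:y+cropSz-1, x:x+cropSz-1, :);
end
batch = gpuArray(batch);                 % one transfer for the whole batch
```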
What if I set the size of my 4D matrix to the maximum size among all input samples and fill the remaining parts of the smaller samples with zeros? What happens when we feed regions of zeros to the network? Does it have any influence on learning?
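And here is roughly what I mean by zero-padding (again only a sketch over the same hypothetical `images` cell array; the `mask` is something I added myself, thinking the padded pixels could be excluded from the loss):

```matlab
% Rough sketch: zero-pad everything to the largest HxW in the set and
% keep a logical mask marking the real (non-padded) pixels.
n = numel(images);
H = max(cellfun(@(im) size(im,1), images));
W = max(cellfun(@(im) size(im,2), images));
batch = zeros(H, W, 3, n, 'single');
mask  = false(H, W, 1, n);               % true where real pixels are
for i = 1:n
  [h, w, ~] = size(images{i});
  batch(1:h, 1:w, :, i) = images{i};
  mask(1:h, 1:w, 1, i) = true;
end
```

My understanding is that the zeros are not completely neutral: convolutions with a bias still produce activations over the padded area, so without a mask it would contribute to the gradients. But I am not sure, hence the question.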
Thank you.