loss: nan, iter:1/455(1, 1.076s)

thuxugang commented 8 years ago

Hello, why is the CTC loss NaN from the very first iteration when I run your program? I would appreciate any guidance, thank you very much. Here is the output:

    Using gpu device 0: GeForce GTX 980 Ti (CNMeM is disabled, cuDNN not available)
    C:\Anaconda\lib\site-packages\theano\tensor\signal\downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
      "downsample module has been moved to the theano.tensor.signal.pool module.")
    loaded 29143 samples from D:\xugang\OCR\cnn-lstm-ctc-master\dataset\english_sentence\train_img_list.txt
    loaded 2914 samples from D:\xugang\OCR\cnn-lstm-ctc-master\dataset\english_sentence\val_img_list.txt
    building symbolic tensors(0.0799999237061)
    setting parameters(0.0799999237061)
    ('n_classes: ', 95)
    ('multi-step: ', set([79625, 68250, 45500]))
    building the model(0.0799999237061)
    computing updates and function(0.240000009537)
    using normal sgd and learning_rate:0.00999999977648
    ('bw_lstm_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
    ('fw_lstm_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
    ('fw_lstm_U', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
    ('fw_lstm_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
    ('bw_lstm_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
    ('bw_lstm_U', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
    ('hidden_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
    ('hidden_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
    building training function(1.78999996185)
    building validating function(29.6099998951)
    begin to train(32.8609998226)
    .epoch 1/200 begin(32.861)
    [prefetch]height: 28, x_max_step:141.0, y_max_width:50
    D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\layers\utee.py:137: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
      x = np.zeros((batch_size, 1, height, x_max_len)).astype(config.floatX)
    D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\layers\utee.py:138: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
      x_mask = np.zeros((batch_size, x_max_len)).astype(config.floatX)
    ..loss: nan, iter:1/455(1, 1.076s)
    ..detect nan
    ..loss: nan, iter:1/455(1.076)
    Traceback (most recent call last):
      File "D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\train.py", line 150, in <module>
        sys.exit()
    SystemExit
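
Separately from the NaN itself, the two VisibleDeprecationWarning lines in the log suggest that a float is being used as an array dimension (the prefetch line prints x_max_step as 141.0). A minimal sketch of the usual fix, assuming the length arrives as a float; the values below are illustrative and are not taken from utee.py:

```python
import numpy as np

# Illustrative values only; in the real code these come from the data loader.
batch_size = 4
height = 28
x_max_len = 141.0  # a float, as the "x_max_step:141.0" log line suggests

# Cast to int before using it as a shape; NumPy requires integer dimensions.
x_max_len = int(x_max_len)

x = np.zeros((batch_size, 1, height, x_max_len)).astype("float32")
x_mask = np.zeros((batch_size, x_max_len)).astype("float32")

print(x.shape, x_mask.shape)
```

This only silences the warning on older NumPy versions (and avoids an error on newer ones); it is not claimed to be the cause of the NaN loss.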

aaron-xichen commented 8 years ago

Maybe you can try this:

    [global]
    device=gpu0
    floatX=float32

Btw, please remember to back up your original .theanorc

thuxugang commented 8 years ago

Thank you for the guidance. I just tried it and it still does not work. Does it run correctly on your side? My configuration is:

    [global]
    openmp = False
    device = gpu
    floatX = float32
    allow_input_downcast=True

    [blas]
    ldflags =

    [gcc]
    cxxflags = -IC:\Anaconda\MinGW

    [nvcc]
    flags = -LC:\Anaconda\libs
    compiler_bindir = C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
    fastmath = True

thuxugang commented 8 years ago

By the way, I am using OpenCV 2.4.11. Could that have any effect? Thanks.

aaron-xichen commented 7 years ago

Sorry for the late reply. Please set fastmath = False in the [nvcc] section of your .theanorc and try again.
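
For anyone who prefers to script that change, here is a minimal sketch that flips the fastmath option in a .theanorc-style text. It assumes Python 3 and the standard-library configparser; the helper name disable_fastmath and the sample config are hypothetical, and editing the file by hand works just as well:

```python
import configparser
import io


def disable_fastmath(rc_text):
    """Return rc_text with fastmath set to False in [nvcc] (hypothetical helper)."""
    # interpolation=None so Windows paths or '%' characters are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve key case, e.g. floatX
    parser.read_string(rc_text)
    if not parser.has_section("nvcc"):
        parser.add_section("nvcc")
    parser.set("nvcc", "fastmath", "False")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Shortened sample config for illustration, not the full file from this thread.
sample_rc = """[global]
device = gpu
floatX = float32

[nvcc]
fastmath = True
"""

patched = disable_fastmath(sample_rc)
print(patched)
```

As suggested earlier in the thread, keep a backup of the original .theanorc before overwriting it with the patched text.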

aaron-xichen / cnn-lstm-ctc

loss: nan, iter:1/455(1, 1.076s) #8