yajiemiao / pdnn

PDNN: A Python Toolkit for Deep Learning. http://www.cs.cmu.edu/~ymiao/pdnntk.html
Apache License 2.0

The dnn.classify file I generated contains only arrays of identical numbers (resolved: my dataset was the problem, sorry) #39

Open lovivi opened 8 years ago

lovivi commented 8 years ago

I used the parameters given in the example and swapped in my own dataset, or alternatively the following:

marker_train = np.array([[0.2, 0.3, 0.5, 1.4], [1.3, 2.1, 0.3, 0.1], [0.3, 0.5, 0.5, 1.4]], dtype = 'float32')
y_train = np.array([2, 0, 1])
marker_va = np.array([[0.2, 0.3, 0.8, 1.4], [1.3, 2.3, 0.3, 0.1], [0.3, 0.5, 0.2, 1.4]], dtype = 'float32')
y_va = np.array([4, 3, 1])
marker_te = np.array([[0.6, 0.3, 0.5, 1.4], [1.3, 2.3, 0.1, 0.1], [0.3, 0.5, 1.2, 1.4]], dtype = 'float32')
y_te = np.array([2, 0, 0])
with gzip.open('train.pickle.gz', 'wb') as f:
    cPickle.dump((marker_train, label), f)

The output:

aaa=gzip.open("/home/malab5/Software/pdnn/examples/mnist/dnn.classify.pickle.gz")
bbb = cPickle.load(aaa)

bbb
array([[ 1.],
       [ 1.],
       [ 1.]], dtype=float32)

Every value is identical, like this. I would like to know which parameter is wrong.
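One possible contributor to the constant output, offered as a sketch and not as a confirmed diagnosis: the training command in this post uses --nnet-spec "4:20:20:1", meaning a single output unit, and the script's own comment describes the final layer as a softmax. A softmax over one unit returns 1.0 for any input, which matches the array above. The `softmax` helper and the sample logits below are illustrative and are not PDNN code.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Three samples, one output unit, as with a final layer of size 1.
# The logit values are arbitrary (hypothetical) and do not affect the result.
one_unit = softmax(np.array([[-3.2], [0.0], [7.5]], dtype="float32"))
print(one_unit)

# The labels in the post range over 0..4, so a classifier would need
# at least max(label) + 1 output units.
labels = np.array([2, 0, 1, 4, 3, 1])
n_classes = int(labels.max()) + 1
print(n_classes)
```

Under that assumption, the last number in the spec would need to be at least 5 for these labels, for example "4:20:20:5".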

Training code:

#!/bin/bash

pdnndir=/home/malab5/Software/pdnn  # pointer to PDNN
device=gpu0

# export environment variables
export PYTHONPATH=$PYTHONPATH:$pdnndir
export THEANO_FLAGS=mode=FAST_RUN,device=$device,floatX=float32

# train DNN model
# echo "Training the DNN model ..."
# python $pdnndir/cmds/run_DNN.py --train-data "train.pickle.gz,partition=600m,random=true" \
# --valid-data "valid.pickle.gz,partition=600m,random=true" \
# --nnet-spec "1000:1024:1024:1024:1024:1024:1000" \
# --wdir ./ --param-output-file dnn.mdl
# echo "Training the DNN model ..."

# train DNN model
echo "Training the DNN model ..."
python $pdnndir/cmds/run_DNN.py --train-data "train.pickle.gz" \
    --valid-data "valid.pickle.gz" \
    --nnet-spec "4:20:20:1" --wdir ./ \
    --l2-reg 0.1 --lrate "C:10:20" --model-save-step 10 \
    --param-output-file dnn.param --cfg-output-file dnn.cfg >& dnn.training.log

# classification on the testing data; -1 means the final layer, that is, the classification softmax layer
echo "Classifying with the DNN model ..."
python $pdnndir/cmds/run_Extract_Feats.py --data "test.pickle.gz" \
    --nnet-param dnn.param --nnet-cfg dnn.cfg \
    --output-file "dnn.classify.pickle.gz" --layer-index 10 \
    --batch-size 10 >& dnn.testing.log

python show_results.py dnn.classify.pickle.gz

Training output:

/usr/local/lib/python2.7/dist-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
[2016-04-19 15:32:57.491671] > epoch 1, training error nan (%)
[2016-04-19 15:32:57.491899] > epoch 1, lrate 10.000000, validation error nan (%)
[2016-04-19 15:32:57.491951] > epoch 2, training error nan (%)
[2016-04-19 15:32:57.491981] > epoch 2, lrate 10.000000, validation error nan (%)
[2016-04-19 15:32:57.492019] > epoch 3, training error nan (%)
[2016-04-19 15:32:57.492051] > epoch 3, lrate 10.000000, validation error nan (%)
[2016-04-19 15:32:57.492092] > epoch 4, training error nan (%)
[2016-04-19 15:32:57.492117] > epoch 4, lrate 10.000000, validation error nan (%)
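The "Mean of empty slice" warning followed by nan errors is what NumPy produces when it averages an empty list. One way that can happen, sketched here under assumptions, is that no complete minibatch was ever processed because the dataset (3 samples) is smaller than the batch size. The batch size of 256 and both helper functions below are hypothetical and are not taken from PDNN's source.

```python
import warnings
import numpy as np

def count_full_batches(n_samples, batch_size):
    """Number of complete minibatches available in a dataset."""
    return n_samples // batch_size

def mean_error(batch_errors):
    """Average the per-batch errors; an empty list yields nan."""
    with warnings.catch_warnings():
        # Silence the "Mean of empty slice" RuntimeWarning for this demo.
        warnings.simplefilter("ignore", RuntimeWarning)
        return float(np.mean(batch_errors))

# Hypothetical batch size; the post does not state the value used for training.
n_batches = count_full_batches(n_samples=3, batch_size=256)
errors = [0.5 for _ in range(n_batches)]  # stays empty when n_batches == 0
print(n_batches, mean_error(errors))
```

If this is the cause, a larger dataset or a smaller training batch size should give at least one batch per epoch and a numeric error rate.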

Test output:

ERROR (theano.sandbox.cuda): nvcc compiler not found on $PATH. Check your nvcc installation and try again.
/usr/local/lib/python2.7/dist-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
[2016-04-19 15:32:57.740441] > ... setting up the model and loading parameters
[2016-04-19 15:32:57.748392] > ... getting the feat-extraction function
Traceback (most recent call last):
  File "/home/malab5/Software/pdnn/cmds/run_Extract_Feats.py", line 78, in <module>
    extract_func = model.build_extract_feat_function(layer_index)
  File "/home/malab5/Software/pdnn/models/dnn.py", line 179, in build_extract_feat_function
    out_da = theano.function([feat], self.layers[output_layer].output, updates = None, givens={self.x:feat}, on_unused_input='warn')
IndexError: list index out of range
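The IndexError is raised by `self.layers[output_layer]`, and the classification command passes --layer-index 10 while the script's own comment says -1 selects the final layer. The sketch below shows why 10 is out of range for the spec "4:20:20:1". It assumes one layer per transition between consecutive sizes; `parse_layer_sizes` and `pick_layer` are illustrative names and are not PDNN functions.

```python
def parse_layer_sizes(nnet_spec):
    """Split a spec such as "4:20:20:1" into integer layer sizes."""
    return [int(size) for size in nnet_spec.split(":")]

def pick_layer(layers, index):
    """Return layers[index], with a clearer message when out of range."""
    if not -len(layers) <= index < len(layers):
        raise IndexError(
            "layer index %d is out of range for %d layers" % (index, len(layers))
        )
    return layers[index]

sizes = parse_layer_sizes("4:20:20:1")
# Assumption: one weight layer per pair of consecutive sizes.
layers = list(zip(sizes[:-1], sizes[1:]))

print(len(layers))             # number of layers under this assumption
print(pick_layer(layers, -1))  # the final layer, addressed from the end

try:
    pick_layer(layers, 10)
except IndexError as err:
    print("IndexError:", err)
```

With only a handful of layers, any index of 10 fails, so passing --layer-index -1 as the comment describes is the likely fix for the traceback.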

georgid commented 6 years ago

To get rid of the downsample warning, change

from theano.tensor.signal import downsample
...
pooled_out = downsample.max_pool_2d( ... )

to

from theano.tensor.signal import pool
...
pooled_out = pool.pool_2d( ... )

Theano changed its API in version 0.9.0, and pdnn hasn't been updated yet.

The max_pool_2d method no longer exists in "downsample".

I found the solution for this issue in Theano/Theano#4337