peiyunh / tiny

Tiny Face Detector, CVPR 2017

hr_res101('train') error with gpu,cuda,ubuntu #22

Closed niamul070 closed 7 years ago

niamul070 commented 7 years ago

I am getting the following error. Can you help? I changed gpus=[1 2 3 4] to gpus=[1] in hr_res101('train') because it said a device ID in the range 1:1 is needed. Now I am getting the following error (at the bottom of the page):


ans =


Trying to initialize the structure of resnet-101-simple
Unknown model: cannot initialize.
Loading pretrained weights from ./trained_models/imagenet-resnet-101-dag.mat
Loaded imdb from data/widerface/imdb.mat
cluster path: data/widerface/RefBox_N25_scaled.mat

opts =

struct with fields:

  keepDilatedZeros: 0
         inputSize: [500 500]
      learningRate: [1×30 double]
           trainFn: '@cnn_train_dag_hardmine'
     batchGetterFn: '@cnn_get_batch_hardmine'
      freezeResNet: 0
               tag: ''
        clusterNum: 25
       clusterName: 'scaled'
           bboxReg: 1
        skipLRMult: [0 1 0.1000]
        sampleSize: 256
       posFraction: 0.5000
         posThresh: 0.7000
         negThresh: 0.3000
            border: [0 0]
 pretrainModelPath: './trained_models/imagenet-resnet-101-dag.mat'
           dataDir: 'data/widerface'
         modelType: 'resnet-101-simple'
       networkType: 'dagnn'
batchNormalization: 1
  weightInitMethod: 'gaussian'
    minClusterSize: [10 10]
    maxClusterSize: [Inf Inf]
            expDir: 'models/widerface-resnet-101-simple-sample256-posfrac0.5-N25-bboxreg-cluster…'
         batchSize: 12
     numSubBatches: 1
         numEpochs: 50
              gpus: 1
   numFetchThreads: 8
              lite: 0
          imdbPath: 'data/widerface/imdb.mat'
             train: [1×1 struct]

ans =

struct with fields:

            gpus: 1
       batchSize: 12
   numSubBatches: 1
       numEpochs: 50
    learningRate: [1×30 double]
keepDilatedZeros: 0

Start using dagnn.DetLoss for loss
cnn_train_dag_hardmine: resetting GPU
train: epoch 01: 1/1074:Invalid MEX-file '/purcell1/mbaqui/Documents/tiny/utils/compute_dense_overlap.mexa64': dlopen: cannot load any more object with static TLS.

Error in cnn_get_batch_hardmine (line 378)
iou = compute_dense_overlap(ofx,ofy,stx,sty,vsx,vsy,...

Error in cnn_widerface>getDagNNBatch (line 258)
[images, clsmaps, regmaps] = batchGetter(imagePaths, imageSizes, labelRects, ...

Error in cnn_widerface>@(x,y)getDagNNBatch(batchGetter,bopts,useGpu,x,y) (line 243)
fn = @(x,y) getDagNNBatch(batchGetter, bopts,useGpu,x,y) ;

Error in cnn_train_dag_hardmine>process_epoch (line 268)
inputs = state.getBatch(, batch) ;

Error in cnn_train_dag_hardmine (line 148)
[stats.train(epoch),prof] = process_epoch(net, state, opts, 'train') ;

Error in cnn_widerface (line 212)
[net, info] = trainFn(net, imdb, getBatchFn(batchGetter, opts, net.meta), ...

Error in hr_res101 (line 42)
cnn_widerface('inputSize', inputSize, ...
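
Editor's note: the report above changes gpus=[1 2 3 4] to gpus=[1] because only one device was available. A minimal MATLAB sketch of that adjustment follows, assuming the Parallel Computing Toolbox provides gpuDeviceCount on the machine. The helper pickGpus and the variable names are hypothetical and are not part of the tiny repository.

```matlab
% Hypothetical helper (not from the repository): keep only the requested
% GPU indices that exist on this machine. MATLAB device indices start at 1.
pickGpus = @(requested, nAvailable) ...
    requested(requested >= 1 & requested <= nAvailable);

requested = [1 2 3 4];   % the default mentioned in the report above

% gpuDeviceCount needs the Parallel Computing Toolbox; fall back to 0
% if it is missing or cannot be called.
try
    nAvailable = gpuDeviceCount;
catch
    nAvailable = 0;
end

gpus = pickGpus(requested, nAvailable);
fprintf('%d GPU(s) visible; usable indices: %s\n', nAvailable, mat2str(gpus));
```

On a single-GPU machine this would leave gpus equal to 1, which matches the manual change described in the report.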

LeeRock commented 7 years ago

I guess something went wrong in the previous step: "compile_mex" produced a wrong result.

peiyunh commented 7 years ago

I've never seen this before. Does running compile_mex give you any error?

niamul070 commented 7 years ago

compile_mex did not give any error. I can also run the bbox function on selfie.jpg. I only get this error when I run the training script (hr_res101.m).

peiyunh commented 7 years ago

compute_dense_overlap is only called during training. I'm not sure why MATLAB gives such an error.
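
Editor's note: since compute_dense_overlap is only reached during training, one quick check before a long run is whether MATLAB can see it as a MEX file at all. The sketch below uses only the built-in exist, which, and mexext functions. It assumes the utils folder is on the MATLAB path, and the variable names are illustrative. It checks visibility only; it does not prove that the library loads, which is the step where the dlopen error above occurs.

```matlab
mexName = 'compute_dense_overlap';   % function named in the error above

% mexext reports the MEX extension for this platform
% (the error message above shows .mexa64).
fprintf('MEX extension for this platform: %s\n', mexext);

% exist(..., 'file') returns 3 when the name resolves to a MEX file.
isMex = (exist(mexName, 'file') == 3);

if isMex
    fprintf('Found MEX file: %s\n', which(mexName));
else
    fprintf('%s is not visible as a MEX file; check the path or rerun compile_mex.\n', mexName);
end
```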

peiyunh commented 7 years ago

Let me know if you solved it.

wolfworld6 commented 7 years ago

Hello, where did you get the model file hr_res101.mat?

peiyunh commented 7 years ago

@wolfworld6 Please see the latest README or tiny_face_detector.m.

takecareofbigboss commented 7 years ago

@niamul070 Let me know if you have solved this problem; I have run into the same problem... Or could you please contact me by email:

niamul070 commented 7 years ago

No, it was not solved. They closed the ticket without solving the issue. I think this problem arises with a specific version of MATLAB. I was using MATLAB 2016b academic edition. I think "peiyunh" wrote the program with a different version of MATLAB.

takecareofbigboss commented 7 years ago

@peiyunh hey, could you please tell us your MATLAB version, so that we can follow your steps? @niamul070 and I hit the same problem, and we think it is due to differences between our MATLAB versions. Thanks.

peiyunh commented 7 years ago

Here is my MATLAB version: (R2016b). It is unlikely to be a MATLAB issue.

@takecareofbigboss and @niamul070, did you compile the MEX file on your system following the

wolfworld6 commented 7 years ago

Sorry for the late reply. My MATLAB version: (R2016b). I have compiled the MEX

peiyunh commented 7 years ago

Can you post your compilation error or runtime error here?

peiyunh commented 7 years ago

I added a test script for compute_dense_overlap. Try this in MATLAB:

>> cd utils;
>> compile_mex;
>> test_compute_dense_overlap;

to see if it works now.

takecareofbigboss commented 7 years ago

Yep, that solved it!!! I appreciate your help... @peiyunh

takecareofbigboss commented 7 years ago

@niamul070 you can try it again; all the problems are solved for me.