nqanh / affordance-net

AffordanceNet - Multiclass Instance Segmentation Framework - ICRA 2018

demo.py / Check failed: error == cudaSuccess (2 vs. 0) out of memory *** Check failure stack trace: *** Aborted (core dumped) #19

Open ambl2357 opened 6 years ago

ambl2357 commented 6 years ago

I was the one who asked the other question.
Now I have run into another problem when running demo.py from your AffordanceNet code.

There is an error like the one in the picture.

Do you have this error? If so, how did you resolve it?

Thank you!

(screenshot of nvidia-smi output)

Note: I ran python demo.py --gpu 1

nqanh commented 6 years ago

Try CUDA_VISIBLE_DEVICES=1 python demo.py. The --gpu option from Caffe does not always work.
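
The same masking can be done from inside a script. The sketch below is only an illustration: select_gpu is a hypothetical helper, not part of this repository, and it assumes the variable is set before caffe (or anything else that initialises CUDA) is imported. Once only one GPU is visible, CUDA renumbers it, so the framework addresses it as device 0.

```python
import os


def select_gpu(physical_id):
    """Expose a single physical GPU to CUDA and return the logical id to use.

    Hypothetical helper for illustration; it must be called before the CUDA
    runtime is initialised, i.e. before importing caffe.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_id)
    # With one GPU visible, that GPU is renumbered and appears as device 0.
    return 0


logical_id = select_gpu(1)
print(os.environ["CUDA_VISIBLE_DEVICES"], logical_id)

# Only import the framework after the mask is in place:
# import caffe
# caffe.set_mode_gpu()
# caffe.set_device(logical_id)
```

Passing --gpu 1 on top of this mask would point at a second visible device that no longer exists, so the two approaches should not be combined.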

ambl2357 commented 6 years ago

I tried, but it does not work. Maybe I need to modify test.prototxt?

I solved it by modifying config.py.

ambl2357 commented 6 years ago

@nqanh I've made a lot of attempts with the F-measure evaluation code you helped me with. However, my experimental results differ from the paper's by a large margin. Sorry to ask, but could you share the code?

nqanh commented 6 years ago

Here is the MATLAB code we use. Make sure you change the params based on your dataset, and be careful with MATLAB indexing (it starts from 1):

function F_wb_non_rank = evaluate_Fwb_non_rank(path_predited, path_gt)

% affordances index
aff_start=2;   % ignore {background} label
aff_end=10;   % change based on the dataset 

% get all files
list_predicted = getAllFiles(path_predited);   % get all files in current folder
list_gt = getAllFiles(path_gt);
list_predicted = sort(list_predicted);
list_gt = sort(list_gt); % make the same style
assert(length(list_predicted)==length(list_gt)); % test length
num_of_files = length(list_gt);

F_wb_aff = nan(num_of_files,1);
F_wb_non_rank = [];

for aff_id = aff_start:aff_end  % from 2 --> final_aff_id
    for i=1:num_of_files

        fprintf('affordance id=%d, image i=%d \n', aff_id, i);
        fprintf('current pred: %s\n', list_predicted{i});
        fprintf('current grth: %s\n', list_gt{i});

        %%read image      
        pred_im = imread(list_predicted{i}); 
        gt_im = imread(list_gt{i});

        fprintf('size pred_im: %d \n', size(pred_im));
        fprintf('size gt_im  : %d \n', size(gt_im));

        pred_im = pred_im(:,:,1);
        gt_im = gt_im(:,:,1);

        targetID = aff_id - 1; %labels are zero-indexed so we minus 1

        % only get current affordance
        pred_aff = pred_im == targetID;
        gt_aff = gt_im == targetID;

        if sum(gt_aff(:)) > 0 % only compute if the affordance has ground truth
            F_wb_aff(i,1) = WFb(double(pred_aff), gt_aff);  % call WFb function
        else
            %fprintf('no ground truth at i=%d \n', i);
        end
    end

    fprintf('Averaged F_wb for affordance id=%d is: %f \n', aff_id-1, nanmean(F_wb_aff));
    F_wb_non_rank = [F_wb_non_rank; nanmean(F_wb_aff)];
end
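
For readers who want to check the label bookkeeping outside MATLAB, here is a rough Python sketch of the same loop structure. It is not the evaluation used in the paper: f1_score is a plain F1 stand-in for the weighted WFb measure, images are flattened lists of label values, and all names are hypothetical. In Python the labels are used directly (1 up to num_labels - 1), so the aff_id - 1 offset disappears. This sketch also builds a fresh score list for every label, so an image without that label never contributes to its mean.

```python
import math


def f1_score(pred_mask, gt_mask):
    """Plain F1 on flat boolean masks (a stand-in, NOT the weighted WFb)."""
    tp = sum(1 for p, g in zip(pred_mask, gt_mask) if p and g)
    fp = sum(1 for p, g in zip(pred_mask, gt_mask) if p and not g)
    fn = sum(1 for p, g in zip(pred_mask, gt_mask) if g and not p)
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom else 0.0


def nanmean(values):
    """Mean over the non-NaN entries; NaN if there are none."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else float("nan")


def evaluate(pred_images, gt_images, num_labels):
    """Mean score per affordance label; label 0 (background) is skipped."""
    results = {}
    for label in range(1, num_labels):
        scores = []
        for pred, gt in zip(pred_images, gt_images):
            gt_mask = [px == label for px in gt]
            if not any(gt_mask):
                # No ground truth for this label in this image: skip it.
                scores.append(float("nan"))
                continue
            pred_mask = [px == label for px in pred]
            scores.append(f1_score(pred_mask, gt_mask))
        results[label] = nanmean(scores)
    return results


# Two tiny made-up "images", flattened to lists of label values.
gt = [[0, 1, 1, 2], [0, 0, 2, 2]]
pred = [[0, 1, 0, 2], [0, 0, 2, 0]]
results = evaluate(pred, gt, num_labels=3)
print(results)
```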


superchenyan commented 6 years ago

@nqanh I suspect there is something wrong with the test code. I used the data you provided for training. When I test snapshots from different iterations, some of them report “syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory”. When I test a failing model on the CPU, the same image with the same model sometimes gives different results, apparently due to an out-of-bounds memory access.

nqanh commented 6 years ago

You should test after training for at least 50K iterations, and it's also strongly recommended to use a GPU with enough memory.

superchenyan commented 6 years ago

@nqanh @ambl2357 Did you solve this problem? I tested the model from 60K iterations and it still reports "error == cudaSuccess (2 vs. 0) out of memory". I wonder why. I don't think it makes sense for this to depend on the number of iterations.

ambl2357 commented 6 years ago

@superchenyan I solved it by modifying config.py. For training, I changed __C.TRAIN.BATCH_SIZE = 32 to __C.TRAIN.BATCH_SIZE = 16. For testing, I changed __C.TEST.MAX_SIZE = 1000 to 500.
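
To see why lowering the test-time maximum size helps with memory, here is a rough sketch of how Faster R-CNN-style code usually picks the resize factor. It assumes a short-side target of 600 pixels and the capping rule common in py-faster-rcnn-derived projects; compute_im_scale and resized_pixels are hypothetical names, and the real logic in this repository may differ.

```python
def compute_im_scale(height, width, target_size=600, max_size=1000):
    """Scale the short side to target_size, capped so the long side <= max_size."""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_size) / short_side
    if round(scale * long_side) > max_size:
        # The long side would exceed the cap, so scale by the long side instead.
        scale = float(max_size) / long_side
    return scale


def resized_pixels(height, width, **kwargs):
    """Number of pixels in the resized image that is fed to the network."""
    scale = compute_im_scale(height, width, **kwargs)
    return int(round(height * scale)) * int(round(width * scale))


before = resized_pixels(480, 640, max_size=1000)  # resized to 600 x 800
after = resized_pixels(480, 640, max_size=500)    # resized to 375 x 500
print(before, after, after / before)
```

For a 480 x 640 input under these assumptions, the network sees well under half as many pixels with the cap at 500, and the feature maps shrink accordingly.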

superchenyan commented 6 years ago

@nqanh I've fine-tuned the ImageNet pre-trained model on my custom dataset, and I renamed the bbox_pred layer to 'bbox_pred_face'. The problem comes from the snapshot wrapper in train.py. This wrapper only works if your bbox_pred layer is named 'bbox_pred', so the training process was wrong. I hope others do not make the same mistake I did.
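
A rough illustration of why the layer name matters. In py-faster-rcnn-style trainers, the snapshot step typically rescales the box-regression layer by the target stds and means before saving, and it finds that layer by name. Assuming train.py here does something similar (not verified against this repository), a renamed layer would be skipped without any error. The function and data below are hypothetical.

```python
def unnormalize_bbox_layer(params, stds, means, layer_name="bbox_pred"):
    """Return (new_params, found) with the named layer rescaled.

    params maps layer name -> (weights, biases); weights is a list of rows,
    one row per output unit. Hypothetical stand-in for a snapshot step.
    """
    if layer_name not in params:
        # A renamed layer is silently skipped: weights stay normalized.
        return dict(params), False
    weights, biases = params[layer_name]
    new_weights = [[w * s for w in row] for row, s in zip(weights, stds)]
    new_biases = [b * s + m for b, s, m in zip(biases, stds, means)]
    new_params = dict(params)
    new_params[layer_name] = (new_weights, new_biases)
    return new_params, True


stds = [0.1, 0.2]
means = [0.0, 0.5]
params = {"bbox_pred_face": ([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])}

unchanged, found_default = unnormalize_bbox_layer(params, stds, means)
fixed, found_renamed = unnormalize_bbox_layer(
    params, stds, means, layer_name="bbox_pred_face")
print(found_default, found_renamed)
```

With the default name the lookup fails and nothing is rescaled. Passing the renamed layer, or keeping the original name, avoids the mismatch.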

litingsjj commented 6 years ago

@ambl2357 I ran into the same problem. Did modifying config.py work?