ZhaofanQiu / pseudo-3d-residual-networks

Pseudo-3D Convolutional Residual Networks for Video Representation Learning
MIT License

Question about how to extract the feature. #7

Open bjtuweb-12283027 opened 6 years ago

bjtuweb-12283027 commented 6 years ago

Hi, I am learning how to extract video features with your proposed P3D. What is the input format? Could you share more details, such as the usage? Thank you very much.

ZhaofanQiu commented 6 years ago

Hi, I am currently preparing guidelines for applying P3D in different scenarios. For a quick look at how P3D is used, here is the MATLAB code for feature extraction; you can format your input the same way as in this code.

function features = extract_feature_single_video(net, video_path, blob_names)
% net: Caffe network object
% video_path: Original video file
% blob_names: cell(n, 1) save n blob names to extract

clip_length = 16;
min_edge = 162;
crop_size = 160;
max_sample = 20;
min_sample = 20;

video_obj = VideoReader(video_path);
duration = video_obj.NumberOfFrames;
height = video_obj.Height;
width = video_obj.Width;
ratio = min_edge / min(height, width);
resize_height = ceil(height * ratio);
resize_width = ceil(width * ratio);

sample_num = floor(duration / clip_length);
sample_num = min(sample_num, max_sample);
sample_num = max(sample_num, min_sample);

stride = (duration - clip_length - 2) / sample_num;

features = cell(1, length(blob_names));

for i = 1 : sample_num
    start_frame = floor((i - 1) * stride) + 1;
    end_frame = start_frame - 1 + clip_length;

    clip = zeros(resize_width, resize_height, clip_length, 3, 1);
    for j = start_frame : end_frame
        img = read(video_obj, j);
        img = my_imresize(img, resize_height, resize_width);
        img = permute(img, [2, 1, 3]);
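        img = img(:, :, [3, 2, 1]);  % RGB -> BGR
        img = single(img);           % cast so the subtraction cannot saturate at 0 for uint8 frames
        img(:, :, 1) = img(:, :, 1) - 104;  % subtract per-channel mean (B, G, R)
        img(:, :, 2) = img(:, :, 2) - 117;
        img(:, :, 3) = img(:, :, 3) - 123;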
        clip(:, :, j - start_frame + 1, :, 1) = reshape(img, [resize_width, resize_height, 1, 3, 1]);
    end

    s_h = floor((resize_height - crop_size) / 2);
    e_h = s_h + crop_size - 1;
    s_w = floor((resize_width - crop_size) / 2);
    e_w = s_w + crop_size - 1;
    clip = clip(s_w : e_w, s_h : e_h, :, :, :);

    input_blob_vec = {clip};
    net.forward(input_blob_vec);
    for k = 1 : length(blob_names)
        data = net.blob_vec(net.name2blob_index(blob_names{k})).get_data();
        if (i == 1)
            features{k} = data(:)';
        else
            features{k} = features{k} + data(:)';
        end
    end
end

for k = 1 : length(blob_names)
    features{k} = features{k} / sample_num;
end
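
For readers working outside MATLAB, the sampling and preprocessing arithmetic of the function above can be sketched in NumPy. This is only an illustration of the index math, not the author's code: the names `clip_ranges` and `preprocess_frame` are hypothetical, frames are assumed to be already decoded and resized into an H x W x 3 RGB array, and indices are 0-based here, unlike the 1-based MATLAB listing.

```python
import math
import numpy as np

CLIP_LENGTH = 16
CROP_SIZE = 160
NUM_SAMPLES = 20  # max_sample and min_sample are both 20 in the MATLAB code
BGR_MEAN = np.array([104.0, 117.0, 123.0], dtype=np.float32)


def clip_ranges(duration, clip_length=CLIP_LENGTH, num_samples=NUM_SAMPLES):
    """Return 0-based (start, end) frame ranges with an exclusive end."""
    stride = (duration - clip_length - 2) / num_samples
    ranges = []
    for i in range(num_samples):
        start = math.floor(i * stride)
        ranges.append((start, start + clip_length))
    return ranges


def preprocess_frame(rgb, crop_size=CROP_SIZE):
    """RGB uint8 H x W x 3 -> mean-subtracted BGR float32 center crop."""
    # Reverse the channel axis (RGB -> BGR) and cast before subtracting,
    # so negative values are kept instead of being clipped or wrapped.
    bgr = rgb[:, :, ::-1].astype(np.float32) - BGR_MEAN
    h, w = bgr.shape[:2]
    # Centered crop; it may be offset by one pixel from the MATLAB indices.
    top = (h - crop_size) // 2
    left = (w - crop_size) // 2
    return bgr[top:top + crop_size, left:left + crop_size, :]
```

Because `max_sample` and `min_sample` are both 20, every video yields 20 clips regardless of its length, and the per-clip features are averaged at the end. Videos shorter than `clip_length + 2` frames give a negative stride in both versions, so they need separate handling.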

bjtuweb-12283027 commented 6 years ago

Sorry to bother you again. The following error occurred after I added the layers and proto parameters:

CXX src/caffe/layers/bn_layer.cpp
src/caffe/layers/bn_layer.cpp: In instantiation of 'void caffe::BNLayer<Dtype>::LayerSetUp(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&) [with Dtype = float]':
src/caffe/layers/bn_layer.cpp:347:2: required from here
src/caffe/layers/bn_layer.cpp:12:11: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  frozen_ = this->layer_param_.bn_param().frozen();
  ^
src/caffe/layers/bn_layer.cpp:13:16: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  bn_momentum_ = this->layer_param_.bn_param().momentum();
  ^
src/caffe/layers/bn_layer.cpp:14:11: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  bn_eps_ = this->layer_param_.bn_param().eps();
  ^
src/caffe/layers/bn_layer.cpp:28:60: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  shared_ptr<Filler<Dtype> > slope_filler(GetFiller<Dtype>(
  ^
src/caffe/layers/bn_layer.cpp:33:59: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  shared_ptr<Filler<Dtype> > bias_filler(GetFiller<Dtype>(
  ^
src/caffe/layers/bn_layer.cpp: In instantiation of 'void caffe::BNLayer<Dtype>::LayerSetUp(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&) [with Dtype = double]':
src/caffe/layers/bn_layer.cpp:347:2: required from here
src/caffe/layers/bn_layer.cpp:12:11: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  frozen_ = this->layer_param_.bn_param().frozen();
  ^
src/caffe/layers/bn_layer.cpp:13:16: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  bn_momentum_ = this->layer_param_.bn_param().momentum();
  ^
src/caffe/layers/bn_layer.cpp:14:11: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  bn_eps_ = this->layer_param_.bn_param().eps();
  ^
src/caffe/layers/bn_layer.cpp:28:60: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  shared_ptr<Filler<Dtype> > slope_filler(GetFiller<Dtype>(
  ^
src/caffe/layers/bn_layer.cpp:33:59: error: 'class caffe::LayerParameter' has no member named 'bn_param'
  shared_ptr<Filler<Dtype> > bias_filler(GetFiller<Dtype>(
  ^
make: *** [.build_release/src/caffe/layers/bn_layer.o] Error 1

ZhaofanQiu commented 6 years ago

Hi, it seems that bn_param cannot be found in LayerParameter. Make sure that you have added the BN parameter (bn_param) to LayerParameter in "caffe.proto", and then rebuild so that the generated protobuf code is updated. Best.

bjtuweb-12283027 commented 6 years ago

That solved it. Thank you very much!

bjtuweb-12283027 commented 6 years ago

Did you define the my_imresize function?

ZhaofanQiu commented 6 years ago

Sorry, I forgot to explain this function. It is a simple implementation of "imresize", written to avoid depending on the MATLAB Image Processing Toolbox. You can replace that line with "img = imresize(img, [resize_height, resize_width]);"

Best.
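
The body of `my_imresize` is not shown in this thread, so its interpolation method is unknown. As a rough stand-in for readers without an image toolbox, the sketch below computes the resized dimensions the way the extraction code does and resizes with nearest-neighbor sampling. Both function names are hypothetical, and nearest-neighbor is an assumption rather than the author's method, so features may differ slightly from those obtained with `imresize`.

```python
import math
import numpy as np

MIN_EDGE = 162


def resized_shape(height, width, min_edge=MIN_EDGE):
    """Scale so that the shorter side becomes min_edge, rounding up."""
    short = min(height, width)
    # Multiply before dividing so exact ratios do not pick up float error.
    return (math.ceil(height * min_edge / short),
            math.ceil(width * min_edge / short))


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize of an H x W (x C) array (assumed method)."""
    in_h, in_w = img.shape[:2]
    rows = (np.arange(out_h) * in_h) // out_h  # source row per output row
    cols = (np.arange(out_w) * in_w) // out_w  # source column per output column
    return img[rows[:, None], cols[None, :]]
```

For a 240 x 320 frame, `resized_shape` gives 162 x 216, which leaves room for the 160 x 160 center crop used by the extraction code.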

bjtuweb-12283027 commented 6 years ago

Thank you. It works with the p3d_resnet_kinetics_iter_190000 model. But how do I use it with p3d_resnet_kinetics_flow_iter_284000?

ZhaofanQiu commented 6 years ago

Hi, you cannot use this code for optical flow, since you need to extract the optical flow images first. We use the GPU implementation of TV-L1 optical flow in OpenCV; you can refer to https://github.com/wanglimin/dense_flow . After that, we combine the optical flow in the x and y directions into a two-channel image. For each clip, 16 consecutive two-channel images are used as input.
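
The stacking step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the x and y flow images are assumed to be already extracted and loaded as lists of H x W arrays, the function name `stack_flow_clip` is hypothetical, and the axis order, value scaling, and mean subtraction expected by the flow model are not stated in this thread, so they would need to be checked against the model's prototxt.

```python
import numpy as np

CLIP_LENGTH = 16


def stack_flow_clip(flow_x, flow_y, start, clip_length=CLIP_LENGTH):
    """Build one clip of shape (clip_length, H, W, 2) from flow images.

    flow_x, flow_y: sequences of H x W arrays holding the x and y
    components of the optical flow for consecutive frames.
    """
    if len(flow_x) != len(flow_y):
        raise ValueError("flow_x and flow_y must have the same length")
    if start < 0 or start + clip_length > len(flow_x):
        raise ValueError("clip range falls outside the flow sequence")
    frames = [
        # Channel 0 is the x component, channel 1 the y component.
        np.stack([flow_x[t], flow_y[t]], axis=-1)
        for t in range(start, start + clip_length)
    ]
    return np.stack(frames, axis=0).astype(np.float32)
```

The resulting array would still need the same resize, crop, and axis permutation as the RGB clips before being passed to the network.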

tingtinh commented 6 years ago

@ZhaofanQiu Could you share the code for extracting features from the optical flow images? Thank you.