Open bjtuweb-12283027 opened 6 years ago
Hi, I am currently writing guidelines for applying P3D in different scenarios. For a quick look at how P3D is used, here is the Matlab code I use for feature extraction; you can format your input the same way this code does.
function features = extract_feature_single_video(net, video_path, blob_names)
% Extract features from sampled clips of one video and average them over clips.
% net: Caffe network object
% video_path: path to the original video file
% blob_names: n-by-1 cell array of blob names to extract
clip_length = 16;   % frames per clip
min_edge = 162;     % length of the shorter side after resizing
crop_size = 160;    % side length of the center crop
max_sample = 20;
min_sample = 20;    % equal to max_sample, so sample_num is always 20
video_obj = VideoReader(video_path);
duration = video_obj.NumberOfFrames;
height = video_obj.Height;
width = video_obj.Width;
% Resize so that the shorter side becomes min_edge, keeping the aspect ratio.
ratio = min_edge / min(height, width);
resize_height = ceil(height * ratio);
resize_width = ceil(width * ratio);
sample_num = floor(duration / clip_length);
sample_num = min(sample_num, max_sample);
sample_num = max(sample_num, min_sample);
% Assumes duration >= clip_length + 2; shorter videos give an invalid start_frame.
stride = (duration - clip_length - 2) / sample_num;
features = cell(1, length(blob_names));
for i = 1 : sample_num
    start_frame = floor((i - 1) * stride) + 1;
    end_frame = start_frame - 1 + clip_length;
    clip = zeros(resize_width, resize_height, clip_length, 3, 1);
    for j = start_frame : end_frame
        img = read(video_obj, j);
        img = my_imresize(img, resize_height, resize_width);
        img = double(img);              % avoid uint8 saturation in the mean subtraction
        img = permute(img, [2, 1, 3]);  % height x width -> width x height
        img = img(:, :, [3, 2, 1]);     % RGB -> BGR
        % Subtract the per-channel mean (B, G, R).
        img(:, :, 1) = img(:, :, 1) - 104;
        img(:, :, 2) = img(:, :, 2) - 117;
        img(:, :, 3) = img(:, :, 3) - 123;
        clip(:, :, j - start_frame + 1, :, 1) = reshape(img, [resize_width, resize_height, 1, 3, 1]);
    end
    % Center crop of crop_size x crop_size.
    s_h = floor((resize_height - crop_size) / 2);
    e_h = s_h + crop_size - 1;
    s_w = floor((resize_width - crop_size) / 2);
    e_w = s_w + crop_size - 1;
    clip = clip(s_w : e_w, s_h : e_h, :, :, :);
    input_blob_vec = {clip};
    net.forward(input_blob_vec);
    % Accumulate each requested blob as a row vector.
    for k = 1 : length(blob_names)
        data = net.blob_vec(net.name2blob_index(blob_names{k})).get_data();
        if (i == 1)
            features{k} = data(:)';
        else
            features{k} = features{k} + data(:)';
        end
    end
end
% Average the accumulated features over all sampled clips.
for k = 1 : length(blob_names)
    features{k} = features{k} / sample_num;
end
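For readers who want to check the sampling arithmetic outside Matlab, here is a minimal Python sketch of how the code above picks clip boundaries and center-crop bounds. It only mirrors the index computations; the function names clip_ranges and center_crop_bounds are hypothetical and not part of the P3D release, and no video decoding or network call is included.

```python
import math


def clip_ranges(duration, clip_length=16, max_sample=20, min_sample=20):
    """Return 1-based (start_frame, end_frame) pairs, mirroring the Matlab arithmetic."""
    sample_num = duration // clip_length
    sample_num = min(sample_num, max_sample)
    sample_num = max(sample_num, min_sample)  # with 20/20 this is always 20
    stride = (duration - clip_length - 2) / sample_num
    ranges = []
    for i in range(1, sample_num + 1):
        start_frame = math.floor((i - 1) * stride) + 1
        end_frame = start_frame - 1 + clip_length
        ranges.append((start_frame, end_frame))
    return ranges


def center_crop_bounds(size, crop_size=160):
    """Return the (start, end) bounds used for the center crop along one axis."""
    start = (size - crop_size) // 2
    return start, start + crop_size - 1


if __name__ == "__main__":
    ranges = clip_ranges(300)
    print(len(ranges), ranges[0], ranges[-1])
    print(center_crop_bounds(162))
```

For a 300-frame video this gives 20 overlapping 16-frame clips spread over the whole video. Note that for videos shorter than clip_length + 2 = 18 frames the stride becomes negative and later clips get a start frame below 1, so such videos would need separate handling (for example, padding or repeating frames).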
Sorry to bother you again. The following error occurred after I added the layers and proto parameters:
CXX src/caffe/layers/bn_layer.cpp
src/caffe/layers/bn_layer.cpp: In instantiation of 'void caffe::BNLayer
Hi, it seems that bn_param cannot be found in LayerParameter. Make sure that you have added BNparameter to your LayerParameter in "caffe.proto". Best.
It is solved. Thank you very much!
Did you define the my_imresize function?
Sorry, I forgot to explain this function. It is a simple implementation of "imresize" that avoids depending on the Matlab Image Processing Toolbox. You can replace that line with "img = imresize(img, [resize_height, resize_width]);"
Best.
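The thread does not say which interpolation my_imresize uses, so the following is only an illustrative sketch of what a toolbox-free resize can look like. It uses nearest-neighbor sampling, which is an assumption and may differ from the author's implementation; resize_nearest is a hypothetical name.

```python
def resize_nearest(img, out_h, out_w):
    """Resize an image given as a list of rows (each row a list of pixels).

    Nearest-neighbor sampling: every output pixel copies the source pixel
    whose index is the output index scaled back to the input size.
    """
    in_h, in_w = len(img), len(img[0])
    out = []
    for r in range(out_h):
        src_r = min(in_h - 1, (r * in_h) // out_h)
        row = []
        for c in range(out_w):
            src_c = min(in_w - 1, (c * in_w) // out_w)
            row.append(img[src_r][src_c])
        out.append(row)
    return out


if __name__ == "__main__":
    small = [[1, 2], [3, 4]]
    for row in resize_nearest(small, 4, 4):
        print(row)
```

Nearest-neighbor output is blockier than what bilinear or bicubic interpolation produces, so extracted features may differ slightly depending on which resize is used.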
Thank you. It works with the p3d_resnet_kinetics_iter_190000 model. But how do I use it with p3d_resnet_kinetics_flow_iter_284000?
Hi, for optical flow you cannot use this code directly, since you need to extract the optical flow images first. We use the GPU implementation of TVL1 optical flow in OpenCV; you can refer to https://github.com/wanglimin/dense_flow . After that, we combine the optical flow in the x and y directions into a two-channel image. For each clip, 16 consecutive two-channel images are used as input.
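As a rough illustration of the stacking step described above, here is a minimal Python sketch that merges per-frame x and y flow images into two-channel frames and collects 16 consecutive ones into a clip. The function names, the x-then-y channel order, and the 0-based start index are assumptions; the thread does not specify them, nor any mean subtraction or scaling applied to the flow values.

```python
def merge_flow(flow_x, flow_y):
    """Combine two single-channel H x W frames into one H x W x 2 frame."""
    if len(flow_x) != len(flow_y):
        raise ValueError("flow_x and flow_y must have the same height")
    return [
        [[fx, fy] for fx, fy in zip(row_x, row_y)]
        for row_x, row_y in zip(flow_x, flow_y)
    ]


def build_flow_clip(flows_x, flows_y, start, clip_length=16):
    """Stack clip_length consecutive two-channel flow frames.

    flows_x, flows_y: lists of H x W frames, one per time step.
    start: 0-based index of the first frame in the clip.
    """
    available = min(len(flows_x), len(flows_y))
    if start < 0 or start + clip_length > available:
        raise ValueError("not enough flow frames for this clip")
    return [
        merge_flow(flows_x[t], flows_y[t])
        for t in range(start, start + clip_length)
    ]


if __name__ == "__main__":
    # Toy data: 20 frames of 2 x 3 flow, x-flow = t, y-flow = -t.
    xs = [[[t] * 3 for _ in range(2)] for t in range(20)]
    ys = [[[-t] * 3 for _ in range(2)] for t in range(20)]
    clip = build_flow_clip(xs, ys, start=2)
    print(len(clip), len(clip[0]), len(clip[0][0]), clip[0][0][0])
```

The resulting clip is indexed as clip[t][row][col][channel]; it would still need to be rearranged into whatever blob layout the flow network expects before being passed to Caffe.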
@ZhaofanQiu Could you share the code for extracting features from the optical flow images? Thank you.
Hi, I am learning how to extract video features with your proposed P3D. What is the input format? Could you share more details, such as usage? Thank you very much.