Thanks for making this interesting project open-source. I am trying to replicate the work discussed in the paper. However, training on the Hollywood2 and UCF11 data sets does not converge. I suspect that something is wrong with the extracted features.
I use the Python interface of Caffe to extract the features from the "inception_5b/output" layer of GoogLeNet. The shape of the features is (1024, 7, 7). According to other forum posts, the shape should be (7, 7, 1024), so I have swapped the axes of the features accordingly. Is this a difference between the MATLAB interface and the Python interface?
Among the 1024 feature maps, approximately 35% consist only of zeros. Is this normal?
In the MATLAB script, how do you specify the name of the feature layer that you intend to use, such as "inception_5b/output"? The script simply calls scores = caffe('forward', {input_data{i}});.
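To make the two Python-side questions concrete, here is a minimal sketch of the axis swap and of how the share of all-zero maps can be measured. It uses only NumPy, and everything in it is an assumption for illustration: the random array stands in for the real blob (which would come from something like net.blobs['inception_5b/output'].data[0]), the helper names are hypothetical, and the zeroed slice only mimics the roughly 35% mentioned above.

```python
import numpy as np


def to_height_width_channels(feat_chw):
    """Reorder a (C, H, W) feature blob to (H, W, C) without changing values."""
    # transpose permutes the axes; a plain reshape would keep the memory
    # order and mix values from different feature maps together.
    return np.transpose(feat_chw, (1, 2, 0))


def zero_map_fraction(feat_chw):
    """Return the fraction of channels whose H x W map is entirely zero."""
    flat = feat_chw.reshape(feat_chw.shape[0], -1)
    return float(np.mean(~flat.any(axis=1)))


# Stand-in data, NOT real features: non-negative random values shaped
# like the (1024, 7, 7) blob described above.
rng = np.random.default_rng(0)
feat = np.maximum(rng.standard_normal((1024, 7, 7)), 0).astype(np.float32)
feat[:358] = 0.0  # hypothetical: zero out about 35% of the maps

feat_hwc = to_height_width_channels(feat)
print(feat_hwc.shape)           # shape after the axis swap
print(zero_map_fraction(feat))  # share of maps that contain only zeros
```

If the swap were done with reshape instead of transpose, the shape would still be (7, 7, 1024) but the values would no longer line up with their original feature maps, so that is one thing worth ruling out.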
Any help would be greatly appreciated :-)