Hi, I have the same problem, any suggestions? Thank you!
Has anyone done the pre-processing in Python?
@YantianZha @frajem @GerardoHH Hi. I am sorry for all the delay. I was very busy with my thesis and graduation. I am no longer at the University of Toronto but will try to reply regularly here.
Hi, I am using the h5py package to save the deep features extracted from GoogLeNet. The file size of my training data set is about 39 GB while yours is about 8.6 GB. Could you please point out possible reasons for such a discrepancy? BR.
If I were you, I would just go back to using the matlab code. BTW: I'm also using h5py, and have been facing the same issue.
@YantianZha I think I have found the root cause of this discrepancy. In case you use h5py, please check this website. You can set the compression parameter to gzip, and the file size will drop dramatically. Matlab presumably applies some compression by default.
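Something like this, for example (a minimal h5py sketch; the dataset name and feature shapes are just placeholders, not the exact layout the scripts produce):

```python
import h5py
import numpy as np

# Placeholder: per-frame GoogLeNet conv features, e.g. 7*7 locations x 1024 dims
features = np.random.rand(1000, 49, 1024).astype('float32')

with h5py.File('train_features.h5', 'w') as f:
    # Stored raw this array is ~200 MB; gzip (levels 0-9) can shrink
    # redundant feature matrices a lot, at some read-speed cost.
    f.create_dataset('features', data=features,
                     compression='gzip', compression_opts=4)
```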
@kracwarlock
Hi, thank you for the matlab scripts (I had to modify them a little bit). I finally generated the h5 files for UCF11. Now, when I run the script: "THEANO_FLAGS='floatX=float32,device=gpu0,mode=FAST_RUN,nvcc.fastmath=True' python -m scripts.evaluate_ucf11"
I got the error:
src/actrec.py:744: FutureWarning: comparison to None will result in an elementwise object comparison in the future.
  if x == None:
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/caffe/action-recognition-visual-attention-master/scripts/evaluate_ucf11.py", line 87, in
Backtrace when the node is created:
  File "src/actrec.py", line 415, in build_model
    cost = -tensor.log(probs[tensor.arange(n_timesteps*n_samples), tmp] + 1e-8)
I think the error is related to the features stored in the h5 file.
Any suggestions?
Thank you for your help!
@GerardoHH This error occurs because, for some variable x, the code is trying to access x[11] while x is only of size 11 (x[0] to x[10]). Most probably you have n_actions or some other hyperparameter set incorrectly (smaller than its actual value for your dataset) in your evaluate_ucf11.py file.
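A toy numpy illustration of the failure mode (sizes made up):

```python
import numpy as np

probs = np.random.rand(11)   # scores for 11 classes: probs[0] .. probs[10]
label = 11                   # a label produced with n_actions set too small
probs[label]                 # IndexError: index 11 is out of bounds for axis 0 with size 11
```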
@kracwarlock
Hi, thank you for your reply. It was my fault in generating the labels file. Now the script is running; I suppose it will take a day to finish. I'll let you know my results.
Thank you for all your help. And by the way, I'd like to read your thesis; I'm doing my Ph.D. and I need state-of-the-art documentation XD.
@GerardoHH That's good news and you are welcome :)
Also, my thesis is available at http://www.cs.toronto.edu/~shikhar/publications/msc-thesis.pdf. It is not state of the art but it is what it is. Attention models still have a long way to go.
@kracwarlock @GerardoHH Hi, I downloaded the matlab script and tried to generate feature data with it, but ran into some problems.
How do I get the model_def_file and model_file? The default model_def_file is '/u/yukun/Projects/RCNN/caffe/examples/GoogleLeNet/forward_googlenet_outputconv.prototxt' and the default model_file is '/u/yukun/Projects/RCNN/caffe/examples/GoogleLeNet/imagenet_googlenet.caffemodel', but I can't find these two files in my caffe directory.
So I tried to use $caffe_root/models/bvlc_googlenet/deploy.prototxt and bvlc_googlenet.caffemodel, but when I run the matlab script, I get this error: input data/diff size does not match target blob shape, input data/diff size: [ 224 224 3 128 ] vs target blob shape: [ 224 224 3 10 ].
It seems that I used the wrong model files, but where can I get the proper ones? Any help would be greatly appreciated.
@Litchiware
I faced the same problem; just match the shape of the blobs to [224 224 3 10].
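If you ever switch to the Python interface, another option is to resize the net to your batch instead of shrinking the batch (a rough pycaffe sketch; note pycaffe orders dims as [N C H W], while the matlab error prints [W H C N]):

```python
import caffe

caffe.set_mode_gpu()
net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)

# deploy.prototxt declares a batch of 10; resize the input blob to the
# batch size the script actually feeds (here 128) and re-propagate shapes.
net.blobs['data'].reshape(128, 3, 224, 224)
net.reshape()
```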
Hi @kracwarlock
I finished the training and testing on UCF-11 using a 30% - 30% - 30% split for training, validation and testing. My results are: Accuracy: Train 1.0, Valid 0.930124223602, Test 0.957115009747.
I think it's OK. XD Thank you for your help.
@Litchiware I think Yukun used the Princeton model (http://vision.princeton.edu/pvt/GoogLeNet/ImageNet/). Try with @GerardoHH's change.
@GerardoHH That's great :)
Hi @GerardoHH Can you share all the scripts needed to generate the h5 file? Thanks.
Hi @calmevtime
Here are the original scripts from @kracwarlock:
https://github.com/kracwarlock/action-recognition-visual-attention/issues/6
Find the answer from Apr 17, and good luck XD
Hi @GerardoHH If I am using kracwarlock's matlab code to extract features, how should I modify Princeton's train_val_googlenet.prototxt to extract the convolutional features, like forward_googlenet_outputconv.prototxt does? I have tried erasing everything from the cls1_pool layer onward, but then the output scores dim is 1 1 1000 10, which is absolutely wrong. Thanks.
Hi @ae86208
You should not modify the .prototxt file; just instantiate the net in TEST mode and find the last convolutional layer, of shape (7, 7, 1024). It should be the one before the first FC layer; those are the features used for the LSTM.
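In pycaffe it would look roughly like this (the blob name 'inception_5b/output' is my guess for the last 7x7x1024 layer in the BVLC GoogLeNet; check net.blobs.keys() in your own model):

```python
import caffe
import numpy as np

# TEST mode, deploy prototxt left untouched
net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)

batch = np.zeros((10, 3, 224, 224), dtype=np.float32)  # placeholder: mean-subtracted frames
net.blobs['data'].data[...] = batch
net.forward()

# Last inception output before pooling/classifier: shape (N, 1024, 7, 7)
feats = net.blobs['inception_5b/output'].data
feats = feats.reshape(feats.shape[0], 1024, 49).transpose(0, 2, 1)  # (N, 49, 1024)
```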
Thanks a lot @GerardoHH .
Hi @GerardoHH, how much time does the code spend on one batch in your training? I get about 60 s per batch (I use 128 samples per batch), which I think is too slow. I'd like to know what times others are getting, to see if I can optimize the code.
Thank you :)
Hi @GerardoHH @ae86208 I don't quite understand how to instantiate the net in TEST mode, or where to add or modify something. In the .prototxt files, the data layer in phase: TEST has batch_size: 32, crop_size: 224, mean_file: "imagenet_mean.binaryproto", while the extracting_feature.m file seems to use a batch size of 4, because of imseq = cell(1,size(vidFrames,4)), and the mean file ilsvrc_2012_mean.mat. I'm also confused about whether, if we don't delete the original output items in the prototxt, the last conv layer will still match FeatDim = 7 * 7 * 1024 in extracting_feature.m. Sorry, I'm a caffe newbie.
@kracwarlock Thanks for your fantastic work. When I downloaded extracting_feature.m, I was confused by bID and had no idea what to do with this code; could you please give some more detailed information? Best regards :)
Hi, @GerardoHH Can you share the code used to combine the features into h5 format? The link he sent me for combining the individual files generated by this script (https://gist.github.com/kracwarlock/96499936487d6125dd010319669c6648) is not available. Thanks!
Hi,
I think I need to prepare four preprocessed files (https://github.com/kracwarlock/action-recognition-visual-attention/tree/master/util). That said, I'm confused about how to get "train_features.h5".
Could you please share the related code that does this? I would appreciate it even more if you could share all of the code that does the preprocessing.
Thanks again!
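For reference, here is roughly what I imagine such a merge script looks like (a rough sketch, assuming one .h5 file per video, each holding a 'features' dataset; all paths and names are my placeholders, not from the original gist):

```python
import glob
import h5py
import numpy as np

# Placeholder layout: one small .h5 per video, each with a 'features' dataset
files = sorted(glob.glob('features/train/*.h5'))

chunks = []
for path in files:
    with h5py.File(path, 'r') as f:
        chunks.append(f['features'][:])   # e.g. (n_frames, 49, 1024) per video

# Stack every video's frame features into one array and write a single file
all_feats = np.concatenate(chunks, axis=0)
with h5py.File('train_features.h5', 'w') as out:
    out.create_dataset('features', data=all_feats, compression='gzip')
```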