nationalflag opened 7 years ago
Good point. I will provide the models trained on UCF101 split 1 for your reference.
Thanks! I'm really looking forward to it!
@yjxiong Hi, thank you for your wonderful work! Could you tell me where to download the pretrained PyTorch models?
@yjxiong Hi, Xiong. I also need the pretrained PyTorch models. Thanks so much!
We don't have all the pretrained UCF101/HMDB51 models in PyTorch format for download. I can convert the Caffe pretrained models in the original TSN codebase to PyTorch format in the next few weeks.
I have finished a script for converting the caffemodels to PyTorch. When I use the converted PyTorch model with the tsn-pytorch code, I get the same results as the paper. If you need the PyTorch models right away, please see https://github.com/gss-ucas/caffe2pytorch-tsn
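For reference, the general idea behind such a conversion is to read each layer's blobs with pycaffe and write them into a PyTorch state dict. Below is a minimal sketch only (the name mapping is illustrative, BatchNorm layers need extra handling for their running statistics, and this is not the exact script in that repository):

```python
# Sketch: copy Caffe layer blobs into a PyTorch state dict.
# Assumes pycaffe is installed; file names and key renaming are illustrative.
import caffe
import torch

net = caffe.Net('tsn_deploy.prototxt', 'tsn_rgb.caffemodel', caffe.TEST)

state_dict = {}
for layer_name, blobs in net.params.items():
    # blobs[0] holds weights, blobs[1] (if present) holds biases.
    pt_name = layer_name.replace('/', '_')   # illustrative renaming only
    state_dict[pt_name + '.weight'] = torch.from_numpy(blobs[0].data.copy())
    if len(blobs) > 1:
        state_dict[pt_name + '.bias'] = torch.from_numpy(blobs[1].data.copy())

torch.save(state_dict, 'tsn_rgb_converted.pth')
```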
@gss-ucas Thanks
@yjxiong Thanks for your efforts. I recently parsed your pretrained caffemodel (UCF101 split 1) into TensorFlow with Google protobuf, and I constructed the RGB stream in TensorFlow with every layer identical to the one in your Caffe train-val protocol. However, the accuracy is still 10% lower than the Caffe version. (The padding strategy differs between Caffe and TF, but I handled that by padding manually in TF.) I'm wondering whether there are any other details I should take care of. Thanks!
BTW, @gss-ucas it seems that max pooling with floor mode is used in your implementation, which is not consistent with the Caffe version. Strange that you can still reproduce the results lol
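(For context, Caffe rounds pooling output sizes up by default while PyTorch rounds down, so a faithful port usually needs ceil mode enabled. A one-line sketch of the corresponding PyTorch layer, assuming a typical BN-Inception 3x3/stride-2 pooling:)

```python
import torch.nn as nn

# Caffe pooling uses ceil rounding by default; PyTorch defaults to floor.
pool = nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)
```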
@scenarios what data augmentations are you using? Is your model BN-Inception (+TSN)? If so, are you enabling partial BN when finetuning?
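(For context, partial BN in TSN freezes the statistics and affine parameters of every BatchNorm layer except the first one during fine-tuning. A rough sketch of the idea, not the repo's exact code:)

```python
import torch.nn as nn

def freeze_partial_bn(model):
    """TSN-style partial BN: keep only the first BN layer updating its
    running statistics and affine parameters; freeze all the others."""
    bn_count = 0
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            bn_count += 1
            if bn_count > 1:            # leave the first BN layer trainable
                m.eval()                # stop updating running mean/var
                m.weight.requires_grad = False
                m.bias.requires_grad = False
```

This would be called after `model.train()` each epoch so the frozen layers stay in eval mode.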
@victorhcm Actually I'm not finetuning. I simply initialize my BN-Inception model in TensorFlow with the parameters released by TSN (a caffemodel, which I parsed using protobuf) and run the test directly without any training. At the very beginning, the accuracy was only around 0.6 with 3 segments. Then I realized there is a slight difference between Caffe and TensorFlow in padding (both for convolution and pooling). After modifying the padding, the accuracy increased to 71%, which is still 10% lower than the Caffe results. (I tested the same model with the TSN home-brewed Caffe and got 0.81 accuracy with 3 segments.) I have double-checked every layer and they are definitely identical to the model definition in the train_val protocol in TSN (or there must be errors when loading the parameters). Still confused why..
@victorhcm And for testing, I resize the frame to 256x340 and take a 224x224 center crop. The mean is subtracted, and I also convert the RGB image to BGR format for consistency.
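For comparison, the test-time preprocessing described above can be written out explicitly. A minimal sketch in Python, assuming frames are loaded as RGB uint8 arrays (not the exact code used in either framework):

```python
import numpy as np
import cv2

MEAN_BGR = np.array([104.0, 117.0, 123.0], dtype=np.float32)

def preprocess(frame_rgb):
    """Resize to 256x340, take a 224x224 center crop, convert RGB->BGR,
    and subtract the BN-Inception channel means (no std scaling)."""
    img = cv2.resize(frame_rgb, (340, 256))          # dsize is (width, height)
    h0 = (256 - 224) // 2
    w0 = (340 - 224) // 2
    crop = img[h0:h0 + 224, w0:w0 + 224]
    bgr = crop[:, :, ::-1].astype(np.float32)        # RGB -> BGR
    return bgr - MEAN_BGR
```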
@yjxiong I used the default commands you provided to train my RGBDiff, RGB, and Flow models with tsn-pytorch. Could you please tell me whether they were initialized from ImageNet?
@JiqiangZhou Yes.
@yjxiong thank you! I will read the code carefully.
I ran the default command and obtained slightly lower performance than the paper (85.12 vs. 85.7) on the RGB stream. The model is available at http://pan.baidu.com/s/1eSvo8BS . Hope it helps.
@yjxiong I also trained the models using the default settings on split 1 of the UCF101 dataset and am getting lower performance than reported. Below are the numbers I got:
RGB - 85.57%
Flow - 84.26%
RGB + Flow - 90.44%
The main difference seems to come from the Flow stream; what could be the reason for this?
@utsavgarg Please follow the instructions in Caffe TSN to extract optical flow.
@yjxiong I had done that. I extracted optical flow using the extract_optical_flow.sh script included with Caffe TSN, but I used extract_cpu instead of extract_gpu in build_of.py. Will that cause this difference in performance?
Yes, that's the problem. Always use extract_gpu. The optical flow algorithm you need is not available in extract_cpu.
@utsavgarg extract_gpu uses TV-L1 optical flow, which leads to better accuracy. I ran an experiment to confirm that.
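For anyone without the GPU tool, TV-L1 flow can also be computed with OpenCV's contrib module. This is only a rough sketch under assumptions: it is much slower than the denseFlow GPU tool, not guaranteed to be numerically identical to it, and the quantization below only approximates what the TSN tools write out (bound of 20 is the common default):

```python
import cv2
import numpy as np

# Requires opencv-contrib-python for the optflow module.
tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()

def flow_to_images(prev_gray, next_gray, bound=20.0):
    """Compute TV-L1 flow and quantize the x/y components to 0-255."""
    flow = tvl1.calc(prev_gray, next_gray, None)
    flow = np.clip(flow, -bound, bound)
    flow = ((flow + bound) / (2 * bound) * 255).astype(np.uint8)
    return flow[:, :, 0], flow[:, :, 1]
```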
Okay thanks.
Did anyone get the pytorch pretrained models on ActivityNet or Kinetics? @yjxiong @nationalflag @gss-ucas @ntuyt @scenarios
Hello, are there any available models trained on UCF101?
Thanks!
@Ivorra please see https://github.com/gss-ucas/caffe2pytorch-tsn
@gss-ucas Thanks for kindly answering. I am having trouble with "test_models.py": after loading your model "ucf101_rgb.pth" it seems that some keys do not exist. For example, with the command:
python test_models.py ucf101 RGB <ucf101_rgb_val_list> ucf101_rgb.pth --arch BNInception --save_scores <score_file_name>
I get the error:
/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py:514: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
own_state[name].copy_(param)
Traceback (most recent call last):
File "test_models.py", line 54, in
And if I comment out that line, another key error appears:
/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py:514: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
own_state[name].copy_(param)
Traceback (most recent call last):
File "test_models.py", line 56, in
Thanks!
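For anyone hitting similar key mismatches when loading a converted checkpoint, a common workaround (a sketch only, not the fix that was applied here) is to filter the incoming state dict down to keys that exist in the model with matching shapes; if the file stores a dict with a 'state_dict' entry, index into that first:

```python
import torch

def load_filtered(model, checkpoint_path):
    """Load only the parameters whose names and shapes match the model."""
    incoming = torch.load(checkpoint_path, map_location='cpu')
    own = model.state_dict()
    matched = {k: v for k, v in incoming.items()
               if k in own and own[k].shape == v.shape}
    skipped = set(incoming) - set(matched)
    own.update(matched)
    model.load_state_dict(own)
    print('loaded %d tensors, skipped %d' % (len(matched), len(skipped)))
```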
I wonder how to fuse the scores of the two streams. Who can help me? Thanks.
@gss-ucas Alright, I had not noticed the test_models.py file in your repository. I just substituted it with the one from this repository. Moreover, the accuracy for UCF101 split 1 matches the reference at the official project page, given your provided converted models:
Thanks!
@poppingcode I followed the instructions in the original TSN repository. I just copied the eval_scores.py file provided there, together with the pyActionRecog folder, which is needed as a dependency.
Hope it works fine for you!
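For reference, the fusion itself is just a weighted average of the per-class scores from the two streams (the TSN paper typically weights flow higher than RGB, e.g. 1:1.5). A minimal sketch, assuming you already have per-video score arrays of shape [num_videos, num_classes] and a label vector (the loading of the saved score files is omitted):

```python
import numpy as np

def fuse_scores(rgb_scores, flow_scores, labels, w_rgb=1.0, w_flow=1.5):
    """Late fusion: weighted sum of per-class scores, then top-1 accuracy."""
    fused = w_rgb * rgb_scores + w_flow * flow_scores
    pred = fused.argmax(axis=1)
    return (pred == labels).mean()
```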
@gss-ucas Hi, did you manage to convert kinetics pretrained models (as shown at http://yjxiong.me/others/kinetics_action/ ) to pytorch too?
Maybe this Caffe --> PyTorch model converter is worth looking at:
@Ivorra Installing the right version of caffe has been a pain. Wondering if @yjxiong , @gss-ucas can provide the kinetics pretrained models for pytorch?
@SinghGauravKumar You can follow the instructions to convert them yourself, as in https://github.com/gss-ucas/caffe2pytorch-tsn
Modifying some of the code should make it work.
@yjxiong is there any pre-trained model on HMDB51 dataset?
@sipan17 Please see https://github.com/shuangshuangguo/caffe2pytorch-tsn
@shuangshuangguo thank you, will try that.
I ran the default command and obtained slightly lower performance than the paper (85.12 vs. 85.7) on the RGB stream. The model is available at http://pan.baidu.com/s/1eSvo8BS . Hope it helps.
I have a question: I get an error when I try to decompress my ucf101_bninception__rgb_checkpoint.pth.tar.
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
what should I do?
@TiJoy You don't need to uncompress it. pth.tar is just the extension of the model files used by PyTorch.
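For example, such a checkpoint can be read directly with torch.load. A sketch, assuming the usual layout saved by the training script (an 'epoch'/'arch'/'state_dict'/'best_prec1' dict is an assumption here):

```python
import torch

# No need to untar anything: ".pth.tar" is just a file-name convention
# used for checkpoints saved with torch.save().
ckpt = torch.load('ucf101_bninception__rgb_checkpoint.pth.tar', map_location='cpu')
print(ckpt.keys())                       # e.g. epoch, arch, state_dict, best_prec1 (assumed)
state_dict = ckpt['state_dict']          # assuming the nested layout above
# model.load_state_dict(state_dict)      # where `model` is the TSN network built as in test_models.py
```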
Thank you, I see that I just need to delete the .tar from the file name.
@Ivorra Hello, I have tried the models converted from Caffe, but I only got 75% accuracy on UCF101 split 1 with the RGB model alone. Could you share your argument settings with me? And how did you extract frames from the videos? I just used an ffmpeg command.
Thank you!
@liu666666 Hello, I have tried your model but I only get 76.13 for RGB. I think there is something wrong with my code. Could you please share your argument settings?
Hello, I used the Kinetics pretrained model shared above to fine-tune on UCF101 split 1, but the test accuracy is still low: Testing Results: Prec@1 56.732 Prec@5 84.967 Loss 2.49515
while the training accuracy after 80 epochs with a minibatch size of 64 is
Loss 0.2026 (0.3246) Prec@1 93.750 (90.549) Prec@5 98.438 (97.904)
Do you know what has gone wrong here?
@Ivorra Hello, I have tried the models converted from Caffe, but I only got 75% accuracy on UCF101 split 1 with the RGB model alone. Could you share your argument settings with me? And how did you extract frames from the videos? I just used an ffmpeg command.
Thank you!
I can only get 78% RGB accuracy using split 1 on UCF101. How did you solve it? It's very weird.
I have not changed the data sampling in the original code. In the original code the input is shaped as [b*crops, c, h, w]; what do you mean by [b, t, c, h, w]? I only found input_mean = [104, 117, 123]; I could not find input_std. Could you please provide it? Thank you!
@imnotk By the way, the trained model gives 97.8% accuracy on the UCF101 split 1 training set. Does that mean the model and the basic parameters in the code are working? I still have not found the reason why it performs badly on the test set...
@liu666666 Have you extracted the UCF101 RGB frames with 'bash scripts/extract_optical_flow.sh SRC_FOLDER OUT_FOLDER NUM_WORKER'? When I run it I encounter lots of problems. Could you share the extracted UCF101 RGB frames?
@linshuheng6 Have you reproduced the TSN project successfully? Could I get in touch with you via QQ or some other channel?
@nanhui69 I used an ffmpeg command to extract the frames from the videos, not the script provided by the author. I have not reproduced TSN; I just used the source code and ran the test script. You can contact me by sending an email to huins_shu@sjtu.edu.cn.
@imnotk Have you solved this problem? I got the same 78% RGB accuracy using split 1 on UCF101.
@Shumpei-Kikuta I solved this problem by extracting the frames from the videos with OpenCV instead of ffmpeg.
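For completeness, a minimal sketch of extracting frames with OpenCV; the img_00001.jpg naming below is only an assumption about what your data list files expect:

```python
import os
import cv2

def extract_frames(video_path, out_dir):
    """Dump every frame of a video as JPEG, numbered from 1."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        idx += 1
        cv2.imwrite(os.path.join(out_dir, 'img_%05d.jpg' % idx), frame)
    cap.release()
    return idx
```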
@linshuheng6 Thank you for sharing. I got stuck extracting frames the way that repository describes, so I've used this dataset instead: http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data Anyway, thank you!
Would you share the trained models in pytorch?