yjxiong / tsn-pytorch

Temporal Segment Networks (TSN) in PyTorch

trained models in pytorch #6

Open nationalflag opened 7 years ago

nationalflag commented 7 years ago

Would you share the trained models in pytorch?

yjxiong commented 7 years ago

Good point. I will provide the models on UCF101 split 1 for your reference.

nationalflag commented 7 years ago

Thanks! I'm really looking forward to it!

shuangshuangguo commented 6 years ago

@yjxiong Hi, thank you for your wonderful work! Could you tell me where to download the pretrained PyTorch models?

ntuyt commented 6 years ago

@yjxiong Hi, Xiong. I also need the pretrained PyTorch models. Thanks so much!

yjxiong commented 6 years ago

We don't have all pretrained UCF101/HMDB51 models in the PyTorch format for download. I can convert the pretrained Caffe models in the original TSN codebase to PyTorch format in the next few weeks.

shuangshuangguo commented 6 years ago

I have finished a script for converting Caffe models to PyTorch. When I use the converted PyTorch model with the tsn-pytorch code, I get the same results as the paper. If you need a PyTorch model right away, please see https://github.com/gss-ucas/caffe2pytorch-tsn

ntuyt commented 6 years ago

@gss-ucas Thanks

scenarios commented 6 years ago

@yjxiong Thanks for your efforts. I recently parsed your pretrained caffemodel (UCF101 split 1) into TensorFlow with Google protobuf, and I constructed the RGB stream in TensorFlow with every layer identical to the one in your Caffe train-val protocol. However, the accuracy is still 10% lower than the Caffe version. (The padding strategy differs between Caffe and TF, but that is properly solved by manual padding in TF.) I'm wondering whether there are any other details I should take care of. Thanks!

BTW, @gss-ucas it seems that max pooling with floor mode is used in your implementation, which is not consistent with the Caffe version. Strange that you can still reproduce the results, though.
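
A minimal PyTorch sketch of the floor- vs ceil-mode pooling difference mentioned above (the input shape is illustrative, not taken from the TSN code): Caffe computes pooled output sizes with ceiling division, while PyTorch defaults to floor, so layers can disagree by one row/column on some input sizes.

```python
# Floor vs ceil pooling output sizes: Caffe rounds up, PyTorch rounds down
# by default, so the two can produce different feature-map sizes.
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)  # illustrative feature map

pool_floor = nn.MaxPool2d(kernel_size=3, stride=2)                 # PyTorch default
pool_ceil = nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)  # matches Caffe

print(pool_floor(x).shape)  # torch.Size([1, 64, 27, 27])
print(pool_ceil(x).shape)   # torch.Size([1, 64, 28, 28])
```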

victorhcm commented 6 years ago

@scenarios what data augmentations are you using? Is your model BN-Inception (+TSN)? If so, are you enabling partial BN when finetuning?
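For reference, "partial BN" in the TSN paper means freezing the statistics and affine parameters of every BatchNorm layer except the first when fine-tuning. A standalone sketch of the idea, not the repo's exact code:

```python
# Freeze all BatchNorm layers except the first, as in TSN's partial BN.
# Note: m.eval() must be re-applied after every call to model.train().
import torch.nn as nn

def freeze_partial_bn(model: nn.Module) -> None:
    bn_seen = 0
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            bn_seen += 1
            if bn_seen > 1:  # keep only the first BN layer trainable
                m.eval()     # stop updating running mean/variance
                m.weight.requires_grad = False
                m.bias.requires_grad = False
```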


scenarios commented 6 years ago

@victorhcm Actually I'm not fine-tuning. I simply initialize my BN-Inception model in TensorFlow with the parameters released by TSN (it is a caffemodel, which I parsed using protobuf) and run the test directly without any training. At the very beginning, the accuracy was only around 0.6 with 3 segments. Then I realized there is a slight difference between Caffe and TensorFlow in padding (both for convolution and pooling). After modifying the padding, the accuracy increased to 71%, which is still 10% lower than the Caffe results. (I tested the same model with TSN's home-brewed Caffe and got 0.81 accuracy with 3 segments.) I have double-checked every layer, and they are for sure identical to the model definition in the train_val protocol in TSN (or there must be errors when loading the parameters). I'm still confused about why..

scenarios commented 6 years ago

@victorhcm And for testing, I resize each frame to 256x340 and center-crop to 224x224. The mean is subtracted, and I also convert the RGB images to BGR format for consistency.
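
In code, the test-time pipeline described above looks roughly like this (a sketch assuming the BN-Inception Caffe channel mean [104, 117, 123] in BGR order and no std scaling):

```python
# Test-time preprocessing: resize to 256x340, center-crop 224x224,
# convert RGB to BGR, subtract the Caffe channel mean (no std scaling).
import cv2
import numpy as np

MEAN_BGR = np.array([104.0, 117.0, 123.0])

def preprocess(frame_rgb):
    img = cv2.resize(frame_rgb, (340, 256))       # cv2 takes (width, height)
    top, left = (256 - 224) // 2, (340 - 224) // 2
    img = img[top:top + 224, left:left + 224]
    img = img[:, :, ::-1].astype(np.float32)      # RGB -> BGR
    img -= MEAN_BGR
    return img.transpose(2, 0, 1)                 # HWC -> CHW for the network
```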

SmartPorridge commented 6 years ago

@yjxiong I used the default commands you provided to train my RGBDiff, RGB, and Flow models with tsn-pytorch. Could you please tell me whether they were initialized on ImageNet?

yjxiong commented 6 years ago

@JiqiangZhou Yes.

SmartPorridge commented 6 years ago

@yjxiong thank you! I will read the code carefully.

3DMM-ICME2023 commented 6 years ago

I ran the default command and obtained lower performance than the paper (85.12 vs 85.7) on the RGB stream. The model is available at http://pan.baidu.com/s/1eSvo8BS. Hope it helps.

utsavgarg commented 6 years ago

@yjxiong I also trained the models using the default settings for split 1 of the UCF-101 dataset and am getting lower performance than reported. Below are the numbers I got:

RGB - 85.57%
Flow - 84.26%
RGB + Flow - 90.44%

The main difference seems to come from the Flow stream. What could be the reason for this?

yjxiong commented 6 years ago

@utsavgarg Please follow the instructions in Caffe TSN to extract optical flow.

https://github.com/yjxiong/tsn-pytorch/issues/30

utsavgarg commented 6 years ago

@yjxiong I had done that. I extracted optical flow using the extract_optical_flow.sh script included with Caffe TSN, but I used extract_cpu instead of extract_gpu in build_of.py. Could that cause this difference in performance?

yjxiong commented 6 years ago

Yes, that's the problem. Always use extract_gpu. The optical flow algorithm you need is not available in extract_cpu.

SmartPorridge commented 6 years ago

@utsavgarg extract_gpu uses TV-L1 optical flow, which leads to better accuracy. I ran an experiment to confirm that.
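
For anyone who cannot build the GPU extractor, here is a rough CPU sketch of TV-L1 flow with OpenCV's contrib module. It is far slower than extract_gpu and is not the repo's tool; the ±20 clipping bound follows the dense_flow default, the frame paths are placeholders, and the function name may differ across OpenCV versions.

```python
# TV-L1 optical flow between two consecutive grayscale frames, clipped to
# [-20, 20] and rescaled to [0, 255] the way TSN's flow JPEGs are stored.
# Requires opencv-contrib-python.
import cv2
import numpy as np

prev = cv2.imread('img_00001.jpg', cv2.IMREAD_GRAYSCALE)
curr = cv2.imread('img_00002.jpg', cv2.IMREAD_GRAYSCALE)

tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()
flow = tvl1.calc(prev, curr, None)  # H x W x 2 float32 (dx, dy)

flow = np.clip(flow, -20, 20)
flow = ((flow + 20) / 40.0 * 255).astype(np.uint8)
cv2.imwrite('flow_x_00001.jpg', flow[:, :, 0])
cv2.imwrite('flow_y_00001.jpg', flow[:, :, 1])
```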

utsavgarg commented 6 years ago

Okay thanks.

Tord-Zhang commented 6 years ago

Did anyone get the pytorch pretrained models on ActivityNet or Kinetics? @yjxiong @nationalflag @gss-ucas @ntuyt @scenarios

Fairasname commented 6 years ago

Hello, are there any available models trained on UCF101?

Thanks!

shuangshuangguo commented 6 years ago

@Ivorra please see https://github.com/gss-ucas/caffe2pytorch-tsn

Fairasname commented 6 years ago

@gss-ucas Thanks for kindly answering. I am having trouble with test_models.py: after loading your model ucf101_rgb.pth, it seems that some keys do not exist. With the example command:

python test_models.py ucf101 RGB <ucf101_rgb_val_list> ucf101_rgb.pth --arch BNInception --save_scores <score_file_name>

I get the error:

```
/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py:514: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
  own_state[name].copy_(param)
Traceback (most recent call last):
  File "test_models.py", line 54, in <module>
    print("model epoch {} best prec@1: {}".format(checkpoint['epoch'], checkpoint['best_prec1']))
KeyError: 'epoch'
```

And if I comment out that line, another key error appears:

```
/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py:514: UserWarning: src is not broadcastable to dst, but they have the same number of elements. Falling back to deprecated pointwise behavior.
  own_state[name].copy_(param)
Traceback (most recent call last):
  File "test_models.py", line 56, in <module>
    base_dict = {'.'.join(k.split('.')[1:]): v for k, v in list(checkpoint['state_dict'].items())}
KeyError: 'state_dict'
```

Thanks!
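
A note on this error: the converted ucf101_rgb.pth appears to be a bare state dict, while test_models.py expects a full training checkpoint containing 'epoch', 'best_prec1', and 'state_dict' keys. A minimal sketch of a loader that accepts both layouts, assuming that structure:

```python
# Load either a full training checkpoint or a bare converted state dict.
import torch

checkpoint = torch.load('ucf101_rgb.pth')  # path is a placeholder
if 'state_dict' in checkpoint:
    # Training checkpoint: unwrap and strip the leading wrapper prefix.
    state_dict = {'.'.join(k.split('.')[1:]): v
                  for k, v in checkpoint['state_dict'].items()}
else:
    state_dict = checkpoint  # already a bare state dict

# net.load_state_dict(state_dict)  # net being the constructed TSN model
```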

poppingcode commented 6 years ago

I wonder how to fuse the RGB and Flow scores. Can anyone help me? Thanks.

Fairasname commented 6 years ago

@gss-ucas Alright, I had not noticed the test_models.py file in your repository. I just substituted it with the one from this repository. Moreover, the accuracy for UCF101 split 1 now works as it should, given your converted models and the reference numbers at the official project page.

Thanks!

Fairasname commented 6 years ago

@poppingcode I followed the instructions in the original TSN repository. Just copy the eval_scores.py file provided there, together with the pyActionRecog folder, which is needed as a dependency.

Hope it works fine for you!
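
For reference, the fusion performed by eval_scores.py amounts to a weighted average of the per-class score matrices saved with --save_scores. A sketch of the idea: the 1:1.5 RGB:Flow weighting follows the TSN paper, and the file names and .npz layout here are assumptions to adapt to your own score files.

```python
# Late fusion of RGB and Flow scores: weighted sum, then argmax.
import numpy as np

# Assumed layout: each .npz holds an (N, num_classes) 'scores' array
# and an (N,) 'labels' array.
rgb = np.load('rgb_scores.npz')
flow = np.load('flow_scores.npz')

fused = 1.0 * rgb['scores'] + 1.5 * flow['scores']
accuracy = (fused.argmax(axis=1) == rgb['labels']).mean()
print('fused accuracy: {:.2%}'.format(accuracy))
```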

SinghGauravKumar commented 6 years ago

@gss-ucas Hi, did you manage to convert the Kinetics pretrained models (as shown at http://yjxiong.me/others/kinetics_action/) to PyTorch too?

Fairasname commented 6 years ago

Maybe this Caffe --> PyTorch model converter is worth looking at:

https://github.com/marvis/pytorch-caffe

SinghGauravKumar commented 6 years ago

@Ivorra Installing the right version of Caffe has been a pain. I'm wondering if @yjxiong or @gss-ucas could provide the Kinetics pretrained models for PyTorch?

shuangshuangguo commented 6 years ago

@SinghGauravKumar You can follow the conversion instructions at https://github.com/gss-ucas/caffe2pytorch-tsn

Modifying some of the code should work.

sipan17 commented 6 years ago

@yjxiong is there any pre-trained model on the HMDB51 dataset?

shuangshuangguo commented 6 years ago

@sipan17 Please see https://github.com/shuangshuangguo/caffe2pytorch-tsn

sipan17 commented 6 years ago

@shuangshuangguo thank you, will try that.

TiJoy commented 5 years ago

I ran the default command and obtained lower performance than the paper (85.12 vs 85.7) on the RGB stream. The model is available at http://pan.baidu.com/s/1eSvo8BS. Hope it helps.

I have a question. I get an error when I try to decompress my ucf101_bninception__rgb_checkpoint.pth.tar:

```
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
```

What should I do?

yjxiong commented 5 years ago

@TiJoy

You don't need to uncompress it. pth.tar is just the extension of model files used by PyTorch.

TiJoy commented 5 years ago

@TiJoy

You don't need to uncompress it. pth.tar is just the extension of model files used by PyTorch.

Thank you, I see. I just need to remove the .tar from the file name.
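
To be clear, even the renaming is optional; torch.load reads the file directly regardless of the .tar suffix. A minimal check (the listed keys are an example, not guaranteed):

```python
# .pth.tar is only a naming convention; no untarring or renaming needed.
import torch

ckpt = torch.load('ucf101_bninception__rgb_checkpoint.pth.tar',
                  map_location='cpu')
print(list(ckpt.keys()))  # e.g. ['epoch', 'arch', 'state_dict', ...]
```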

linshuheng6 commented 5 years ago

@Ivorra Hello, I have tried the converted models from Caffe, but I only got 75% accuracy on UCF101 split 1 with the RGB model alone. Could you share your args with me? And how did you extract frames from the videos? I just used an ffmpeg command.

Thank you!

linshuheng6 commented 5 years ago

@liu666666 Hello, I have tried your model, but I only get a result of 76.13 for RGB. I think there is something wrong with my code. Could you please give me your args?

cbasemaster commented 5 years ago

Hello, I used the Kinetics pretrained model shared above and fine-tuned it on UCF101 split 1, but the test accuracy is still low:

Testing Results: Prec@1 56.732 Prec@5 84.967 Loss 2.49515

while the training accuracy is

Loss 0.2026 (0.3246) Prec@1 93.750 (90.549) Prec@5 98.438 (97.904)

after 80 epochs with a minibatch size of 64. Do you know what has gone wrong here?

[screenshot of the training log]

imnotk commented 5 years ago

@Ivorra Hello, I have tried the converted models from Caffe, but I only got 75% accuracy on UCF101 split 1 with the RGB model alone. Could you share your args with me? And how did you extract frames from the videos? I just used an ffmpeg command.

Thank you!

I can only get 78% RGB accuracy using split 1 on UCF101. How did you solve it? It's very weird.

linshuheng6 commented 5 years ago

@Ivorra Hello, I have tried the converted models from Caffe, but I only got 75% accuracy on UCF101 split 1 with the RGB model alone. Could you share your args with me? And how did you extract frames from the videos? I just used an ffmpeg command. Thank you!

I can only get 78% RGB accuracy using split 1 on UCF101. How did you solve it? It's very weird.

I have not changed the data sampling in the original code. In the original code, the input is shaped [b*crops, c, h, w]; what do you mean by [b, t, c, h, w]? I only found input_mean = [104, 117, 123]; I could not find input_std. Could you please provide it to me? Thank you!

linshuheng6 commented 5 years ago

@imnotk By the way, the trained model gives 97.8% accuracy on the training set of ucf101-split1. Does that mean the model and the basic parameters in the code are correct? I still have not found the reason why it works badly on the test set...

nanhui69 commented 5 years ago

@liu666666 Have you extracted the UCF101 RGB frames with 'bash scripts/extract_optical_flow.sh SRC_FOLDER OUT_FOLDER NUM_WORKER'? When I do it, I encounter lots of problems. Could you share the RGB-frame dataset of UCF101?

nanhui69 commented 5 years ago

@linshuheng6 Did you reproduce the TSN project successfully? Could I get in touch with you via QQ or another channel?

linshuheng6 commented 4 years ago

@nanhui69 I used an ffmpeg command to extract the frames from the videos, not the script provided by the author. I have not reproduced TSN; I just used the source code and ran the test script. You can contact me by sending an email to huins_shu@sjtu.edu.cn.

Shumpei-Kikuta commented 4 years ago

@Ivorra Hello, I have tried the converted models from Caffe, but I only got 75% accuracy on UCF101 split 1 with the RGB model alone. Could you share your args with me? And how did you extract frames from the videos? I just used an ffmpeg command. Thank you!

I can only get 78% RGB accuracy using split 1 on UCF101. How did you solve it? It's very weird.

@imnotk Have you solved this problem? I got the same accuracy, 78% for RGB, using split 1 on UCF101.

linshuheng6 commented 4 years ago

@Shumpei-Kikuta I solved this problem by extracting the frames from the videos with OpenCV instead of ffmpeg.
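
A rough sketch of that OpenCV extraction (the img_{:05d}.jpg naming scheme is an assumption; match it to whatever your file lists expect):

```python
# Dump every frame of a video to numbered JPEGs with OpenCV.
import os
import cv2

def extract_frames(video_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        count += 1
        cv2.imwrite(os.path.join(out_dir, 'img_{:05d}.jpg'.format(count)), frame)
    cap.release()
    return count
```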

Shumpei-Kikuta commented 4 years ago

@linshuheng6 Thank you for sharing. I got stuck extracting frames the way that repository shows, so I used this one instead: http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data Anyway, thank you!