bryanyzhu / two-stream-pytorch

PyTorch implementation of two-stream networks for video action recognition
MIT License
568 stars, 150 forks

Flow_model_accuarcy #12

Closed 3DMM-ICME2023 closed 6 years ago

3DMM-ICME2023 commented 6 years ago

Hi @bryanyzhu, thanks for sharing this! Wang et al. [1] describe a method called cross-modality pre-training, which may improve the flow model's performance.

[1]. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

bryanyzhu commented 6 years ago

@liu666666 Hi, thanks for your suggestion. Yes, I already use the cross-modality pre-training technique. Right now, my resnet101 and resnet152 flow models work well. It is just the vgg16 model that doesn't reach the same performance, so the problem is probably something specific to the VGG network.
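
For readers unfamiliar with the technique, here is a minimal sketch of cross-modality pre-training as described in [1]: the first convolution's RGB weights are averaged over the three input channels, and that average is replicated across the flow input channels. It is written with NumPy rather than taken from this repository; the function name `rgb_to_flow_conv1` and the choice of 20 flow channels (10 stacked frames with x and y components) are assumptions made for illustration.

```python
import numpy as np

def rgb_to_flow_conv1(rgb_weight, flow_channels):
    """Adapt first-layer RGB conv weights to a stacked-flow input.

    rgb_weight: array of shape (out_channels, 3, kH, kW).
    Returns an array of shape (out_channels, flow_channels, kH, kW).
    """
    # Average over the three RGB input channels, keeping the axis.
    mean_weight = rgb_weight.mean(axis=1, keepdims=True)
    # Replicate the averaged kernel once per flow input channel.
    return np.repeat(mean_weight, flow_channels, axis=1)

# Hypothetical example: a 7x7 first layer with 64 filters.
rgb_w = np.random.rand(64, 3, 7, 7).astype(np.float32)
flow_w = rgb_to_flow_conv1(rgb_w, flow_channels=20)
```

The remaining layers keep their RGB-pretrained weights unchanged; only the first layer needs this treatment because its input channel count differs.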

zed0630 commented 6 years ago

@bryanyzhu Hi, have you taken the direction of the optical flow into consideration? When flipping an optical flow image, one should also invert the pixel values in the corresponding channel (x for a horizontal flip, y for a vertical flip). Otherwise the encoded direction stays the same, which contradicts the real direction of motion after the flip.

bryanyzhu commented 6 years ago

Hi @zed0630, thanks for pointing out the mistake. I should invert the values as well when flipping. I will update the code base here. Thanks.

swathikirans commented 6 years ago

@zed0630 Hi, I didn't get your point about inverting the pixel values in the x/y channels during flipping. Can you elaborate on this?

bryanyzhu commented 6 years ago

@swathikirans Because when you flip your images, e.g., with a horizontal flip, the flow in the x direction also reverses: motion that pointed right now points left.

You can refer to this code snippet, I think it will help.
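
As a rough illustration of the point (not the linked snippet), here is a minimal NumPy sketch of a horizontal flip for one flow frame. It assumes the flow is stored as two 8-bit grayscale images, one per direction, so that reversing a displacement corresponds to `255 - value`; the function name `hflip_flow` and this encoding are assumptions, and for signed floating-point flow one would negate the x channel instead.

```python
import numpy as np

def hflip_flow(flow_x, flow_y):
    """Horizontally flip one optical-flow frame stored as two uint8 images.

    Assumes displacements are quantised to [0, 255], so reversing a
    direction corresponds to 255 - value (an assumed encoding).
    """
    # Mirror the columns, then invert the values: x motion changes direction.
    flipped_x = 255 - flow_x[:, ::-1]
    # Mirror the columns only: vertical motion is unaffected by a horizontal flip.
    flipped_y = flow_y[:, ::-1]
    return flipped_x, flipped_y

# Tiny 1x3 example with a strong x displacement at the left edge.
fx = np.array([[200, 128, 128]], dtype=np.uint8)
fy = np.array([[128, 128, 128]], dtype=np.uint8)
out_x, out_y = hflip_flow(fx, fy)
```

In the example, the value 200 moves from the left edge to the right edge and becomes 55, so the encoded motion now points the opposite way, while the y image is only mirrored.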

swathikirans commented 6 years ago

@bryanyzhu Thank you!