google-deepmind / kinetics-i3d

Convolutional neural network model for video classification trained on the Kinetics dataset.
Apache License 2.0
1.75k stars, 462 forks

Struggling to learn using Opt. Flow #122

Open mazatov opened 2 years ago

mazatov commented 2 years ago

I'm training this model on my own dataset. I trained it successfully on the RGB stream of the data. Now I'm trying to do the same on the optical flow stream. However, the model is not learning anything at all.

My optical flow output is scaled to [-1, 1] and is of float32 type. I'm computing it with method=cv2.cuda.FarnebackOpticalFlow_create( numLevels = 10, pyrScale = 0.5, winSize = 1, numIters = 20 ) for speed. Visualizing the result, I can see that it picks up coherent motion. However, the training makes no progress whatsoever. Given that I'm starting from the pre-trained weights, my thought is that the input may not be what the model expects, so it can't learn anything because it starts off on "the wrong foot". Could I be missing a preprocessing step or something similar?

Or is the model just very sensitive to the flow method, so that it needs TVL1 specifically?

leenas233 commented 2 years ago

Hello, I am a beginner. How do I obtain the optical flow data? Is there a website where the dataset can be downloaded, or should I process the raw videos to compute the optical flow myself? Thanks a lot!

joaoluiscarreira commented 2 years ago

Hey Mike,

It may be worth checking whether the input range is the same as the one the model was pretrained on. It may also help to experiment with the optimization parameters; perhaps the learning rate is too high? I assume you mean that the training loss is not going down ("the model is not learning absolutely anything").
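A minimal sketch of such a range check, under stated assumptions. As far as I recall, the repository README describes the pretraining flow as TV-L1 displacements truncated to [-20, 20] and then rescaled to [-1, 1]; please verify that against the README before relying on it. The helper names preprocess_flow and summarize are hypothetical, and the flow here is synthetic random data rather than real Farneback output. The comparison shows that a fixed-bound rescale and a per-clip min-max rescale both land in [-1, 1] yet can have very different statistics.

```python
import numpy as np

# Assumed truncation bound (in pixels); taken from memory of the repository
# README, so verify it before relying on it.
FLOW_BOUND = 20.0


def preprocess_flow(raw_flow, bound=FLOW_BOUND):
    """Clip raw flow displacements to [-bound, bound], then rescale to [-1, 1]."""
    flow = np.asarray(raw_flow, dtype=np.float32)
    return np.clip(flow, -bound, bound) / bound


def summarize(flow):
    """Return basic statistics for comparing input distributions."""
    flow = np.asarray(flow)
    return {
        "min": float(flow.min()),
        "max": float(flow.max()),
        "mean": float(flow.mean()),
        "std": float(flow.std()),
    }


# Synthetic stand-in for raw flow: (frames, height, width, 2) displacements.
rng = np.random.default_rng(0)
raw = rng.normal(0.0, 5.0, size=(8, 224, 224, 2)).astype(np.float32)
raw[0, 0, 0, 0] = 100.0  # a single outlier, as real flow often contains

# Fixed-bound clip and rescale (the assumed pretraining convention).
scaled = preprocess_flow(raw)

# Per-clip min-max rescale: also in [-1, 1], but the outlier compresses
# almost every other value into a narrow band.
minmax = 2.0 * (raw - raw.min()) / (raw.max() - raw.min()) - 1.0

print("fixed bound:", summarize(scaled))
print("min-max:    ", summarize(minmax))
```

If the statistics of your own flow tensors differ a lot from what the fixed-bound version produces, the pretrained first layers are probably seeing inputs unlike those from pretraining, even though both are nominally "in [-1, 1]".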

Best,

Joao

On Wed, Mar 16, 2022 at 4:53 PM Mike Azatov wrote:

I'm training this model on my own dataset. I trained it successfully on the RGB stream of the data. Now I'm trying to do the same on the optical flow stream. However, the model is not learning anything at all.

My optical flow output is scaled to [-1, 1] and is of float32 type. I'm computing it with method=cv2.cuda.FarnebackOpticalFlow_create( numLevels = 10, pyrScale = 0.5, winSize = 1, numIters = 20 ) for speed. Visualizing the result, I can see that it picks up coherent motion. However, the training makes no progress whatsoever. Given that I'm starting from the pre-trained weights, my thought is that the input may not be what the model expects, so it can't learn anything because it starts off on "the wrong foot". Could I be missing a preprocessing step or something similar?
