pondruska / DeepTracking

Source code of DeepTracking research project

Question about the section "Unsupervised Training" #1

Closed 1132520084 closed 8 years ago

1132520084 commented 8 years ago

I have read your paper "Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks", but I don't understand the approach called "unsupervised training". In the section "Unsupervised Training" you say "we propose to train the network not to predict the current state, but a state in the future". Can you tell me the difference between them? And is Equation 7 used in the unsupervised training? Thank you.

pondruska commented 8 years ago

The difference is small but important. The intuition goes as follows: during training we provide the network with sensor input x up to some point t, and then, instead of trying to directly predict the desired output y at time t, we make the network imagine the expected output a few timesteps later, at time t+n (i.e. we make the network imagine the expected future). It turns out that once the network gets good at this "imagination task" it also performs well at tracking objects through occlusions, since both tasks require the same learned "imagination capability".
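To make the setup concrete, constructing one such training sample could be sketched as follows. This is a hypothetical NumPy illustration, not the authors' Torch code; the function name and the `frames`, `t`, `n` arguments are assumed: the network sees the sensor frames up to time t, receives blank input for the next n steps, and must output the frame at t+n.

```python
import numpy as np

def make_unsupervised_sample(frames, t, n):
    """Build one "imagination" training sample (illustrative sketch).

    frames : array of shape (T, H, W), the raw sensor sequence.
    The returned inputs contain frames[0..t-1] followed by n blank
    frames; the target is the frame at t+n that the network must
    imagine from its internal state alone.
    """
    inputs = frames.copy()
    inputs[t:t + n] = 0          # hide the future from the network
    target = frames[t + n]       # what it must imagine
    return inputs[:t + n], target
```

During real training the time index t would be varied across the sequence, so the network learns to imagine the future from any point.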

Equation 7 is used during the unsupervised training as well, but with a slight modification: we penalise the network based on how good it is at imagining the visible part of the future output. The following two pictures should help show the difference:

supervised training

unsupervised training
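For illustration, a visibility-masked cross-entropy of this kind might look as follows. This is a sketch under the assumption that, as in the paper, each sensor frame carries a visibility channel and an occupancy channel; the function name and array layout are hypothetical, not the authors' implementation:

```python
import numpy as np

def masked_cross_entropy(y_pred, x_future, eps=1e-7):
    """Binary cross-entropy penalised only where the future sensor
    frame is actually visible (sketch of the Eq. 7 modification).

    y_pred   : predicted occupancy probabilities at time t+n.
    x_future : future sensor frame, shape (2, ...) with channel 0 the
               visibility mask and channel 1 the observed occupancy.
    """
    visible = x_future[0]            # 1 where the cell was observed
    target = x_future[1]             # observed occupancy
    y_pred = np.clip(y_pred, eps, 1 - eps)
    ce = -(target * np.log(y_pred) + (1 - target) * np.log(1 - y_pred))
    # average the loss over visible cells only; occluded cells give
    # no gradient, so the network is free to imagine them
    return (visible * ce).sum() / max(visible.sum(), 1)
```

The key point is that occluded cells contribute nothing to the loss, so the only way to score well on the visible cells of the future frame is to maintain a belief about the full (hidden) scene.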

Is this clearer?

ypxie commented 7 years ago

Thanks for the explanation, but it seems to me that the network should learn to predict the future input x_{t+n} rather than the binary mask y_{t+n}. Why does it work? There seems to be no such supervision at all.