coxlab / prednet

Code and models accompanying "Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning"
https://arxiv.org/abs/1605.08104
MIT License

Prednet changing colors #44

Open david-bernstein opened 5 years ago

david-bernstein commented 5 years ago

I'm using the latest prednet master, Keras 2.0.8, and TF 1.8.0. I'm testing prednet on synthetic videos of moving rectangles. Red rectangles are fine, but other colors get changed. For instance, if the input video contains yellow rectangles the predicted rectangles are green, and if the input video contains white rectangles the predicted rectangles are cyan. See the attached png for an example of the former. The predicted position and shape of the rectangles are fine; only the color is changed. In addition, blue and green rectangles produce no predictions at all (the predicted frames are completely black). Strangely, when I ran it on the KITTI data to test it, the predictions all looked fine, so it is not scrambling the colors there.

[Attached image: plot_35]
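For anyone trying to reproduce this, here is a minimal sketch of how such a test video could be generated with NumPy. The function name `make_rectangle_video` and all of its parameters are hypothetical and are not taken from the original poster's code; frames are assumed to be channels-last in RGB order.

```python
import numpy as np

def make_rectangle_video(n_frames=10, height=64, width=64,
                         rect_size=(12, 16), color=(255, 255, 0),
                         start=(10, 5), velocity=(1, 3)):
    """Return a uint8 array of shape (n_frames, height, width, 3), RGB order.

    Hypothetical helper for illustration only: one solid rectangle moving
    at constant velocity over a black background.
    """
    video = np.zeros((n_frames, height, width, 3), dtype=np.uint8)
    rect_h, rect_w = rect_size
    for t in range(n_frames):
        # Top-left corner moves linearly; clamp so the rectangle stays in frame.
        top = min(max(start[0] + t * velocity[0], 0), height - rect_h)
        left = min(max(start[1] + t * velocity[1], 0), width - rect_w)
        video[t, top:top + rect_h, left:left + rect_w] = color
    return video

# Yellow in RGB is full red plus full green, with no blue.
yellow = make_rectangle_video(color=(255, 255, 0))
```

Swapping `color` for `(0, 0, 255)` or `(255, 255, 255)` gives the blue and white cases discussed in this thread.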

bill-lotter commented 5 years ago

Perhaps it's a normalization issue? Are the input images in the range 0-255 or 0-1?

david-bernstein commented 5 years ago

In the hkl file I create, the images are numpy arrays of type uint8 (0-255). It looks like SequenceGenerator does the conversion to float32 in the range 0-1.0.
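A quick way to sanity-check the value range outside the generator is sketched below. It assumes the conversion amounts to casting to float32 and dividing by 255, which matches the description above but is not copied from the prednet source; `to_unit_range` and `describe_range` are hypothetical helper names.

```python
import numpy as np

def to_unit_range(frames):
    """Cast uint8 frames (0-255) to float32 in [0, 1].

    Assumption: this mirrors what the data generator is described as doing.
    """
    frames = np.asarray(frames)
    if frames.dtype != np.uint8:
        raise TypeError("expected uint8 frames, got %s" % frames.dtype)
    return frames.astype(np.float32) / 255.0

def describe_range(x):
    """Return (dtype, min, max) so the range can be checked at a glance."""
    return x.dtype, float(x.min()), float(x.max())

raw = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = to_unit_range(raw)
scaled_info = describe_range(scaled)
```

If the values reaching the network still span 0-255, or have been divided by 255 twice, `describe_range` makes that visible immediately.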

bill-lotter commented 5 years ago

Hmm, if you train on just yellow rectangles, does it work?

david-bernstein commented 5 years ago

The example above was trained on yellow rectangles only, no other colors.

bill-lotter commented 5 years ago

That's weird. Are you starting from pre-trained weights? Does the training error it's outputting make sense? Perhaps you can also try a lower learning rate.

david-bernstein commented 5 years ago

It's initializing the model the same way as in kitti_train.

I think this would be easy to reproduce. Take the KITTI dataset and, as part of the sequence generator, zero out two of the three channels. With a purely red image you should be fine, but you should see something weird when using only the B or G channel.
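The channel-zeroing step could look roughly like the sketch below. It assumes channels-last arrays in RGB order; `keep_single_channel` is a hypothetical helper, and where to call it inside the sequence generator is left open.

```python
import numpy as np

def keep_single_channel(batch, channel):
    """Zero every color channel except `channel` (0=R, 1=G, 2=B).

    Assumes the last axis is the color axis. Returns a new array and
    leaves the input untouched.
    """
    batch = np.asarray(batch)
    out = np.zeros_like(batch)
    out[..., channel] = batch[..., channel]
    return out

# Toy batch: 2 sequences, 3 frames each, 4x4 pixels, every value set to 0.5.
batch = np.full((2, 3, 4, 4, 3), 0.5, dtype=np.float32)
blue_only = keep_single_channel(batch, channel=2)
```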

bill-lotter commented 5 years ago

Sure, go ahead and try it ;)

david-bernstein commented 5 years ago

will do

bill-lotter commented 5 years ago

thanks!

david-bernstein commented 5 years ago

Well, I only trained for a few epochs, but nothing funny happened. When trained on only the red KITTI channel, prednet gave me red images, and likewise for the blue and green channels. I'll do more digging into why it's different for my videos.

david-bernstein commented 5 years ago

It seems to be a training issue. With more training I can get the predicted color of white squares to be white (instead of cyan). However, it never seems to learn about blue squares, no matter how large and blue they are.

[Attached image: plot_4]

The same experiment with red squares comes out perfectly.
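To put a number on "never learns blue", one option is to measure the prediction error separately for each color channel. The sketch below is only an illustration: `per_channel_mse` is a hypothetical helper, and the arrays are toy data rather than actual prednet output.

```python
import numpy as np

def per_channel_mse(predicted, actual):
    """Mean squared error per color channel for channels-last arrays."""
    predicted = np.asarray(predicted, dtype=np.float64)
    actual = np.asarray(actual, dtype=np.float64)
    if predicted.shape != actual.shape:
        raise ValueError("predicted and actual must have the same shape")
    # Average over every axis except the last (channel) axis.
    axes = tuple(range(predicted.ndim - 1))
    return ((predicted - actual) ** 2).mean(axis=axes)

# Toy case: the target frame is solid blue, the prediction is all black.
actual = np.zeros((4, 4, 3))
actual[..., 2] = 1.0
predicted = np.zeros_like(actual)
errors = per_channel_mse(predicted, actual)  # one value each for R, G, B
```

An error that stays high in one channel while the others drop during training would be consistent with a channel-specific problem.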

bill-lotter commented 5 years ago

Interesting, thanks. My guess is that it's running into saturation/vanishing-gradient issues. My suggestion would be to try a lower learning rate (e.g. decreasing it by a factor of 10). If that doesn't work, perhaps try RMSProp as the optimizer with a lower rho than the default, maybe 0.5.
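To make the role of rho concrete, here is a small NumPy sketch of the standard RMSProp update rule. It is not prednet or Keras code; the defaults shown (learning rate 0.001, rho 0.9, epsilon 1e-7) are the usual Keras values, and the gradient sequence is made up for illustration.

```python
import numpy as np

def rmsprop_step(param, grad, mean_sq, lr=0.001, rho=0.9, eps=1e-7):
    """One RMSProp update; returns (new_param, new_mean_sq).

    mean_sq is the running average of squared gradients. A lower rho makes
    this average forget old gradients faster.
    """
    mean_sq = rho * mean_sq + (1.0 - rho) * grad ** 2
    param = param - lr * grad / (np.sqrt(mean_sq) + eps)
    return param, mean_sq

def run(grads, lr, rho):
    """Apply rmsprop_step over a gradient sequence, starting from zero."""
    param, mean_sq = 0.0, 0.0
    for g in grads:
        param, mean_sq = rmsprop_step(param, g, mean_sq, lr=lr, rho=rho)
    return param

# Illustrative gradients: large at first, then suddenly small.
grads = [1.0] * 5 + [0.01] * 5
default_result = run(grads, lr=0.001, rho=0.9)
suggested_result = run(grads, lr=0.0001, rho=0.5)
```

With rho = 0.5, the running average adapts to the small gradients within a few steps, so the effective step size recovers sooner than with rho = 0.9.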

david-bernstein commented 5 years ago

Ok, I tried those and some other optimizers as well, and also varied the learning rate, but no luck. I think you're right about the vanishing gradients, but it's weird that it's color-specific. I'm working on a project involving video-game-like objects, which is why I'm experimenting with videos like this. If I find a solution I'll post it here; otherwise you can close the issue. Just for the record, it works great for all sorts of videos involving red rectangles!

bill-lotter commented 5 years ago

Thanks, if only we could get it to work for blue rectangles too! If you're not sick of it yet, you could also try using a sigmoid as the activation for the pixel prediction layer (Ahat_0) instead of relu - that could potentially help with the vanishing gradients.
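The intuition behind that suggestion can be shown with a few lines of NumPy. This is a general illustration of how a ReLU output can stop learning, not a confirmed diagnosis of this issue, and it is not prednet code: when a unit's pre-activation is negative, the ReLU gradient is exactly zero, while the sigmoid gradient stays positive.

```python
import numpy as np

def relu_grad(z):
    """Derivative of ReLU with respect to its input: 1 where z > 0, else 0."""
    return (z > 0).astype(z.dtype)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid: s * (1 - s), which is never exactly zero
    for moderate z."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Hypothetical pre-activations for one output pixel in one color channel.
z = np.array([-2.0, -0.5, 0.5, 2.0])
relu_gradients = relu_grad(z)        # zero for the negative inputs
sigmoid_gradients = sigmoid_grad(z)  # positive everywhere
```

If a channel's output pre-activations happen to be negative everywhere, a ReLU passes no gradient back through them, whereas a sigmoid still lets that channel recover.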

kegbo commented 5 years ago

I am using prednet for cloud prediction with grayscale images, but the brighter regions of the images are enhanced more than usual. These are the actual images: cloud_201902251515_v1-9, cloud_201902251545_v1-9, cloud_201902251645_v1-9, cloud_201902251615_v1-9

These are the predicted images: img5, img6, img7