JindongJiang / SCALOR

Official Release of ICLR 2020 paper "SCALOR: Generative World Models with Scalable Object Representations"
https://sites.google.com/view/scalor/home
MIT License

Parameters for MNIST/dSprites #3

Closed zadaianchuk closed 3 years ago

zadaianchuk commented 3 years ago

Hi, thanks a lot for your code! I'm trying to reproduce the results on dSprites. Could you share, in some form, the parameters used for training on the synthetic datasets? Thanks a lot!

JindongJiang commented 3 years ago

Hi, sure. The details are also in the appendix of the paper, but I will highlight some here. I recommend using a 4 x 4 image encoding map if the number of objects is small; this greatly reduces memory consumption and training time. Similarly, the z^what dimension can be small, e.g. 4 or 8, if your sprites are not too complex. We use a learning rate of 5e-4 and a standard deviation of 0.2 for the dSprites experiments. Also, on the synthetic datasets we constrain z^scale so that it can vary from half to 1.5 times the actual object size. The prior for z^pres in discovery is set to 0.1 at the beginning of training and quickly annealed to 1e-4.
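To collect these values in one place, here is a minimal sketch. The dict keys and the anneal helper are illustrative names, not the exact flags in the SCALOR repo, and a geometric (log-linear) schedule is just one plausible way to implement the "quickly anneal" behavior described above:

```python
# Hyperparameters for dSprites as described above. Key names are
# hypothetical; check the repo's argument parser for the real flags.
dsprites_config = {
    "img_enc_size": (4, 4),     # 4 x 4 image encoding map for few objects
    "z_what_dim": 8,            # 4 or 8 suffices for simple sprites
    "lr": 5e-4,                 # learning rate
    "sigma": 0.2,               # output standard deviation
    "z_scale_min": 0.5,         # z^scale constrained to 0.5x ...
    "z_scale_max": 1.5,         # ... up to 1.5x the actual object size
    "z_pres_prior_start": 0.1,  # discovery z^pres prior at the start
    "z_pres_prior_end": 1e-4,   # annealed down to this value
}

def anneal_z_pres_prior(step, total_steps, start=0.1, end=1e-4):
    """Geometric anneal of the discovery z^pres prior from `start` to
    `end` over `total_steps`; interpolates linearly in log space."""
    t = min(step / total_steps, 1.0)
    return start * (end / start) ** t
```

A geometric schedule drops fast early on, which matches the intent of quickly discouraging spurious discoveries once the scene is populated.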

zadaianchuk commented 3 years ago

Thanks a lot for your answer! I was also interested in the possibility of biasing the architecture towards propagation: e.g., if every frame of the video contains the same number of objects, is it possible to discover only in the first frame and propagate in all the others? I guess the explained_ratio_threshold parameter is responsible for this.

JindongJiang commented 3 years ago

Hi, sure. In that case, you can actually skip the discovery step for all timesteps t > 0 and manually set the propagated z^pres to 1. To skip the discovery, simply set the discovery variables to zero or empty tensors.
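As a sketch of that control flow, assuming hypothetical `discover` and `propagate` callables standing in for SCALOR's discovery and propagation modules (the real modules return tensors, not the plain lists used here):

```python
def rollout(frames, discover, propagate):
    """Propagation-only rollout: run discovery on the first frame only,
    then skip discovery and force the propagated z^pres to 1 for t > 0,
    so every object found at t = 0 is kept alive for the whole video."""
    states = discover(frames[0])          # discover objects at t = 0
    outputs = [states]
    for t in range(1, len(frames)):
        states = propagate(frames[t], states)
        # Manually keep every propagated object present (z^pres = 1);
        # no discovery step is run at this timestep.
        states["z_pres"] = [1.0] * len(states["z_pres"])
        outputs.append(states)
    return outputs
```

In the actual codebase this corresponds to zeroing out (or passing empty tensors for) the discovery outputs at t > 0 rather than restructuring the loop, but the effect is the same.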

zadaianchuk commented 3 years ago

That helped! Thanks again for the support.