didriknielsen / survae_flows

Code for paper "SurVAE Flows: Surjections to Bridge the Gap between VAEs and Flows"
MIT License
283 stars 34 forks

Loss suddenly increases... #20

Closed. zhangwenwen closed this issue 2 years ago.

zhangwenwen commented 2 years ago

SurVAE is great work. However, when I try to run the training script at experiments/image/train.py with the default parameters, the output looks like the following:

Files already downloaded and verified
Storing logs in: /tmp/pycharm_project_69/experiments/image/log/cifar10_8bit/pool_flow/expdecay/2022-06-08_11-00-06
Storing checkpoints in: /tmp/pycharm_project_69/experiments/image/log/cifar10_8bit/pool_flow/expdecay/2022-06-08_11-00-06/check
Training. Epoch: 1/10, Datapoint: 64/50000, Bits/dim: 9.041
Training. Epoch: 1/10, Datapoint: 128/50000, Bits/dim: 8.964
Training. Epoch: 1/10, Datapoint: 192/50000, Bits/dim: 8.857
Training. Epoch: 1/10, Datapoint: 256/50000, Bits/dim: 8.664
Training. Epoch: 1/10, Datapoint: 320/50000, Bits/dim: 1161562190115273424502784.000
Training. Epoch: 1/10, Datapoint: 384/50000, Bits/dim: 2037414059903604804812800.000
Training. Epoch: 1/10, Datapoint: 448/50000, Bits/dim: 2836264394128260119658496.000
Training. Epoch: 1/10, Datapoint: 512/50000, Bits/dim: 3274391252358852983652352.000
Training. Epoch: 1/10, Datapoint: 576/50000, Bits/dim: 3776218183503106344484864.000
Training. Epoch: 1/10, Datapoint: 640/50000, Bits/dim: 5504718090370542527840256.000
Training. Epoch: 1/10, Datapoint: 704/50000, Bits/dim: 6959116845316599288168448.000
Training. Epoch: 1/10, Datapoint: 768/50000, Bits/dim: 6643827287906464986824704.000
Training. Epoch: 1/10, Datapoint: 832/50000, Bits/dim: 6198825958360009429483520.000
Training. Epoch: 1/10, Datapoint: 896/50000, Bits/dim: 5875308456980169657155584.000
Training. Epoch: 1/10, Datapoint: 960/50000, Bits/dim: 10000572146077269571403776.000
Training. Epoch: 1/10, Datapoint: 1024/50000, Bits/dim: 9550764206660226216099840.000
Training. Epoch: 1/10, Datapoint: 1088/50000, Bits/dim: 9223264531163152969105408.000
Training. Epoch: 1/10, Datapoint: 1152/50000, Bits/dim: 8798175358573306835894272.000
Training. Epoch: 1/10, Datapoint: 1216/50000, Bits/dim: 8433389929875283667582976.000
Training. Epoch: 1/10, Datapoint: 1280/50000, Bits/dim: 8062881988078312555020288.000
Training. Epoch: 1/10, Datapoint: 1344/50000, Bits/dim: 7984914609852287987744768.000
Training. Epoch: 1/10, Datapoint: 1408/50000, Bits/dim: 8106902120649219779854336.000
Training. Epoch: 1/10, Datapoint: 1472/50000, Bits/dim: 8311160927079566321647616.000
....

Is that normal? I look forward to your reply. Thanks!

zhangwenwen commented 2 years ago

A learning-rate warmup is needed!
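For anyone hitting the same blow-up: below is a minimal sketch of a linear learning-rate warmup in PyTorch. The names here (`warmup_steps`, the stand-in model and loss) are illustrative assumptions, not the repo's actual API; check the argparse options of experiments/image/train.py for how warmup is configured in this codebase.

```python
# Minimal sketch: linear LR warmup with torch.optim.lr_scheduler.LambdaLR.
# The LR ramps from ~0 to its base value over the first `warmup_steps`
# optimizer steps, then stays at the base value.
import torch

model = torch.nn.Linear(10, 10)           # stand-in for the flow model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

warmup_steps = 5000                        # hypothetical value
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
)

for step in range(10):                     # training loop (abbreviated)
    optimizer.zero_grad()
    loss = model(torch.randn(8, 10)).pow(2).mean()  # dummy loss
    loss.backward()
    optimizer.step()
    scheduler.step()                       # advance the warmup schedule
```

The intuition: early in training the flow's log-likelihood surface is poorly conditioned, so full-size Adam steps taken before the optimizer's moment estimates have settled can send the loss to the astronomical values shown in the log above. Ramping the learning rate up from near zero avoids those destabilizing first steps.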