Trouble training RIN on CIFAR-10

Hi,

I'm currently trying to train RIN on CIFAR-10 using the code and config provided in the repository.

I made some minor changes to the code to get it to work:

fix some import statements (Diff)
use dummy data for FID, since I currently don't have the cifar10_stats_real.npy file (Diff)
fix a bug in image_diffusion_model.py where Model.sample calls a method that is only defined in ModelT (Diff)
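
For context on the dummy FID data: as far as I understand, the statistics file only feeds the reported metric, so I would not expect it to cause noisy samples (this is my assumption, I have not verified it in the repository). The sketch below shows the standard Fréchet distance between two Gaussians, which is what FID computes on Inception features. It is not the repository's implementation; `frechet_distance`, `stats`, and the random 8-dimensional features are hypothetical stand-ins.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    Hypothetical helper, not taken from the repository.
    """
    diff = mu1 - mu2
    # Tr(sqrt(sigma1 @ sigma2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative for
    # positive semi-definite inputs (up to numerical error).
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)


def stats(features):
    """Mean and covariance of a (num_samples, dim) feature matrix."""
    return features.mean(axis=0), np.cov(features, rowvar=False)


rng = np.random.default_rng(0)
real = rng.normal(size=(512, 8))            # stand-in for real image features
fake = rng.normal(loc=0.5, size=(512, 8))   # stand-in for generated features

mu_r, sig_r = stats(real)
mu_f, sig_f = stats(fake)
fid_same = frechet_distance(mu_r, sig_r, mu_r, sig_r)  # identical statistics
fid_diff = frechet_distance(mu_r, sig_r, mu_f, sig_f)  # shifted mean
```

Since the distance depends entirely on the reference mean and covariance, any FID reported against dummy statistics is not meaningful, so I am ignoring that number for now.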

The entire diff can be seen here: https://github.com/google-research/pix2seq/compare/main..leon-w:b6609
This is the exact code I used for training and evaluation.

Training Setup

I trained the model using this command:

python run.py \
 --config configs/config_diffusion_cifar10.py \
 --mode train \
 --model_dir results/cifar10 \
 --config.train.checkpoint_epochs 5 \
 --config.train.keep_checkpoint_max 2

train_log.txt

Training finished after around 2 hours. These are the training curves logged to TensorBoard:

Eval Setup

I then ran the trained model in evaluation mode to generate a few samples using:

python run.py \
 --config configs/config_diffusion_cifar10.py \
 --mode eval \
 --model_dir results/cifar10 \
 --config.eval.steps 1

eval_log.txt

Unfortunately, the generated samples don't seem to contain anything meaningful and only look like pure noise:

I also tried training the model 10x longer but still got only noise.
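
To put a number on "looks like pure noise", one rough check is the correlation between horizontally adjacent pixels: natural images have strongly correlated neighbors, while i.i.d. noise does not. A minimal sketch, assuming the samples are available as a float array of shape (N, H, W, C); `neighbor_correlation` is a hypothetical helper and the arrays below are synthetic, not my actual samples.

```python
import numpy as np


def neighbor_correlation(images):
    """Pearson correlation between horizontally adjacent pixels.

    `images` is assumed to be a float array of shape (N, H, W, C).
    Hypothetical helper, not part of the repository.
    """
    left = images[:, :, :-1, :].ravel()
    right = images[:, :, 1:, :].ravel()
    return float(np.corrcoef(left, right)[0, 1])


rng = np.random.default_rng(0)

# i.i.d. Gaussian noise with CIFAR-10-sized images
noise = rng.normal(size=(16, 32, 32, 3))

# Smooth stand-in for structured images: a horizontal ramp plus small noise
ramp = np.linspace(0.0, 1.0, 32).reshape(1, 1, 32, 1)
smooth = ramp + 0.01 * rng.normal(size=(16, 32, 32, 3))

noise_corr = neighbor_correlation(noise)    # close to 0 for pure noise
smooth_corr = neighbor_correlation(smooth)  # close to 1 for smooth images
```

A value near zero on the generated samples would suggest the sampler is returning something close to its initial noise, which might help narrow down whether the problem is in training or in sampling.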

Has anyone successfully trained a RIN model with this codebase, and does anyone have an idea how I can get this to work? Any help would be highly appreciated!
