Trouble Getting Module To Work

Hello,

I am trying to use this Python module to either denoise or recreate input data. As a basic sanity check, I wrote the following code to train the model to denoise a few images from the CIFAR10 dataset, but I am not getting the results I expected.

import torch
import torchvision
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import TensorDataset, DataLoader
from denoising_diffusion_pytorch import Unet, GaussianDiffusion

# location of the CIFAR10 dataset
data = iter(torchvision.datasets.CIFAR10("/local/data/williamjudge/images/",))

# Setting the device to use
device=torch.device('cuda:0')

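# Just collecting 2 images from the dataset, normalizing each to the range 0-1
d = []
for i in range(2):
    images, labels = next(data)
    images = np.asarray(images, dtype=np.float32)
    # PIL images are (H, W, C); move channels first to get (C, H, W)
    # (np.swapaxes(images, 0, -1) would also swap height and width)
    images = np.transpose(images, (2, 0, 1))
    images = images - images.min()
    images = images / images.max()
    d.append(images)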

# Creating a model
model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
).to(device)

# Using GaussianDiffusion
diffusion = GaussianDiffusion(
    model,
    image_size = 32,
    timesteps = 20000,   # number of diffusion (noising) steps, not training iterations
    loss_type = 'l1',    # L1 or L2
    objective = 'pred_v',
).to(device)

# Placing the data within a torch.Tensor
training_images = np.asarray(d)
training_images = torch.Tensor(training_images).to(device)

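loss = diffusion(training_images)  # already on `device`, so no extra .cuda() is needed
loss.backward()                    # computes gradients only; no optimizer step is taken here
# after a lot of training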

# Confusion
sampled_images = diffusion.sample(batch_size = 4)
sampled_images.shape # (4, 3, 32, 32), since image_size = 32

# Viewing Data
plt.imshow(training_images[0][0].cpu())
plt.show()

plt.imshow(sampled_images[0][0].cpu())
plt.show()
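
For comparison, here is a minimal sketch of what "a lot of training" followed by sampling could look like. It assumes only what the code above already relies on: calling the diffusion object on a batch returns a scalar loss, and `.sample(batch_size=...)` returns generated images. The helper name `train_then_sample` and its defaults (`steps`, `lr`, `num_samples`) are hypothetical and not part of the library; the library's README also describes a `Trainer` class that may be the more convenient route.

```python
import torch


def train_then_sample(diffusion, images, steps=1000, lr=1e-4, num_samples=4):
    """Hypothetical helper: run `steps` optimizer updates, then sample."""
    optimizer = torch.optim.Adam(diffusion.parameters(), lr=lr)

    diffusion.train()
    for _ in range(steps):
        optimizer.zero_grad()      # clear gradients left over from the previous step
        loss = diffusion(images)   # forward pass is assumed to return a scalar loss
        loss.backward()            # compute gradients
        optimizer.step()           # this is the call that actually updates the weights

    # Sampling only runs the reverse (denoising) process; nothing is trained here.
    diffusion.eval()
    with torch.no_grad():
        return diffusion.sample(batch_size=num_samples)


# Usage with the objects defined above (not executed here):
# sampled_images = train_then_sample(diffusion, training_images)
```

Note that samples produced this way start from random noise, so they are new images drawn from what the model has learned rather than reconstructions of particular training images.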

Since training_images and sampled_images do not come close to matching, even after setting timesteps to 20,000, I am confused about a few things:

1. Is diffusion.sample() training the model and returning its predictions on the training data?
2. If I want a model to recreate the input data, I assume the objective should be 'pred_x0'. Is this assumption correct?
3. If I want a model to predict the noise in the data, I assume the objective should be 'pred_noise'. Is this assumption correct?
4. If diffusion.sample() is training the model and returning predictions, how can I use the trained model to output predictions without the training step?
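
On the objective questions, the sketch below shows how the three targets relate in the standard DDPM formulation. It is written from the textbook definitions, not taken from the library source, and the noise level `alpha_bar = 0.7` and all variable names are assumptions chosen for illustration. Under those definitions, each target can be converted into the others once the noisy image `x_t` is known, so the objective selects what the network regresses on rather than what kind of model is being trained.

```python
import numpy as np

rng = np.random.default_rng(0)

x0 = rng.uniform(-1.0, 1.0, size=(3, 32, 32))  # a clean image scaled to [-1, 1]
eps = rng.standard_normal(x0.shape)            # the Gaussian noise that gets added
alpha_bar = 0.7                                # assumed cumulative signal level at some step t

a = np.sqrt(alpha_bar)
s = np.sqrt(1.0 - alpha_bar)

x_t = a * x0 + s * eps   # noisy image the network sees at step t
v = a * eps - s * x0     # the "v" target used by 'pred_v'

# Each target determines the others given x_t:
x0_from_eps = (x_t - s * eps) / a   # 'pred_noise' -> clean image
x0_from_v = a * x_t - s * v         # 'pred_v'     -> clean image
eps_from_v = s * x_t + a * v        # 'pred_v'     -> noise

print(np.allclose(x0_from_eps, x0), np.allclose(x0_from_v, x0), np.allclose(eps_from_v, eps))
```

If this reading is right, none of the three objectives by itself makes `sample()` reproduce a specific input, since sampling starts from pure noise regardless of the target.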

lucidrains / denoising-diffusion-pytorch

Trouble Getting Module To Work #179