facebookincubator / AITemplate

AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Apache License 2.0

The result of the StableDiffusionAITPipeline cannot be fixed with a seed value. #271

Open ssotabe opened 1 year ago

ssotabe commented 1 year ago

Even when a seeded generator is passed to the pipeline, the generated result changes slightly each time.

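import random
import numpy as np
import torch

def torch_fix_seed(seed=42):
    # Python random
    random.seed(seed)
    # Numpy
    np.random.seed(seed)
    # Pytorch
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    # This is a function and must be called; assigning True to it only
    # overwrites the attribute. Some CUDA ops may then also require the
    # CUBLAS_WORKSPACE_CONFIG environment variable to be set.
    torch.use_deterministic_algorithms(True)

torch_fix_seed()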

Fixing the seeds with this code does not seem to make the result reproducible.

Example: two generated images (attachments not preserved)
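To confirm that two runs really differ, rather than judging by eye, the saved outputs can be compared byte for byte. This is a minimal sketch, assuming the image from each run was copied aside under its own name. The file names `run_a.png` and `run_b.png` and the helpers `file_digest` and `same_output` are hypothetical and not part of the demo.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_output(path_a, path_b):
    """Return True if both files are byte-identical (compared via digest)."""
    return file_digest(path_a) == file_digest(path_b)


if __name__ == "__main__":
    # Hypothetical file names: copy example_ait.png aside after each run.
    a, b = Path("run_a.png"), Path("run_b.png")
    if a.exists() and b.exists():
        print("identical" if same_output(a, b) else "different")
```

Matching digests mean the two files are identical. Differing digests only show that something changed, not how large the difference is.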

Where is the remaining randomness coming from?
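As general background, and not a diagnosis of this issue: seeding only controls random number generation. A computation that accumulates partial results in an order that can vary between runs may still give slightly different values, because floating-point addition is not associative. Here is a minimal pure-Python sketch of that effect. The helper `sum_in_order` is illustrative only, and nothing here is taken from AIT.

```python
def sum_in_order(values, order):
    """Add values one by one, visiting them in the given index order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Grouping changes the rounded result of a three-term sum.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)  # False

# The same four numbers, accumulated in two different orders.
values = [1e16, 1.0, -1e16, 1.0]
forward = sum_in_order(values, [0, 1, 2, 3])
reordered = sum_in_order(values, [0, 2, 1, 3])
print(forward, reordered)
```

Whether an effect like this is involved here would need to be checked against the kernels that AIT actually generates.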

hl475 commented 1 year ago

Thanks for reporting! @terrychenism, can you please help take a look?

terrychenism commented 1 year ago

Where did you apply torch_fix_seed?

ssotabe commented 1 year ago

In demo.py. I tried two placements, one before the pipe is created and one just before the image is generated, but neither made any difference.

def torch_fix_seed(seed=42):
    # Python random
    random.seed(seed)
    # Numpy
    np.random.seed(seed)
    # Pytorch
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    # Must be called as a function; assigning True only overwrites the attribute.
    torch.use_deterministic_algorithms(True)

def run(local_dir, width, height, prompt, benchmark):
    torch_fix_seed() # <-
    pipe = StableDiffusionAITPipeline.from_pretrained(
        local_dir,
        scheduler=EulerDiscreteScheduler.from_pretrained(
            local_dir, subfolder="scheduler"
        ),
        revision="fp16",
        torch_dtype=torch.float16,
    ).to("cuda")
    generator = torch.Generator("cuda").manual_seed(1)
    with torch.autocast("cuda"):
        torch_fix_seed() # <-
        image = pipe(prompt, height, width, generator=generator).images[0]
        if benchmark:
            t = benchmark_torch_function(10, pipe, prompt, height=height, width=width)
            print(f"sd e2e: {t} ms")

    image.save("example_ait.png")

if __name__ == "__main__":
    run()

terrychenism commented 1 year ago

What if you put torch_fix_seed inside main?

ssotabe commented 1 year ago

There was no change. Code:

def torch_fix_seed(seed=42):
    # Python random
    random.seed(seed)
    # Numpy
    np.random.seed(seed)
    # Pytorch
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    # Must be called as a function; assigning True only overwrites the attribute.
    torch.use_deterministic_algorithms(True)

@click.command()
@click.option(
    "--local-dir",
    default="./tmp/diffusers-pipeline/stabilityai/stable-diffusion-v2",
    help="the local diffusers pipeline directory",
)
@click.option("--width", default=512, help="Width of generated image")
@click.option("--height", default=512, help="Height of generated image")
@click.option("--prompt", default="A vision of paradise, Unreal Engine", help="prompt")
@click.option(
    "--benchmark", type=bool, default=False, help="run stable diffusion e2e benchmark"
)
def run(local_dir, width, height, prompt, benchmark):

    pipe = StableDiffusionAITPipeline.from_pretrained(
        local_dir,
        scheduler=EulerDiscreteScheduler.from_pretrained(
            local_dir, subfolder="scheduler"
        ),
        revision="fp16",
        torch_dtype=torch.float16,
    ).to("cuda")
    generator = torch.Generator("cuda").manual_seed(1)
    with torch.autocast("cuda"):
        image = pipe(prompt, height, width, generator=generator).images[0]
        if benchmark:
            t = benchmark_torch_function(10, pipe, prompt, height=height, width=width)
            print(f"sd e2e: {t} ms")

    image.save("example_ait.png")

def main():
    torch_fix_seed()
    run()

if __name__ == "__main__":
    main()

CanyonWind commented 1 year ago

Hi, any update on this? Is there any chance Triton is causing the issue?

terrychenism commented 1 year ago

You can try disabling use_mem_eff: https://github.com/facebookincubator/AITemplate/blob/main/examples/05_stable_diffusion/src/modeling/attention.py#L72

Stax124 commented 1 year ago

> you can try to disable use_mem_eff https://github.com/facebookincubator/AITemplate/blob/main/examples/05_stable_diffusion/src/modeling/attention.py#L72

I tried that, but the output is still non-deterministic. The latents are generated identically with the fixed seed, so the nondeterminism must be somewhere in the AIT code rather than in the SD example: latent generation is the only point where torch.randn is called, and the correct generator is passed down to the step loop. I will try to do some more debugging this week, as I really need this to work for VoltaML.
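The comparison described above can be organised as a stage-by-stage check: record each intermediate result from two runs with the same seed, then report the first stage that differs. This is a minimal sketch, assuming the intermediates have already been dumped to flat lists of floats. The stage names, the numbers, and the helpers `max_abs_diff` and `first_divergence` are hypothetical and not part of AIT or diffusers.

```python
def max_abs_diff(xs, ys):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(xs, ys)), default=0.0)


def first_divergence(run_a, run_b, tol=0.0):
    """Return (stage, diff) for the first stage differing by more than tol, else None.

    run_a and run_b are lists of (stage_name, values) in execution order.
    """
    for (name_a, vals_a), (name_b, vals_b) in zip(run_a, run_b):
        if name_a != name_b:
            raise ValueError(f"stage mismatch: {name_a} vs {name_b}")
        diff = max_abs_diff(vals_a, vals_b)
        if diff > tol:
            return name_a, diff
    return None


# Made-up numbers for illustration: the latents match, a later stage does not.
run_a = [("latents", [0.5, -1.25]), ("unet_step_0", [0.125, 0.75]), ("vae_decode", [0.3, 0.9])]
run_b = [("latents", [0.5, -1.25]), ("unet_step_0", [0.125, 0.5]), ("vae_decode", [0.3, 0.8])]
print(first_divergence(run_a, run_b))
```

With real tensors, the per-stage difference could be computed directly in torch, for example with `(a - b).abs().max()`, instead of converting to lists first.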