bes-dev / stable_diffusion.openvino

Apache License 2.0

Suggestion: Add --seed for generating consistent images and testing the effect of hyper-params #3

Closed Norod closed 2 years ago

Norod commented 2 years ago

For example, the following implementation will allow a seed to be specified:


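import argparse

import cv2
import numpy as np
from diffusers import LMSDiscreteScheduler

# StableDiffusion is this repo's pipeline class; it is assumed to be defined
# or imported elsewhere in the same file.

def main(args):
    # Seed NumPy's global RNG so NumPy-based sampling in the pipeline is repeatable
    if args.seed is not None:
        np.random.seed(args.seed)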
    scheduler = LMSDiscreteScheduler(
        beta_start=args.beta_start,
        beta_end=args.beta_end,
        beta_schedule=args.beta_schedule,
        tensor_format="np"
    )
    stable_diffusion = StableDiffusion(
        model=args.model,
        scheduler=scheduler,
        tokenizer=args.tokenizer
    )
    image = stable_diffusion(
        prompt=args.prompt,
        num_inference_steps=args.num_inference_steps,
        guidance_scale=args.guidance_scale,
        eta=args.eta
    )
    cv2.imwrite(args.output, image)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # pipeline configuration
    parser.add_argument("--model", type=str, default="bes-dev/stable-diffusion-v1-4-openvino", help="model name")
    # scheduler params
    parser.add_argument("--beta-start", type=float, default=0.00085, help="LMSDiscreteScheduler::beta_start")
    parser.add_argument("--beta-end", type=float, default=0.012, help="LMSDiscreteScheduler::beta_end")
    parser.add_argument("--beta-schedule", type=str, default="scaled_linear", help="LMSDiscreteScheduler::beta_schedule")
    # diffusion params
    parser.add_argument("--num-inference-steps", type=int, default=32, help="num inference steps")
    parser.add_argument("--guidance-scale", type=float, default=7.5, help="guidance scale")
    parser.add_argument("--eta", type=float, default=0.0, help="eta")
    # tokenizer
    parser.add_argument("--tokenizer", type=str, default="openai/clip-vit-large-patch14", help="tokenizer")
    # prompt
    parser.add_argument("--prompt", type=str, default="Street-art painting of Emilia Clarke in style of Banksy, photorealism", help="prompt")
    # seed
    parser.add_argument("--seed", type=int, default=None, help="Random seed for generating consistent images per prompt")
    # output name
    parser.add_argument("--output", type=str, default="output.png", help="output image name")
    args = parser.parse_args()
    main(args)

This makes it possible to compare 10 vs. 32 inference steps for the same prompt and seed:

Seed: 420 | Steps: 32 (image: output_32_420)
Seed: 420 | Steps: 10 (image: output_10_420)
Seed: 1000 | Steps: 32 (image: output_32_1000)
Seed: 1000 | Steps: 10 (image: output_10_1000)
Seed: 42 | Steps: 10 (image: output_10_42)
Seed: 42 | Steps: 32 (image: output_32_42)
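
To illustrate why a fixed seed makes runs comparable, here is a minimal sketch of the seeding behaviour on its own. It assumes the pipeline draws its starting noise from NumPy's global RNG. The helper name `initial_latents` and the latent shape `(1, 4, 64, 64)` are hypothetical, chosen only for illustration, and are not taken from this repo.

```python
import numpy as np


def initial_latents(seed, shape=(1, 4, 64, 64)):
    """Draw a starting noise tensor, optionally seeding NumPy's global RNG first.

    Hypothetical helper: both the name and the default shape are assumptions
    made for this example.
    """
    if seed is not None:
        np.random.seed(seed)
    return np.random.randn(*shape)


same_a = initial_latents(420)
same_b = initial_latents(420)
other = initial_latents(1000)

# The same seed reproduces the same noise; a different seed does not.
print(np.array_equal(same_a, same_b))  # True
print(np.array_equal(same_a, other))   # False
```

With the starting noise held fixed this way, differences between two outputs can be attributed to the parameter that was changed, such as the number of steps.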

bes-dev commented 2 years ago

Done