mlcommons / mobile_app_open

Mobile App Open
https://mlcommons.org/en/groups/inference-mobile/
Apache License 2.0
47 stars 23 forks

feat: added stable diffusion pipeline (WIP) #905

Closed RSMNYS closed 2 months ago

RSMNYS commented 2 months ago
github-actions[bot] commented 2 months ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

freedomtan commented 2 months ago
  1. Please also try to make the main.cc (https://github.com/mlcommons/mobile_app_open/blob/master/flutter/cpp/binary/main.cc) work
  2. fp16/dynamic range quant models might be faster, https://github.com/freedomtan/keras_cv_stable_diffusion_to_tflite/blob/main/convert_to_tflite_models_with_dynamic_range.py

We have the Keras diffusion/unet model at https://github.com/mlcommons/mobile_model_closed/releases/tag/alpha; concatenate the two parts into one. It should then be trivial to load the result as the stable diffusion pipeline's diffusion_model.

For running TFLite with one diffusion/unet model, see my previous example: https://github.com/freedomtan/keras_cv_stable_diffusion_to_tflite/blob/main/text_to_image_with_tflite_models_from_huggingface.ipynb
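
As a rough illustration of the dynamic-range quantization path mentioned above, here is a minimal sketch using a tiny stand-in Keras model (not the actual diffusion/unet model): convert with `tf.lite.Optimize.DEFAULT`, then run the result with the TFLite interpreter.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model; the real pipeline would use the Keras diffusion/unet model.
model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(4,))])

# Dynamic range quantization: weights stored as int8, activations stay float.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()

# Run the quantized model with the TFLite interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.random.rand(1, 4).astype(np.float32))
interpreter.invoke()
result = interpreter.get_tensor(out["index"])
print(result.shape)  # (1, 2)
```

The same convert-then-invoke flow applies to the real model, modulo its actual input/output signatures.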

anhappdev commented 2 months ago

@RSMNYS You can test whether your implementation works on desktop by running these commands after updating the paths:

```shell
bazel build -c opt --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 //flutter/cpp/binary:main //mobile_back_tflite:tflitebackend
```

```shell
bazel-bin/flutter/cpp/binary/main EXTERNAL stable_diffusion \
  --mode=PerformanceOnly \
  --output_dir="output" \
  --model_file="/Users/anh/Downloads/stable-diffusion-android/mobile_model_closed" \
  --lib_path="bazel-bin/mobile_back_tflite/cpp/backend_tflite/libtflitebackend.so" \
  --input_tfrecord="mobile_back_apple/dev-resources/stable_diffusion/coco_gen_full.tfrecord" \
  --input_clip_model="mobile_back_apple/dev-resources/stable_diffusion/clip_model_512x512.tflite"
```

RSMNYS commented 2 months ago

@anhappdev can you please provide these resources:

"mobile_back_apple/dev-resources/stable_diffusion/coco_gen_full.tfrecord"
"mobile_back_apple/dev-resources/stable_diffusion/clip_model_512x512.tflite"

RSMNYS commented 2 months ago

found here: https://drive.google.com/drive/folders/10zCF7_ctUIM7xVPyhw5P60utnvWVUcSy?usp=sharing Thanks

@anhappdev what does the coco_gen_full.tfrecord contain? The tokenized prompts?

anhappdev commented 2 months ago

> @anhappdev what does the coco_gen_full.tfrecord contain? The tokenized prompts?

@RSMNYS You can find the description here: https://github.com/mlcommons/mobile_app_open/blob/239f92c615dd36eaba25198cd49ee5c8fbf197a2/flutter/cpp/datasets/coco_gen_utils/generate_tfrecords.py#L60-L72
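
For inspecting what a tfrecord file actually contains, a generic sketch like the following lists the feature keys of each `tf.train.Example` without assuming any field names (the real field layout is defined in the generate_tfrecords.py script; the demo record and its "caption" field here are placeholders):

```python
import tensorflow as tf

def list_feature_keys(tfrecord_path, limit=1):
    """Print the feature keys of the first few records in a TFRecord file."""
    for raw in tf.data.TFRecordDataset(tfrecord_path).take(limit):
        example = tf.train.Example()
        example.ParseFromString(raw.numpy())
        print(sorted(example.features.feature.keys()))

# Build a tiny self-made record as a stand-in for coco_gen_full.tfrecord.
example = tf.train.Example(features=tf.train.Features(feature={
    "caption": tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[b"a photo of a cat"])),
}))
with tf.io.TFRecordWriter("/tmp/demo.tfrecord") as w:
    w.write(example.SerializeToString())

list_feature_keys("/tmp/demo.tfrecord")  # ['caption']
```

Pointing `list_feature_keys` at the real coco_gen_full.tfrecord would show its actual fields.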

RSMNYS commented 2 months ago

@freedomtan @anhappdev for the stable diffusion process we need unconditional tokens. I guess we need to get them from the CLIP model we used to get the encoded prompts? If yes, do we need to include them in the TFRecord?

RSMNYS commented 2 months ago

> @freedomtan @anhappdev for the stable diffusion process we need unconditional tokens. I guess we need to get them from the CLIP model we used to get the encoded prompts? If yes, do we need to include them in the TFRecord?

Actually, those can be generated on the fly, as far as I can see: 49406 followed by 49407 repeated up to the max allowed size.

freedomtan commented 2 months ago

> @freedomtan @anhappdev for the stable diffusion process we need unconditional tokens. I guess we need to get them from the CLIP model we used to get the encoded prompts? If yes, do we need to include them in the TFRecord?

The unconditional context is the output of the text encoder.
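
For context: in classifier-free guidance, the unconditional context and the prompt context each drive one UNet pass, and the two noise predictions are blended. A minimal numpy sketch (the 7.5 guidance scale is just a common illustrative default, not a value mandated by this pipeline):

```python
import numpy as np

def classifier_free_guidance(noise_uncond, noise_cond, guidance_scale=7.5):
    """Blend the UNet's unconditional and conditional noise predictions."""
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)

# With scale 1.0 the result is exactly the conditional prediction.
uncond = np.zeros((1, 4))
cond = np.ones((1, 4))
print(classifier_free_guidance(uncond, cond, 1.0))  # [[1. 1. 1. 1.]]
```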

RSMNYS commented 2 months ago

> @freedomtan @anhappdev for the stable diffusion process we need unconditional tokens. I guess we need to get them from the CLIP model we used to get the encoded prompts? If yes, do we need to include them in the TFRecord?
>
> The unconditional context is the output of the text encoder.

I was talking about the unconditional tokens, which we then pass to the text encoder to get the unconditional context. But as I've mentioned above, those can be generated on the fly: [49406, 49407, ..., 49407].
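
The on-the-fly construction described above can be sketched as follows (77 is the usual CLIP sequence length for SD v1.x; treat it as an assumption here):

```python
START_TOKEN = 49406  # CLIP start-of-text token
END_TOKEN = 49407    # CLIP end-of-text / padding token

def unconditional_tokens(max_length=77):
    """Empty-prompt token sequence: start token followed by end/pad tokens."""
    return [START_TOKEN] + [END_TOKEN] * (max_length - 1)

tokens = unconditional_tokens()
print(len(tokens), tokens[:3])  # 77 [49406, 49407, 49407]
```

Feeding this sequence to the text encoder yields the unconditional context, so nothing extra needs to be stored in the tfrecord.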

RSMNYS commented 2 months ago

@freedomtan @anhappdev Guys, the flow is working, at least as tested on desktop. Remaining work:

  1. Continue with the mobile part.
  2. Compose the 2 SD models into 1.
  3. Adjust the stable diffusion process to use the single SD model.

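
Item 2 (composing the two SD model halves into one) could look roughly like this with the Keras functional API, chaining the second half onto the first. Tiny stand-in models are used here, not the real UNet halves:

```python
import tensorflow as tf

# Stand-ins for the two released model halves.
half_a = tf.keras.Sequential([tf.keras.layers.Dense(8, input_shape=(4,))])
half_b = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(8,))])

# Chain them into a single model that can be saved/converted as one file.
combined = tf.keras.Model(inputs=half_a.input, outputs=half_b(half_a.output))
print(combined.output_shape)  # (None, 2)
```

The combined model can then be passed through the TFLite converter in one step, so the pipeline only has to load a single diffusion model.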
sonarcloud[bot] commented 2 months ago

Quality Gate passed

Issues
41 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

anhappdev commented 2 months ago

Confirmed that the SD pipeline works with main.cc now. I will merge this. For new work, please open a new PR.

freedomtan commented 2 months ago

For dynamic range quant, https://github.com/freedomtan/keras_cv_stable_diffusion_to_tflite/blob/main/text_to_image_with_tflite_models_from_huggingface.ipynb could be used for testing (that's v1.4 with dynamic range quantization).