See the discussions about lower VRAM usage / performance:
We could use https://github.com/william-murray1204/stable-diffusion-cpp-python?tab=readme-ov-file#flux-image-generation in combination with quantized models like https://huggingface.co/aifoundry-org/FLUX.1-schnell-Quantized.
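Combining the two would look roughly like this, based on the FLUX example in that README (a sketch; all file paths are placeholders for the downloaded weights):

```python
from stable_diffusion_cpp import StableDiffusion

# Placeholder paths: the quantized FLUX GGUF plus the clip_l / t5xxl / vae files
# that the stable-diffusion-cpp-python README's FLUX section says are required.
sd = StableDiffusion(
    diffusion_model_path="./models/flux1-schnell-Q4_0.gguf",  # used instead of model_path for FLUX
    clip_l_path="./models/clip_l.safetensors",
    t5xxl_path="./models/t5xxl_fp16.safetensors",
    vae_path="./models/ae.safetensors",
)

images = sd.txt_to_img(
    prompt="a lovely cat holding a sign that says 'flux.cpp'",
    sample_steps=4,        # schnell is distilled for very few steps
    cfg_scale=1.0,         # the README recommends cfg_scale=1 for FLUX
    sample_method="euler", # and the euler sampler
)
images[0].save("flux.png")
```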
Installation instructions: download CUDA 12.6 (or another version) from https://developer.nvidia.com/cuda-downloads and make sure everything is set correctly on the system PATH. Then build and install the bindings with CUDA enabled (PowerShell):

```powershell
$env:CMAKE_ARGS="-DSD_CUBLAS=ON -DCMAKE_GENERATOR_TOOLSET='cuda=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6'"
pip install stable-diffusion-cpp-python --upgrade --force-reinstall --no-cache-dir --verbose
```
In case of a CMake error ("No CUDA toolset found"), do this: https://stackoverflow.com/questions/56636714/cuda-compile-problems-on-windows-cmake-error-no-cuda-toolset-found
Also try downloading with `local_dir` instead of `cache_dir`: https://huggingface.co/docs/huggingface_hub/en/guides/download#download-files-to-a-local-folder
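For example (the filename below is illustrative, check the model repo's file list for the exact GGUF name):

```python
from huggingface_hub import hf_hub_download

# Put the quantized FLUX GGUF in ./models instead of the opaque HF cache.
path = hf_hub_download(
    repo_id="aifoundry-org/FLUX.1-schnell-Quantized",
    filename="flux1-schnell-Q4_0.gguf",  # placeholder name, verify in the repo
    local_dir="./models",
)
print(path)  # local path of the downloaded file
```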
Let's try without stable-diffusion-cpp, as installing it is hell... For now, stick with pure Hugging Face packages: https://gist.github.com/AmericanPresidentJimmyCarter/873985638e1f3541ba8b00137e7dacd9
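As far as I can tell, the gist's general approach is to quantize the heavy FLUX components with optimum-quanto while staying entirely in Hugging Face packages. A minimal sketch along those lines (qfloat8 and the exact components quantized are my assumptions, check the gist for its actual settings):

```python
import torch
from diffusers import FluxPipeline
from optimum.quanto import freeze, qfloat8, quantize

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)

# Quantize the two large components in place, then freeze the quantized weights.
quantize(pipe.transformer, weights=qfloat8)
freeze(pipe.transformer)
quantize(pipe.text_encoder_2, weights=qfloat8)  # the T5-XXL encoder
freeze(pipe.text_encoder_2)

pipe.enable_model_cpu_offload()  # keep idle components off the GPU to save VRAM

image = pipe(
    "a photo of a misty forest at dawn",
    num_inference_steps=4,  # schnell needs only ~4 steps
    guidance_scale=0.0,     # schnell is distilled to run without guidance
).images[0]
image.save("flux-schnell-q8.png")
```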
Another option: https://huggingface.co/HighCWu/FLUX.1-dev-4bit
We could save our own 4-bit model using: https://gist.github.com/Stella2211/10f5bd870387ec1ddb9932235321068e / https://huggingface.co/Kijai/flux-fp8/discussions/7 (a sketch of this is below),
or use these 4-bit safetensors: https://huggingface.co/argmaxinc/mlx-FLUX.1-schnell-4bit-quantized/tree/main (note: MLX weights, so Apple-silicon oriented).
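A rough sketch of saving our own 4-bit checkpoint, using diffusers' bitsandbytes integration (my assumptions: diffusers >= 0.31 with bitsandbytes installed; the linked gist/discussion may take a different route, e.g. fp8):

```python
import torch
from diffusers import BitsAndBytesConfig, FluxTransformer2DModel

# Load only the transformer (the bulk of the model) as 4-bit NF4 and save it,
# so later runs can skip the on-the-fly quantization.
nf4 = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=nf4,
    torch_dtype=torch.bfloat16,
)
transformer.save_pretrained("./flux1-dev-transformer-nf4")
```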
Text-to-image leaderboard: https://artificialanalysis.ai/text-to-image/arena
Medium will be released on 29 October.
We should make sure we find the model with the best trade-off between speed and quality of results. To test this, we should run the same prompt (with a fixed seed) on all candidate models and compare the outputs; a sketch of such a harness follows the links below.
https://huggingface.co/black-forest-labs/FLUX.1-dev
https://huggingface.co/black-forest-labs/FLUX.1-schnell
https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux
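A minimal comparison harness for that test, assuming diffusers' FluxPipeline on a CUDA GPU (step counts and guidance values below are common settings for each model, not measured optima):

```python
import time

import torch
from diffusers import FluxPipeline

PROMPT = "a red fox in a snowy forest, golden hour"  # any fixed test prompt
CANDIDATES = {
    "black-forest-labs/FLUX.1-schnell": {"num_inference_steps": 4, "guidance_scale": 0.0},
    "black-forest-labs/FLUX.1-dev": {"num_inference_steps": 28, "guidance_scale": 3.5},
}

for repo_id, kwargs in CANDIDATES.items():
    pipe = FluxPipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # trade some speed for lower VRAM

    generator = torch.Generator("cpu").manual_seed(0)  # fixed seed for comparability
    start = time.perf_counter()
    image = pipe(PROMPT, generator=generator, **kwargs).images[0]
    elapsed = time.perf_counter() - start

    name = repo_id.split("/")[-1]
    image.save(f"{name}.png")
    print(f"{name}: {elapsed:.1f}s")

    # Free GPU memory before loading the next candidate.
    del pipe
    torch.cuda.empty_cache()
```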