aredden / flux-fp8-api

Flux diffusion model implementation that uses quantized fp8 matmul, with the remaining layers using faster half-precision accumulate, which is ~2x faster on consumer devices.
Apache License 2.0
202 stars, 21 forks

Docker image support. #17

Open ShivamB25 opened 1 month ago

ShivamB25 commented 1 month ago

Please. I can also try making an image with NVIDIA CUDA as the base layer.

0xtempest commented 1 month ago

I have a Docker setup working with this, and I have optimized API changes as well. I'm planning to make a large PR soon, once I fix all the LoRA issues.

aredden commented 1 month ago

Ah cool! @0xtempest If it's a very large PR, could it be done in pieces? I would like to be able to test each change individually, since with a very large PR it can be tough to figure out what is doing what. Also yeah, Docker should be relatively simple to set up, since you can just use a PyTorch container and install the requirements.
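
A minimal sketch of that approach might look like the following. The base image tag, the `requirements.txt` filename, and the `/app` working directory are assumptions for illustration, not details confirmed in this thread. No start command is set, because the entry point is not specified here.

```dockerfile
# Sketch only: the image tag, file names, and paths below are assumptions.
# Pick a PyTorch image whose CUDA version matches your driver and GPU.
FROM pytorch/pytorch:2.4.0-cuda12.4-cudnn9-runtime

WORKDIR /app

# Install Python dependencies first so this layer stays cached across code changes.
# Assumes the repository ships a requirements.txt at its root.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the repository into the image.
COPY . .

# No CMD is set here; start the server with whatever entry point the repository documents.
```

Running a container built from this with GPU access needs the NVIDIA Container Toolkit on the host and the `--gpus all` flag on `docker run`.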

0xtempest commented 1 month ago

Yeah, I can break it down into smaller PRs. It will probably be early next week, as there's a lot to clean up.

ShivamB25 commented 1 month ago

@0xtempest Could you please keep your fork updated? I would love to test it out.

ShivamB25 commented 1 month ago

@0xtempest Is that possible? I would also like to try helping with the LoRA issues.