opea-project / GenAIComps

GenAI components at micro-service level; GenAI service composer to create mega-service
Apache License 2.0

Add first-pass at stability tritonserver-based imagegen comp #227

Open acwrenn opened 2 weeks ago

acwrenn commented 2 weeks ago

Description

This PR adds a new component for image generation using Stable Diffusion served by Triton Inference Server. The API is currently a WIP; lots of feedback appreciated!

Issues

NA

Type of change

Dependencies

Triton Inference Server serving the Stable Diffusion model "Habana/stable-diffusion-2"
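
For context, Triton Inference Server exposes the standard KServe v2 HTTP inference API, so a client-side call against this dependency might look roughly like the sketch below. The model name, tensor names, and output handling are assumptions for illustration, not the actual config shipped in this PR:

```python
import requests

TRITON_URL = "http://localhost:8000"  # assumed Triton HTTP endpoint
MODEL_NAME = "stable_diffusion"       # hypothetical name of the model in the Triton model repository

# Triton's KServe-v2 HTTP inference API: POST /v2/models/<model>/infer
payload = {
    "inputs": [
        {
            "name": "PROMPT",  # hypothetical input tensor name
            "shape": [1],
            "datatype": "BYTES",
            "data": ["an astronaut riding a horse on the moon"],
        }
    ],
    "outputs": [{"name": "GENERATED_IMAGE"}],  # hypothetical output tensor name
}

resp = requests.post(f"{TRITON_URL}/v2/models/{MODEL_NAME}/infer", json=payload, timeout=120)
resp.raise_for_status()
result = resp.json()

# The output tensor would carry the generated image (e.g. a base64-encoded PNG),
# depending on how the model's config.pbtxt defines its outputs.
print(result["outputs"][0]["name"], result["outputs"][0]["shape"])
```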

Tests

WIP

acwrenn commented 2 weeks ago

Hey maintainers, this is clearly not in a high-quality state yet, but I want to get architecture/code-location/general feedback on the approach before I spend time polishing it. Any notes would be greatly appreciated!

This Stable Diffusion model powers a card in the Intel AI Explorer, so upstreaming it to OPEA seemed like a reasonable next step.

https://or-dev.dcs-tools-experiments.infra-host.com/explore

mkbhanda commented 1 week ago

Please create distinct microservices for the pipeline components; for instance, the Triton server should be in its own microservice, and likewise data cleaning, embedding, the model server, the model, etc. GenAIExamples contains pipelines composed of microservices from GenAIComps, and we need e2e tests for everything.

acwrenn commented 1 week ago

Please create distinct microservices for the pipeline components; for instance, the Triton server should be in its own microservice, and likewise data cleaning, embedding, the model server, the model, etc. GenAIExamples contains pipelines composed of microservices from GenAIComps, and we need e2e tests for everything.

This is an interesting question, because the tritonserver portion acts as a replacement for the TGI services in the LLM examples, which are not distinct comps contained in this repo.

So there seems to be a clear distinction between a service that performs inference and a service that glues other parts together. I don't think that forcing the business-logic service onto a host with accelerators is a good idea.

So I guess my question is one of clarity: should there be a comp that is JUST the model server, with the ImageGen API container kept as a separate comp?
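
To make the proposed split concrete, here is a minimal sketch of what the thin ImageGen API comp could look like if the Triton model server lives in its own container. FastAPI is used purely for illustration, and the endpoint path, environment variables, and tensor names are assumptions rather than the PR's actual code:

```python
import os

import requests
from fastapi import FastAPI
from pydantic import BaseModel

# Hypothetical split: this container holds only the business-logic/API layer
# and forwards inference requests to a separately deployed Triton model-server comp.
TRITON_ENDPOINT = os.getenv("TRITON_ENDPOINT", "http://imagegen-tritonserver:8000")
MODEL_NAME = os.getenv("TRITON_MODEL_NAME", "stable_diffusion")  # assumed model name

app = FastAPI()


class ImageGenRequest(BaseModel):
    prompt: str


@app.post("/v1/images/generations")
def generate(req: ImageGenRequest):
    # Delegate the heavy lifting to the accelerator-backed Triton comp via
    # Triton's KServe-v2 HTTP inference API.
    payload = {
        "inputs": [
            {"name": "PROMPT", "shape": [1], "datatype": "BYTES", "data": [req.prompt]}
        ]
    }
    resp = requests.post(
        f"{TRITON_ENDPOINT}/v2/models/{MODEL_NAME}/infer", json=payload, timeout=300
    )
    resp.raise_for_status()
    # The response layout depends entirely on the Triton model config;
    # it is passed through unchanged here for illustration.
    return resp.json()
```

Under this arrangement, only the Triton comp needs to be scheduled on an accelerator host, while the API comp can run anywhere.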