deepinfra / deepinfra-node

Official TypeScript wrapper for DeepInfra Inference API
https://deepinfra.com/
MIT License

Support regular TextToImage #2

Closed ichernev closed 3 months ago

ichernev commented 4 months ago

It's not crystal clear from the docs, but there are two interfaces for TextToImage (ImageGeneration) models: a custom, model-specific one (used by SDXL) and a regular one that is shared across many models.

Currently SDXL is very popular, so it makes sense to keep its custom input specification intact. However, it should be built on a base model with Custom in the name that takes the concrete SDXL type as a type argument (so that other, similar custom models can be added in the future). I'm not 100% sure how to do this in TS, but I can do some research.
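
One way the generic base could look in TypeScript, as a rough sketch only. Every name here (`CustomModelBase`, `SdxlInput`, `Sdxl`, `buildBody`), the input fields, and the model identifier are assumptions made for illustration, not the package's actual API:

```typescript
// Hypothetical input type for one concrete custom model.
// The fields are illustrative, not the real SDXL specification.
interface SdxlInput {
  prompt: string;
  width?: number;
  height?: number;
}

// Hypothetical generic base: TInput is the model-specific input type,
// so each custom model keeps its own input specification.
class CustomModelBase<TInput extends object> {
  readonly modelName: string;

  constructor(modelName: string) {
    this.modelName = modelName;
  }

  // Wraps the typed input in the request body shape used by custom models.
  buildBody(input: TInput): { input: TInput } {
    return { input };
  }
}

// The concrete wrapper only pins the type argument.
class Sdxl extends CustomModelBase<SdxlInput> {
  constructor() {
    super("stability-ai/sdxl"); // assumed model identifier
  }
}

const sdxlBody = new Sdxl().buildBody({ prompt: "a red bicycle", width: 1024 });
```

Adding another custom model would then mean declaring its input interface and one small subclass, and the compiler rejects inputs that do not match the declared type.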

And for the regular TextToImage (named ImageGeneration here) models you can have the ImageGenerationBaseModel, which has a common API across many models.

ichernev commented 4 months ago

The main difference between the COG-based custom models and the other models is that Cog expects all the input parameters under the input key, whereas regular text-to-image expects the arguments directly in the body (as embeddings and text-generation do).
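
To make the difference concrete, here is a minimal sketch of the two request body shapes. The helper names (`cogBody`, `regularBody`) and the parameter names (`prompt`, `num_images`) are hypothetical and chosen only for illustration:

```typescript
type Params = Record<string, unknown>;

// Custom (Cog-based) models: every parameter is nested under `input`.
function cogBody(params: Params): { input: Params } {
  return { input: { ...params } };
}

// Regular text-to-image models: parameters sit at the top level of the body.
function regularBody(params: Params): Params {
  return { ...params };
}

const exampleParams: Params = { prompt: "a red bicycle", num_images: 1 };

const cogJson = JSON.stringify(cogBody(exampleParams));
// cogJson is '{"input":{"prompt":"a red bicycle","num_images":1}}'

const regularJson = JSON.stringify(regularBody(exampleParams));
// regularJson is '{"prompt":"a red bicycle","num_images":1}'
```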

ovuruska commented 3 months ago

Thanks for the clarifications, @ichernev. Your input has been invaluable in enhancing the functionality and usability of the package.

The improvements you've suggested, including updating the documentation, adding a base model named Custom based on the SDXL type, and conducting research on how to implement this in TypeScript, will all be incorporated in the upcoming release by this weekend.