invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0
23.79k stars, 2.45k forks

Translating stable-diffusion to another language #516

Closed gakowalski closed 2 years ago

gakowalski commented 2 years ago

Hello, is it possible to introduce support for a language other than English? If so, how? Is it necessary to retrain on the entire dataset with translated image labels?

hipsterusername commented 2 years ago

Because the model is trained by learning the semantic connection between labels and images, I have to imagine the only way to truly create "support" for other languages would be to retrain on the entire dataset.

bmaltais commented 2 years ago

Write the prompt in any language, run it through Google Translate to get English, and use the result as the prompt.

lstein commented 2 years ago

@hipsterusername has it right. To train the model on another language would be incredibly time- and labor-intensive. Basically the same effort as the original work.

On the other hand, @bmaltais has the germ of a great idea. We could stick in calls to the Google Translation AI (or something similar) and if the user enters their access token for payment, we could autodetect the prompt language and convert it to English behind the scenes.
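A minimal sketch of that flow, assuming a pluggable backend. `detect_language` and `translate` are hypothetical callables standing in for whatever translation service gets chosen; they are not part of this repo or of any specific Google API. The wrapper only decides when to translate, and falls back to the original prompt if the backend fails.

```python
from typing import Callable


def to_english_prompt(
    prompt: str,
    detect_language: Callable[[str], str],
    translate: Callable[[str, str, str], str],
) -> str:
    """Return an English version of `prompt`, translating only when needed.

    Both callables are hypothetical placeholders for a real translation
    backend: `detect_language(text)` returns a language code such as "es",
    and `translate(text, source, target)` returns the translated text.
    """
    if not prompt.strip():
        return prompt  # nothing to translate

    source = detect_language(prompt)
    if source == "en":
        return prompt  # already English, pass through unchanged

    try:
        return translate(prompt, source, "en")
    except Exception:
        # If the backend is unavailable, fall back to the original prompt
        # rather than blocking image generation.
        return prompt
```

Passing the backend in as arguments keeps the access token and the choice of service out of the prompt-handling code, so a paid API and a free library could be swapped without touching the caller.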

hipsterusername commented 2 years ago

I'd have folks interested in that use Google Translate manually for a time to see how effective it is and report back. My concern is that the game of "semantic telephone" played between input, Google translation, and output might mean the feature ends up not getting used much, especially since so much of "good prompt engineering" involves just tossing in a bunch of qualifiers and stylistic terms.

Any-Winter-4079 commented 2 years ago

@hipsterusername I think there would be things that are definitely lost, but something like "a photo of Emma Stone in Grand Theft Auto" is going to be pretty simple to translate. Here's an example, writing it in Spanish. The translation to English is perfect.

[Image: Screenshot 2022-09-16 at 01 05 14]

What you say about stylistic terms is an important part to consider, though. Maybe those would need a different treatment (like choosing options from a menu in their own language, to make sure the right words are used in English)?
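A rough sketch of the menu idea, assuming the prompt itself has already been translated to English. The Spanish labels and the `STYLE_MENU_ES` and `apply_styles` names are made up for illustration; the point is only that stylistic terms come from a fixed lookup rather than from machine translation.

```python
# Hypothetical menu: localized labels (Spanish here) mapped to the English
# keywords that would actually be appended to the prompt.
STYLE_MENU_ES = {
    "pintura al óleo": "oil painting",
    "acuarela": "watercolor",
    "muy detallado": "highly detailed",
}


def apply_styles(english_prompt: str, selected_labels: list[str], menu: dict[str, str]) -> str:
    """Append the English keyword for each selected menu label."""
    keywords = [menu[label] for label in selected_labels]
    return ", ".join([english_prompt, *keywords])
```

For example, `apply_styles("a photo of a cat", ["acuarela", "muy detallado"], STYLE_MENU_ES)` gives `"a photo of a cat, watercolor, highly detailed"`, so the stylistic terms always reach the model in the exact English wording.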

What I don't know is how much demand there would be for the feature, since most people who use this repo need to understand English, right? To get the environment set up, sort out errors & performance issues... [I'm assuming people who know programming know English to some degree]

The more we move toward an executable / app / Docker image with less and less setup for the user (setup is currently documented only in English, I believe), the more people with less technical knowledge can use it, which removes the barrier of needing to understand some English. But maybe not, and for many people their own language is simply more convenient. I really don't know :)

Any-Winter-4079 commented 2 years ago

The idea itself (without menus, etc. for keywords) doesn't seem difficult, though. I'm sure there are free libraries that let you translate.