Open Serjio42 opened 1 year ago
Hi, our model cannot generate Arabic letters. Nonetheless, you could try to put those Arabic texts as the surrounding texts to see if the model can learn Arabic text styles. Maybe you could try to train one on Arabic text images using our code?
As I understand, you use https://github.com/clovaai/synthtiger as a part of your pipeline. Unfortunately, it can't synthesize connected (cursive) Arabic text, as their demo image shows (https://user-images.githubusercontent.com/12423224/167302532-dbd5fa60-bcba-4f77-92ee-58bb6efda51c.png). PIL can render Arabic text on an image well, but that is not what you use.
Consequently, images generated with your model will have the same problems (letters not connected with each other), am I right?
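For readers unfamiliar with the problem: Arabic letters change shape depending on their neighbors, and a renderer that skips this shaping step draws every letter in its isolated form, which is the "letters not connected" effect described above. The sketch below illustrates the idea using only Python's standard `unicodedata` module. The helper names `presentation_forms` and `shape` are hypothetical and are not part of synthtiger, PIL, or this repository. The shaper is deliberately naive: it assumes plain letters without diacritics, ignores lam-alef ligatures and the Urdu-specific letters outside the Presentation Forms-B block, and does no right-to-left reordering.

```python
import unicodedata


def presentation_forms():
    """Map each base Arabic letter to its contextual glyphs, read from the
    Arabic Presentation Forms-B block (U+FE70..U+FEFF)."""
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep single-letter entries such as "<initial> 0628";
        # unassigned code points and multi-letter ligatures are skipped.
        if len(parts) != 2:
            continue
        tag, base = parts[0].strip("<>"), chr(int(parts[1], 16))
        if tag in ("isolated", "initial", "medial", "final"):
            forms.setdefault(base, {})[tag] = chr(cp)
    return forms


FORMS = presentation_forms()


def joins_next(ch):
    # Letters with an initial form can connect to the following letter.
    return "initial" in FORMS.get(ch, {})


def joins_prev(ch):
    # Letters with a final form can connect to the preceding letter.
    return "final" in FORMS.get(ch, {})


def shape(word):
    """Naively replace each letter with its contextual presentation form."""
    out = []
    for i, ch in enumerate(word):
        glyphs = FORMS.get(ch)
        if glyphs is None:  # not an Arabic letter covered by the table
            out.append(ch)
            continue
        prev_ok = i > 0 and joins_next(word[i - 1]) and joins_prev(ch)
        next_ok = i + 1 < len(word) and joins_next(ch) and joins_prev(word[i + 1])
        if prev_ok and next_ok:
            key = "medial"
        elif prev_ok:
            key = "final"
        elif next_ok:
            key = "initial"
        else:
            key = "isolated"
        out.append(glyphs.get(key, ch))
    return "".join(out)


# beh + yeh + teh (the word "bayt"); stored in logical order.
word = "\u0628\u064A\u062A"
for glyph in shape(word):
    print(unicodedata.name(glyph))
```

A renderer without shaping effectively draws the isolated form for every letter. Real text stacks (for example HarfBuzz, which PIL can use through Raqm) shape from the font's own tables instead of these legacy code points, so this is only meant to show what "connected" means at the character level.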
I see. Sorry, I'm not very familiar with Arabic text rendering. But since synthtiger cannot render correct Arabic text, as you said, our pipeline may not work. Nevertheless, I think you could prepare your Arabic text data with PIL and train a diffusion model using our framework to see if it works :)
@Question406 Thanks for the response! I've just read your paper and am thinking about what needs to be done to run the training pipeline with Arabic/Urdu data. Suppose an Arabic dataset is ready (each sample consists of a GT image, a masked image, and a text instruction that includes non-Latin text). What else needs to be done to launch the training? The questions that came to mind are:
@Serjio42, Hi!
I think the original CLIP tokenizer doesn't recognize non-Latin text. This multilingual version of CLIP is definitely useful. However, aligning the embedding space of this multilingual CLIP with the one stable-diffusion uses might require some training. I believe this paper could be helpful for this stage of training.
Regarding the character tokenizer, I'm not familiar with Arabic/Urdu, so I can't provide any insights on that.
You could try starting with a small-scale training to test its effectiveness, but I'm uncertain if it would require more computational resources to achieve satisfactory results.
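On the character tokenizer question, one possible starting point is a plain per-code-point vocabulary built from the training texts. The sketch below is only an illustration under that assumption: `CharTokenizer`, its special tokens, and the padding scheme are hypothetical and do not reflect this repository's actual tokenizer. Text is kept in logical (storage) order, since right-to-left layout is a rendering concern rather than a tokenization one.

```python
import unicodedata

PAD, UNK = "<pad>", "<unk>"


class CharTokenizer:
    """Hypothetical character-level tokenizer for Arabic/Urdu strings."""

    def __init__(self, corpus):
        # NFC normalization keeps precomposed and decomposed spellings consistent.
        chars = sorted(
            {c for text in corpus for c in unicodedata.normalize("NFC", text)}
        )
        self.itos = [PAD, UNK] + chars
        self.stoi = {c: i for i, c in enumerate(self.itos)}

    def encode(self, text, max_len):
        """Return exactly max_len ids: truncate long text, pad short text."""
        text = unicodedata.normalize("NFC", text)
        ids = [self.stoi.get(c, self.stoi[UNK]) for c in text[:max_len]]
        return ids + [self.stoi[PAD]] * (max_len - len(ids))

    def decode(self, ids):
        """Drop special tokens and join the remaining characters."""
        return "".join(self.itos[i] for i in ids if self.itos[i] not in (PAD, UNK))


# Two toy training strings, written with escapes to avoid editor bidi issues.
tok = CharTokenizer(["\u0628\u064A\u062A", "\u0627\u0631\u062F\u0648"])
ids = tok.encode("\u0628\u064A\u062A", max_len=5)
print(ids, tok.decode(ids) == "\u0628\u064A\u062A")
```

Whether diacritics should be separate tokens or stripped before training is a design choice that would need to be tested on the actual data.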
Hi. Is it possible to maintain Arabic (right-to-left) text for style transfer? In particular, I have several hundred similar images with Arabic text. Is it possible to generate images in the same style with your repository? Thanks.