aimagelab / VATr

MIT License

How are the results generated #1

Closed by Fran-zis-ka 1 year ago

Fran-zis-ka commented 1 year ago

For each character, is the result generated following the writing sequence (i.e. stroke by stroke), or is it generated as a whole picture? From reading the code, it seems to be the latter. Could you explain in a bit more detail? Many thanks.

vittoriopippi commented 1 year ago

The network generates single words separately, and then we stitch them together, adding a gap of 16 pixels between the words of the sequence.

For example, given the sequence "The intelligence is not artificial", we first split the sentence into words and then generate each word separately. To generate the word "The", we feed the network the visual archetypes of "T", "h", and "e" together with 15 style samples. The network generates an image 32 pixels high and 16 x 3 = 48 pixels wide, in which the word "The" is written in the given style.
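A minimal sketch of the size arithmetic and stitching described above; it is not taken from the repository. The names `placeholder_word_image`, `stitch_words`, and `generate_line` are hypothetical, the network is replaced by a blank placeholder image, and the white gap colour is an assumption.

```python
CHAR_WIDTH = 16   # pixels of width per character (from the description above)
IMG_HEIGHT = 32   # height of every generated word image (from the description above)
WORD_GAP = 16     # pixels inserted between consecutive words (from the description above)
WHITE = 255       # assumed background value; not confirmed by the thread


def placeholder_word_image(word):
    """Hypothetical stand-in for the network: a blank image of the expected size."""
    width = CHAR_WIDTH * len(word)
    return [[WHITE] * width for _ in range(IMG_HEIGHT)]


def stitch_words(images, gap=WORD_GAP):
    """Join word images side by side, inserting `gap` blank columns between them."""
    rows = [[] for _ in range(IMG_HEIGHT)]
    for index, image in enumerate(images):
        for y in range(IMG_HEIGHT):
            if index > 0:
                rows[y].extend([WHITE] * gap)
            rows[y].extend(image[y])
    return rows


def generate_line(sentence):
    """Split a sentence into words, 'generate' each one, and stitch the results."""
    words = sentence.split()
    return stitch_words([placeholder_word_image(w) for w in words])


line = generate_line("The intelligence is not artificial")
print(len(line), len(line[0]))  # height and width of the stitched line
```

With the numbers above, "The" alone is 48 pixels wide, and the five-word sentence has 30 characters and 4 gaps, so the stitched line is 30 x 16 + 4 x 16 = 544 pixels wide.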

Fran-zis-ka commented 1 year ago

Thank you! When generating the character "T", which is written as a horizontal line and a vertical line, will these 2 lines be generated in order, or is the character "T" generated as a whole image?

vittoriopippi commented 1 year ago

A visual archetype is a 16 x 16 pixel black-and-white image. Therefore, the character "T" (like any other character) is entirely contained in a single visual archetype and is not decomposed into separate strokes.
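To illustrate the point, here is a small sketch in which a character maps to one complete 16 x 16 binary grid, with no stroke order stored anywhere. The glyph is hand-drawn for illustration and is not the project's actual archetype data; `draw_t`, `ARCHETYPES`, and `archetypes_for` are hypothetical names.

```python
SIZE = 16  # archetype side length in pixels (from the description above)


def draw_t():
    """Build an illustrative 16 x 16 binary grid for "T" (1 = ink, 0 = background)."""
    grid = [[0] * SIZE for _ in range(SIZE)]
    # Horizontal bar: two rows thick.
    for y in (2, 3):
        for x in range(3, 13):
            grid[y][x] = 1
    # Vertical stem: two columns thick.
    for y in range(4, 14):
        for x in (7, 8):
            grid[y][x] = 1
    return grid


# Each character is looked up as a single finished image, not as a list of strokes.
ARCHETYPES = {"T": draw_t()}


def archetypes_for(word):
    """Return one whole archetype image per character of `word`."""
    return [ARCHETYPES[ch] for ch in word]


for row in ARCHETYPES["T"]:
    print("".join("#" if pixel else "." for pixel in row))
```

The bar and the stem are simply pixels of the same grid; once the image exists, nothing records which was drawn first.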

Fran-zis-ka commented 1 year ago

I understand. Thank you so much for your quick response and helpful guidance. Your expertise and support have been invaluable in resolving this issue :)