prasunroy / stefann

:fire: [CVPR 2020] STEFANN: Scene Text Editor using Font Adaptive Neural Network (official code).
https://prasunroy.github.io/stefann
Apache License 2.0

Support for Chinese Language #14

Open lorisgir opened 3 years ago

lorisgir commented 3 years ago

Hi, thanks for your hard work on this project. It's really cool! I've seen issue #7 but I still have some doubts. I would like to try replacing English text with its corresponding Chinese translation, but how can I do so if characters are stored in JPG files named after their ASCII codes? Chinese is not included in ASCII. Another question regarding Chinese: do I also need to generate new images, one for each character, in the colornet directory? Your help would be much appreciated!

prasunroy commented 3 years ago

Hi, thanks for your interest in our work.

About FANnet:

At this point it's a bit difficult to replace text in one language with text in another. The reason is that we've assumed a character-to-character translation, not word-to-word, so the model expects the source and target texts to be equal in length (i.e. the same number of characters). Translating a word into another language may not produce a word with the same character count. Effectively, we need a character-to-character mapping. This is a major limitation of the current approach, as discussed in the paper. But you can still experiment with the numerals 0-9 in different languages, where a one-to-one mapping is possible.
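As a rough illustration of the equal-length constraint, here is a minimal sketch. The helper name char_pairs is hypothetical and does not appear in the STEFANN code; it only shows why a word-level translation generally cannot be fed through a character-to-character pipeline.

```python
# Hypothetical helper, not part of the STEFANN repository.
def char_pairs(source, target):
    """Pair each source character with the target character at the same position."""
    if len(source) != len(target):
        # A character-to-character scheme has no way to align unequal lengths.
        raise ValueError(
            f"length mismatch: {len(source)} source vs {len(target)} target characters"
        )
    return list(zip(source, target))

# Digits map one-to-one, so pairing works.
print(char_pairs("123", "一二三"))

# A translated word usually changes length, so pairing fails.
try:
    char_pairs("cat", "猫")
except ValueError as err:
    print("cannot pair:", err)
```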

However, the current code needs some modifications before you can use such a scheme. For example, assume the following mapping of the numerals 0-9 from English to Chinese (note that we are ignoring the fact that Chinese numerals can extend beyond 9):

0 -> 〇
1 -> 一
2 -> 二
3 -> 三
4 -> 四
5 -> 五
6 -> 六
7 -> 七
8 -> 八
9 -> 九
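The mapping above could be written as a Python translation table. This is only a sketch for experimenting with strings; EN_TO_CN_DIGITS and translate_digits are illustrative names and are not used anywhere in fannet.py.

```python
# Illustrative only: these names are assumptions, not part of the STEFANN code.
EN_TO_CN_DIGITS = str.maketrans("0123456789", "〇一二三四五六七八九")

def translate_digits(text):
    """Map each ASCII digit to its Chinese numeral; other characters pass through."""
    return text.translate(EN_TO_CN_DIGITS)

# One character in, one character out, so the length is preserved.
print(translate_digits("2020"))  # 二〇二〇
```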

If we cannot use ASCII values as filenames, then we have to use some kind of indexing. Assume the English numeral images are named en0.jpg, en1.jpg, ... , en9.jpg and the Chinese numeral images are named cn0.jpg, cn1.jpg, ... , cn9.jpg. Also assume the test pairs are named 00_en0_cn0.jpg, 01_en1_cn9.jpg, etc. Now make the following changes in fannet.py:

Lines 48 and 49:

SOURCE_CHARS = [f'en{i}' for i in range(10)]
TARGET_CHARS = [f'cn{i}' for i in range(10)]

Lines 106 and 107:

ch_src = str(perm[0])
ch_dst = str(perm[1])

Line 221:

idx_ch = self._charset.index(dst_ch)  # a list charset has no .find(); .index() raises ValueError if dst_ch is missing

Line 361:

charset=TARGET_CHARS,
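To make the indexing scheme concrete, here is a small standalone sketch of how a test-pair filename could be split into its labels and how a target label could be turned into a one-hot vector. The functions parse_pair_name and one_hot are hypothetical, and whether fannet.py parses names or encodes targets in exactly this way is an assumption that has not been checked against the repository.

```python
import os

# Label lists as suggested in the changes above.
SOURCE_CHARS = [f'en{i}' for i in range(10)]
TARGET_CHARS = [f'cn{i}' for i in range(10)]

def parse_pair_name(path):
    """Split a name like '01_en1_cn9.jpg' into ('en1', 'cn9')."""
    stem = os.path.splitext(os.path.basename(path))[0]
    _, ch_src, ch_dst = stem.split('_')  # leading field is the running pair number
    return ch_src, ch_dst

def one_hot(ch_dst, charset=TARGET_CHARS):
    """Build a one-hot list marking the position of ch_dst in charset."""
    vec = [0] * len(charset)
    # Lists provide .index() rather than .find(); unknown labels raise ValueError.
    vec[charset.index(ch_dst)] = 1
    return vec

ch_src, ch_dst = parse_pair_name('pairs/01_en1_cn9.jpg')
print(ch_src, ch_dst, one_hot(ch_dst))
```

Note that a string charset supports .find(), but a list of labels such as TARGET_CHARS does not, so any lookup on the list needs .index() or an explicit membership check.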

PLEASE NOTE: I haven't checked this personally, so some other minor issues may appear during training. I would like to provide a fully working notebook in the future. Unfortunately, I won't be able to do so for the next couple of months, and I might be slow to respond.

About Colornet:

Colornet doesn't depend on the structure of the characters involved, so you might be able to use the provided pretrained weights without retraining! ;) But if you still want to train with new data, you should prepare your data in a format similar to the given dataset.