This is the official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering".
Hello, thanks for your work in pushing this area forward. However, I cannot reproduce your inference results, for example the one for shower.json in examples/. The JSON I used is attached: shower_ori.json

This is the result I got:

![image](https://github.com/AIGText/Glyph-ByT5/assets/41320257/ebbc0f10-3f11-4d15-b30a-05f2522469de)
I also found that when I make the bbox smaller or larger, the generated text becomes uncontrollable. I'm curious about the reason.

This is the result with a smaller bbox width (JSON: shower_small.json):

![image](https://github.com/AIGText/Glyph-ByT5/assets/41320257/8cb42e6d-202e-4762-8d4d-df4d63d6576d)

This is the result with a larger bbox width (JSON: shower_large.json):

![image](https://github.com/AIGText/Glyph-ByT5/assets/41320257/ffa01d55-7abe-40f0-b163-78c6e423797b)