Inquiry about Correctness of Inference Results using ICDAR13

ZYM-PKU / UDiffText

UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models

MIT License


Inquiry about Correctness of Inference Results using ICDAR13 #12

Open YesianRohn opened 3 months ago

YesianRohn commented 3 months ago

I hope this message finds you well. I've been working with your project and recently tried running inference on the ICDAR13 dataset. I'd like to verify that my setup is correct and that the output is what you would expect. Here is a snapshot of the results I obtained during the validation phase on ICDAR13.

ZYM-PKU commented 3 months ago

The results seem to be incorrect because the text mask does not cover the text region properly. In addition, the model may generate poorer results when dealing with highly distorted text.
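To make the mask issue concrete, here is a minimal sketch of how one might build a binary mask from a text bounding box and check how much of the text region it covers. It is not taken from the UDiffText code base: the helper names `box_mask` and `coverage`, the `(left, top, right, bottom)` box layout, and the `pad` parameter are all assumptions made for illustration.

```python
import numpy as np


def box_mask(height, width, box, pad=0):
    """Return a uint8 mask that is 1 inside the (padded) box and 0 elsewhere.

    `box` is assumed to be (left, top, right, bottom) in pixels,
    with right/bottom exclusive. This layout is an assumption.
    """
    left, top, right, bottom = box
    # Grow the box by `pad` pixels and clip it to the image bounds.
    l = max(left - pad, 0)
    t = max(top - pad, 0)
    r = min(right + pad, width)
    b = min(bottom + pad, height)
    mask = np.zeros((height, width), dtype=np.uint8)
    mask[t:b, l:r] = 1
    return mask


def coverage(mask, box):
    """Fraction of the text box's pixels that the mask marks as 1."""
    left, top, right, bottom = box
    region = mask[top:bottom, left:right]
    return float(region.mean()) if region.size else 0.0


# Hypothetical 100x200 image with one text box.
text_box = (40, 30, 120, 60)
good = box_mask(100, 200, text_box, pad=2)   # mask built from the text box
bad = box_mask(100, 200, (80, 30, 160, 60))  # mask shifted to the right

print(coverage(good, text_box))  # 1.0
print(coverage(bad, text_box))   # 0.5
```

A coverage value well below 1.0 for an annotated text box would be one sign that the mask and the text region are misaligned, for example because box coordinates were not rescaled together with the image.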

YesianRohn commented 3 months ago

If I crop the image to the text region and use the whole cropped image as the mask, can I achieve an effect similar to MOSTEL?
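For clarity, the setup described in this question could be sketched as follows. This only illustrates the input preparation (a crop plus an all-ones mask); it says nothing about whether the output quality would match MOSTEL. The function name `crop_with_full_mask` and the `(left, top, right, bottom)` box layout are assumptions for illustration, and any resizing the model may require is omitted.

```python
import numpy as np


def crop_with_full_mask(image, box):
    """Crop `image` (H x W x C array) to `box` and return the crop
    together with a mask that marks every pixel of the crop.

    `box` is assumed to be (left, top, right, bottom), right/bottom exclusive.
    """
    left, top, right, bottom = box
    crop = image[top:bottom, left:right].copy()
    # The whole crop is treated as the editable region.
    mask = np.ones(crop.shape[:2], dtype=np.uint8)
    return crop, mask


# Hypothetical 100x200 RGB image and text box.
image = np.zeros((100, 200, 3), dtype=np.uint8)
crop, mask = crop_with_full_mask(image, (40, 30, 120, 60))
print(crop.shape, mask.shape)  # (30, 80, 3) (30, 80)
```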