mxin262 / Bridging-Text-Spotting

(CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting.
https://arxiv.org/pdf/2404.04624.pdf

What's the difference between the raw dataset and the rotated dataset? #6

Closed: pd162 closed this issue 5 months ago

pd162 commented 5 months ago

Thanks for your outstanding work! I noticed that previous works use rotated versions of datasets such as Total-Text. What is the difference between the raw and rotated versions, and why do the rotated images perform better than the raw images?

mxin262 commented 5 months ago

The difference is that the rotated datasets are augmented by rotation. The rotated images may improve performance through this data augmentation.
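
For readers unfamiliar with rotation-based augmentation, here is a minimal sketch of the general idea: the image is rotated and the text polygons are transformed with it so the annotations stay aligned. This is only an illustration under simple assumptions (a fixed 90-degree counterclockwise rotation, polygons given as (x, y) pixel coordinates). The function name `rotate90_ccw` is hypothetical, and this is not the preprocessing used by this repository or by DPText-DETR, whose details are not given in this thread.

```python
import numpy as np

def rotate90_ccw(image, polygons):
    """Rotate an image 90 degrees counterclockwise together with its polygons.

    image:    array of shape (H, W) or (H, W, C)
    polygons: list of (N, 2) arrays holding (x, y) pixel coordinates
    (Hypothetical helper for illustration only.)
    """
    width = image.shape[1]
    # np.rot90 rotates counterclockwise in the plane of the first two axes.
    rotated_image = np.rot90(image, k=1)

    rotated_polygons = []
    for poly in polygons:
        poly = np.asarray(poly, dtype=np.float64)
        x, y = poly[:, 0], poly[:, 1]
        # A pixel at (x, y) moves to (y, width - 1 - x) after the rotation.
        rotated_polygons.append(np.stack([y, width - 1 - x], axis=1))
    return rotated_image, rotated_polygons

# Toy example: a 4x6 image with one marked pixel and a one-point "polygon".
image = np.zeros((4, 6), dtype=np.uint8)
image[1, 5] = 1                      # pixel at x=5, y=1
polygons = [np.array([[5, 1]])]

rot_image, rot_polygons = rotate90_ccw(image, polygons)
new_x, new_y = rot_polygons[0][0].astype(int)
```

In the toy example, the transformed point should land on the marked pixel in the rotated image, which is a quick way to check that the image and annotations were rotated consistently.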

pd162 commented 5 months ago

Thanks for your reply. I ran an experiment in which I performed inference on the original Total-Text test images instead of Total-Text (rotated), keeping the same test annotations. Performance dropped by 1.0% on detection-only evaluation and by 0.7% on end-to-end evaluation, so I don't think the difference comes from the training stage.

I also wonder which paper was the first to use the rotated versions of the datasets. Maybe that paper could answer this question.

mxin262 commented 5 months ago

The rotated datasets are from DPText-DETR.

pd162 commented 5 months ago

Thanks!