ku21fan / STR-Fewer-Labels

Scene Text Recognition (STR) methods trained with fewer real labels (CVPR 2021)
MIT License

Unlabeled lmdb dataset #13

AtiBabaie closed this issue 1 year ago

AtiBabaie commented 1 year ago

Hi

I want to create my own dataset in Persian. I used create_lmdb_dataset.py for labeled data and it worked. For unlabeled data, where we have no labels, what should gtfile.txt contain? Or should I use a different script?

Thanks in advance.

ku21fan commented 1 year ago

Hi,

Since we do not use the label part of unlabeled data, its content does not matter, so any dummy label such as [dummy_label] will be fine. When I created the lmdb dataset for unlabeled data, I used the label [Unlabeled_data], as shown below.

2346250-0.jpg   [Unlabeled_data]
2346250-1.jpg   [Unlabeled_data]
2346250-2.jpg   [Unlabeled_data]
...
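
A gt file like the one above could be generated with a short script rather than written by hand. The following is only a minimal sketch: `write_dummy_gt` and `IMAGE_EXTENSIONS` are hypothetical names, not part of this repository, and the tab separator between file name and label is an assumption, so please check how your copy of create_lmdb_dataset.py splits each line and adjust `sep` if needed.

```python
from pathlib import Path

# Hypothetical helper, not part of the repository.
# Extensions treated as images; extend this set for your own data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def write_dummy_gt(image_dir, gt_path, label="[Unlabeled_data]", sep="\t"):
    """Write one '<image name><sep><label>' line per image; return the line count.

    The separator defaults to a tab, which is an assumption about what
    create_lmdb_dataset.py expects.
    """
    # Sort the names so the output is the same on every run.
    names = sorted(
        p.name
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    with open(gt_path, "w", encoding="utf-8") as f:
        for name in names:
            f.write(f"{name}{sep}{label}\n")
    return len(names)
```

For example, `write_dummy_gt("unlabeled_images", "gt_unlabeled.txt")` writes one line per image in a folder named `unlabeled_images` (a placeholder path). The resulting file can then be passed to create_lmdb_dataset.py in the same way as the gt file for labeled data.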

Hope it helps :) Best

AtiBabaie commented 1 year ago

Thank you for responding so quickly! Yes, I tried it this way and it worked, but I was not sure whether it was right. Thanks again :)