open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox
https://mmocr.readthedocs.io/en/dev-1.x/
Apache License 2.0

How to use the SynthText dataset to train a text detection model? When I try to follow the docs, something I cannot understand happens. #1156

Open manjaro-git opened 2 years ago

manjaro-git commented 2 years ago

load index 24016 with error too many values to unpack (expected 2)
prepare index 809612 with error too many values to unpack (expected 2)
load index 409555 with error too many values to unpack (expected 2)
load index 711016 with error too many values to unpack (expected 2)
prepare index 361880 with error too many values to unpack (expected 2)
prepare index 333686 with error too many values to unpack (expected 2)
load index 460071 with error too many values to unpack (expected 2)
load index 7897 with error too many values to unpack (expected 2)
prepare index 665514 with error too many values to unpack (expected 2)
load index 64251 with error too many values to unpack (expected 2)
prepare index 622545 with error too many values to unpack (expected 2)
prepare index 470651 with error too many values to unpack (expected 2)
load index 177569 with error too many values to unpack (expected 2)
prepare index 305736 with error too many values to unpack (expected 2)
prepare index 750848 with error too many values to unpack (expected 2)
prepare index 272870 with error too many values to unpack (expected 2)
load index 484761 with error too many values to unpack (expected 2)
prepare index 409556 with error too many values to unpack (expected 2)
load index 333686 with error too many values to unpack (expected 2)
load index 809612 with error too many values to unpack (expected 2)
prepare index 24017 with error too many values to unpack (expected 2)
prepare index 240571 with error too many values to unpack (expected 2)
load index 665514 with error too many values to unpack (expected 2)
prepare index 711017 with error too many values to unpack (expected 2)
load index 361880 with error too many values to unpack (expected 2)
prepare index 64252 with error too many values to unpack (expected 2)
load index 470651 with error too many values to unpack (expected 2)
prepare index 460072 with error too many values to unpack (expected 2)
load index 272870 with error too many values to unpack (expected 2)
load index 305736 with error too many values to unpack (expected 2)
prepare index 7898 with error too many values to unpack (expected 2)
load index 750848 with error too many values to unpack (expected 2)
prepare index 177570 with error too many values to unpack (expected 2)

manjaro-git commented 2 years ago

I have found that the data.lmdb provided by the docs, or generated by tools/data/textdet/synthtext_converter, is in a format that is not supported by the LmdbAnnFileBackend used in the official configs. According to the docs, MMOCR also provides a new tool called reg2lmbd, but it doesn't support labels stored as a .mat file, which is what SynthText uses. So how can I convert the gt.mat of the SynthText dataset into the right file? This is very confusing!

manjaro-git commented 2 years ago

Line 75 of LmdbAnnFileBackend, filename, text = line.strip('/n').split(' '), is the cause of the error. I think the format of data.lmdb doesn't match what this code expects. I would also like to know what text is here, and what format it should have.
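To illustrate why that unpacking fails, here is a minimal sketch. The two sample annotation lines are hypothetical (the real contents of data.lmdb may look different), and parse_strict and parse_tolerant are names made up for this example, not MMOCR functions. The point is only that splitting on every space and unpacking into two names breaks as soon as the part after the filename itself contains spaces.

```python
# Hypothetical annotation lines; the actual records in data.lmdb may differ.
recog_style_line = "img_0001.jpg hello\n"
det_style_line = 'img_0001.jpg {"bbox": [0, 0, 10, 10], "text": "hello world"}\n'


def parse_strict(line):
    """Mimic the two-name unpacking pattern quoted in the comment above."""
    filename, text = line.strip("\n").split(" ")
    return filename, text


def parse_tolerant(line):
    """Split on the first space only, so the remainder may contain spaces."""
    filename, text = line.strip("\n").split(" ", 1)
    return filename, text


# A line with exactly one space unpacks cleanly.
print(parse_strict(recog_style_line))

# A line whose second field contains spaces yields more than two parts.
try:
    parse_strict(det_style_line)
except ValueError as err:
    print("strict parse failed:", err)

# Limiting the split to one cut keeps the whole remainder as a single field.
print(parse_tolerant(det_style_line))
```

Passing maxsplit=1 is just one way to tolerate spaces in the second field. It is not necessarily how the maintainers resolved the issue.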

gaotongxiao commented 2 years ago

Thanks for reporting the bug. It should have been fixed in #1159 and lmdb can be loaded with the provided config.