Holmeyoung / crnn-pytorch

Pytorch implementation of CRNN (CNN + RNN + CTCLoss) for all language OCR.
MIT License

lmdb.Error when create dataset #27

Closed: deep-practice closed 5 years ago

deep-practice commented 5 years ago

Traceback (most recent call last):
  File "~/crnn-pytorch-master/tool/create_dataset.py", line 135, in <module>
    createDataset(args.out, image_path_list, label_list)
  File "~/crnn-pytorch-master/tool/create_dataset.py", line 55, in createDataset
    env = lmdb.open(outputPath, map_size=1099511627776)
lmdb.Error: ~/fake/lmdb: \ufffd\ufffd\ufffd\u033f\u057c\u4cbb\ufffd\u3863
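The message after `~/fake/lmdb:` looks like mis-decoded text rather than random bytes. A minimal sketch for checking one reading of it: the assumption (not stated anywhere in the thread) is that the operating system produced a GBK-encoded Chinese message that was then decoded as UTF-8 with replacement characters. The candidate string "磁盘空间不足。" ("There is not enough space on the disk.") is a guess chosen for illustration; the code only tests whether that guess reproduces the escaped text from the traceback.

```python
# The escaped text from the lmdb.Error line, copied verbatim from the traceback.
garbled = "\ufffd\ufffd\ufffd\u033f\u057c\u4cbb\ufffd\u3863"

# Assumption: the original OS message was this Chinese text, encoded as GBK.
# English meaning: "There is not enough space on the disk."
candidate = "磁盘空间不足。"

# Reproduce the suspected mistake: take GBK bytes and decode them as UTF-8,
# substituting U+FFFD for every byte sequence that is not valid UTF-8.
misdecoded = candidate.encode("gbk").decode("utf-8", errors="replace")

# True means the candidate explains the garbled message byte for byte.
matches = misdecoded == garbled
print(matches)
```

If the comparison holds, the error is a disk-space failure, which fits the later observation that a smaller `map_size` makes it go away.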

Holmeyoung commented 5 years ago

Hi, which mode did you use to create the lmdb, file or folder? Did you follow the README?

deep-practice commented 5 years ago

I use folder mode, and the dataset is loaded correctly. The code fails when it reaches the line env = lmdb.open(outputPath, map_size=1099511627776). After that, I checked the lmdb output folder; there are 2 files, lock.mdb and data.mdb, with length 8 KB.

Holmeyoung commented 5 years ago

Hi, did you use Python 3 to create it? Please try updating lmdb and setting map_size = 9000000000 to see if that helps.

deep-practice commented 5 years ago

Changing map_size to 9000000000 solves the problem, but why?

Holmeyoung commented 5 years ago

It defines the maximum storage space for the database. The original value, 1099511627776 bytes, is 1 TiB, so maybe your disk didn't have 1 TB of free space.
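Following that explanation, one way to avoid the failure is to cap the requested map size by the free space on the target disk before opening the environment. This is a rough sketch, not the repository's code: the helper names `capped_map_size` and `map_size_for` and the 10% reserve are hypothetical choices made for illustration, and whether LMDB needs the full map size up front can depend on the platform.

```python
import shutil

ONE_TIB = 1 << 40  # 1099511627776 bytes, the map_size from the traceback


def capped_map_size(requested, free_bytes, reserve_percent=10):
    """Return the requested size, limited to the free space minus a reserve.

    Hypothetical helper: the reserve keeps some room for other files.
    """
    usable = free_bytes * (100 - reserve_percent) // 100
    return max(1, min(requested, usable))


def map_size_for(path, requested=ONE_TIB):
    """Look up the free space on the disk holding `path` and cap the size."""
    free = shutil.disk_usage(path).free
    return capped_map_size(requested, free)


# Example: size for a database created under the current directory.
size = map_size_for(".")
print(size)
```

The result could then be passed in place of the hard-coded value, as in `lmdb.open(outputPath, map_size=size)`. The map size still has to be large enough for the dataset itself, so a very full disk will fail later during writes instead of at open time.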

deep-practice commented 5 years ago

Thanks so much, I can run the training process now. Another question: if the samples in my dataset have variable spacing between characters, for example "Never too old too learn ", can I still train a good model?

Holmeyoung commented 5 years ago

Yeah, it's OK.

deep-practice commented 5 years ago

Thanks so much.