Closed Aldemaro14 closed 4 years ago
Also, I followed this advice and got another issue:
(venv) C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark>python create_lmdb_dataset.py --inputPath ../data/training_data --gtFile ../data/training_data/gt.txt --outputPath result/
Traceback (most recent call last):
  File "create_lmdb_dataset.py", line 89, in <module>
    fire.Fire(createDataset)
  File "C:\Users\itres\Desktop\OCR\craft_crnn\venv\lib\site-packages\fire\core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\itres\Desktop\OCR\craft_crnn\venv\lib\site-packages\fire\core.py", line 463, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "C:\Users\itres\Desktop\OCR\craft_crnn\venv\lib\site-packages\fire\core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "create_lmdb_dataset.py", line 49, in createDataset
    imagePath, label = datalist[i].strip('\n').split('\t', 1)
ValueError: not enough values to unpack (expected 2, got 1)
It worked properly with a space as the separator, just by changing this line:
`imagePath, label = datalist[i].strip('\n').split('\t')`
to this:
`imagePath, label = datalist[i].strip('\n').split(' ', 1)`
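For anyone hitting the same `ValueError`, here is a minimal sketch of a more forgiving way to split a ground-truth line and to find the lines that cannot be split at all. The helper names `parse_gt_line` and `find_bad_lines` are hypothetical and are not part of `create_lmdb_dataset.py`. It assumes each line is an image path, then a tab or a space, then the label.

```python
def parse_gt_line(line, sep=None):
    """Split one ground-truth line into (image_path, label).

    With sep=None, a tab is used if the line contains one;
    otherwise the line is split on its first space.
    """
    line = line.rstrip("\r\n")
    if sep is None:
        sep = "\t" if "\t" in line else " "
    # maxsplit=1 keeps any later separators inside the label
    parts = line.split(sep, 1)
    if len(parts) != 2 or not parts[0] or not parts[1]:
        raise ValueError(f"cannot split into path and label: {line!r}")
    return parts[0], parts[1]


def find_bad_lines(lines):
    """Return (line_number, text) for every line that fails to parse."""
    bad = []
    for number, line in enumerate(lines, start=1):
        try:
            parse_gt_line(line)
        except ValueError:
            bad.append((number, line.rstrip("\r\n")))
    return bad
```

Running `find_bad_lines(open("gt.txt", encoding="utf-8"))` before building the dataset should point at the exact line that has no separator. Note that splitting on the first space breaks if the image path itself contains a space, so a tab is the safer separator in that case.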
Fixed.
1st- you can train the model using PNG+space instead of PNG+tab; just use the code above.
2nd- the issue was that I was pointing to the wrong folder.
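A wrong folder shows up in the training traceback as `txn.get('num-samples'.encode())` returning `None`. Below is a rough sketch of a check you could run on a folder before passing it to `--train_data`. The function names are hypothetical, and it assumes that `create_lmdb_dataset.py` stores a `num-samples` key when it finishes successfully (which is what `dataset.py` tries to read). It needs the third-party `lmdb` package that the repo already uses.

```python
def parse_num_samples(raw, path):
    """Turn the raw 'num-samples' value into an int, or explain why it is missing."""
    if raw is None:
        raise ValueError(
            f"{path!r} has no 'num-samples' key; it is probably not the folder "
            "that create_lmdb_dataset.py finished writing"
        )
    # int() accepts the bytes value stored in the database, e.g. b"120"
    return int(raw)


def count_samples(path):
    """Open the LMDB at `path` read-only and return its sample count."""
    import lmdb  # third-party; imported here so parse_num_samples works without it

    env = lmdb.open(path, readonly=True, lock=False)
    try:
        with env.begin(write=False) as txn:
            return parse_num_samples(txn.get("num-samples".encode()), path)
    finally:
        env.close()
```

Calling `count_samples("result")` on the output folder should return the number of images written, while a folder left behind by a failed run should raise the `ValueError` with a readable message instead of the `TypeError` from `int(None)`.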
Hello good people, I'm trying to train this model with my own data. After some issues, I now get the following error when running this command:
`(venv) C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark>python train.py --train_data ../result --valid_data ../result_val --Transformation None --FeatureExtraction VGG --SequenceModeling BiLSTM --Prediction CTC
Filtering the images containing characters which are not in opt.character
Filtering the images whose label is longer than opt.batch_max_length
dataset_root: ../result opt.select_data: ['/'] opt.batch_ratio: ['1']
dataset_root: ../result dataset: / None
Traceback (most recent call last):
  File "train.py", line 304, in <module>
    train(opt)
  File "train.py", line 31, in train
    train_dataset = Batch_Balanced_Dataset(opt)
  File "C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark\dataset.py", line 42, in __init__
    _dataset, _dataset_log = hierarchical_dataset(root=opt.train_data, opt=opt, select_data=[selected_d])
  File "C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark\dataset.py", line 118, in hierarchical_dataset
    dataset = LmdbDataset(dirpath, opt)
  File "C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark\dataset.py", line 143, in __init__
    nSamples = int(txn.get('num-samples'.encode()))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'`
Also, I was able to generate the LMDB dataset.