Hi 👋
I am working on a project to train a handwriting recognition model.
My data is a mix of IAM and a custom (in-house) dataset. It contains both single words and full sentences, which I suspect is the cause of the problem.
I have tried parseq and crnn_vgg16_bn, and each fails with a different error.
I edited vocabs.py to add a space character to the vocab string, but I doubt that is the correct way to do it.
I am interested in trying master, parseq, and vitstr_base.
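To see whether adding a space is enough, it may help to list every character that appears in the labels but not in the vocab string. The snippet below is only a sketch: the `vocab` string and the sample `labels` are made up for illustration, and `find_out_of_vocab` is a hypothetical helper, not part of docTR.

```python
from collections import Counter


def find_out_of_vocab(labels, vocab):
    """Count the characters used in the labels that are missing from the vocab string."""
    allowed = set(vocab)
    missing = Counter()
    for text in labels:
        missing.update(ch for ch in text if ch not in allowed)
    return missing


# Hypothetical vocab and labels, for illustration only
vocab = "abcdefghijklmnopqrstuvwxyz0123456789"
labels = ["of", "grand junction", "12-17-18"]

missing = find_out_of_vocab(labels, vocab)
print(dict(missing))
```

Running the same check over the real label set with the vocab actually passed to training would show whether the space is the only missing character, or whether punctuation and capital letters are affected too.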
For parseq
Train set loaded in 20.9s (125801 samples in 1965 batches)
0%| | 0/1965 [00:06<?, ?it/s]
Traceback (most recent call last):
  File "references/recognition/train_tensorflow.py", line 448, in <module>
    main(args)
  File "references/recognition/train_tensorflow.py", line 346, in main
    fit_one_epoch(model, train_loader, batch_transforms, optimizer, args.amp)
  File "references/recognition/train_tensorflow.py", line 95, in fit_one_epoch
    train_loss = model(images, targets, training=True)["loss"]
  File "C:\Users\Arsalan\anaconda3\envs\doctr_recognition_training\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "F:\GitHub\doctr\doctr\models\recognition\parseq\tensorflow.py", line 361, in call
    mask = tf.logical_and(padding_mask, tf.expand_dims(tf.expand_dims(target_mask, axis=0), axis=0))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Exception encountered when calling layer "par_seq" (type PARSeq).
required broadcastable shapes [Op:LogicalAnd]
Call arguments received:
• x=tf.Tensor(shape=(64, 32, 128, 3), dtype=float32)
• target=["'cent.'", "'of'", "'6806065232088'", "'It'", "'Riverview Abbey F.H.'", "'frame'", "'751-East 75th St.'", "'you'", "'the'", "'12-17-18'", "'about'", "'3'", "'Sore her ankle, Monday the 10th tried to work and over extended - over used'", "'03/17/2005'", "'lifting weights in school class and injured back'", "'J'", "'Grand Junction'", "'2'", "'Wiscasset'", "'MD'", "'Pension'", "'130.00'", "'could'", '\'Laceration >2" Rt Hand with utility knife\'', "'294'", "'down'", "'Parties'", "'8'", "'by'", "'protest'", "'03-11-1983'", "'2778 Country Club Dr.'", "'2.2,5'", "'author'", "','", "'Jeffrey Datterer'", "'(303) 771-6858'", "'78'", "'OME FEE'", "'the'", "'JoAnn AniEmEKa'", "'Family Practice/Emergency Medicine'", "'Sconrad@eckman'", "'but'", "'17/08/19'", "'NE'", "'5,879'", "'of'", "'of R lumbar'", "'NC'", "'eerie'", "'and'", "'9340 59 st'", "'said'", "'I'", "'250,000'", "','", "'spencer GAllAGHer'", "'654-3442'", "'the'", "'PPS Enhanced Yield'", "'08-16-86'", "'12-20-18'", "'Sleep'"]
• return_model_output=False
• return_preds=False
• kwargs={'training': 'True'}
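Since the batch above mixes short words with full sentences, one thing worth ruling out is that some targets are longer than the sequence length the model is built for. The sketch below flags such labels. The limit of 32 characters is an assumption made for illustration (the real value depends on how the model is configured), and `find_long_labels` is a hypothetical helper, not a docTR function.

```python
def find_long_labels(labels, max_length):
    """Return (index, length) for every label longer than max_length characters."""
    return [(i, len(text)) for i, text in enumerate(labels) if len(text) > max_length]


# A few targets copied from the failing batch above
labels = [
    "of",
    "Riverview Abbey F.H.",
    "lifting weights in school class and injured back",
]

MAX_LENGTH = 32  # assumed limit, check the model configuration for the real one

too_long = find_long_labels(labels, MAX_LENGTH)
print(too_long)
```

If the sentence-level samples turn out to be the only ones flagged, training on the word-level subset alone would be a quick way to confirm whether label length is what triggers the shape mismatch.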
For CRNN
Train set loaded in 18.35s (125801 samples in 1965 batches)
0%| | 0/1965 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "references/recognition/train_tensorflow.py", line 448, in <module>
    main(args)
  File "references/recognition/train_tensorflow.py", line 346, in main
    fit_one_epoch(model, train_loader, batch_transforms, optimizer, args.amp)
  File "references/recognition/train_tensorflow.py", line 91, in fit_one_epoch
    for images, targets in pbar:
  File "C:\Users\Arsalan\anaconda3\envs\doctr_recognition_training\lib\site-packages\tqdm\std.py", line 1182, in __iter__
    for obj in iterable:
  File "F:\GitHub\doctr\doctr\datasets\loader.py", line 95, in __next__
    samples = list(multithread_exec(self.dataset.__getitem__, indices, threads=self.num_workers))
  File "F:\GitHub\doctr\doctr\utils\multithreading.py", line 49, in multithread_exec
    results = map(lambda x: x, tp.map(func, seq)) # noqa: C417
  File "C:\Users\Arsalan\anaconda3\envs\doctr_recognition_training\lib\multiprocessing\pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\Arsalan\anaconda3\envs\doctr_recognition_training\lib\multiprocessing\pool.py", line 771, in get
    raise self._value
  File "C:\Users\Arsalan\anaconda3\envs\doctr_recognition_training\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\Arsalan\anaconda3\envs\doctr_recognition_training\lib\multiprocessing\pool.py", line 48, in mapstar
    return list(map(*args))
  File "F:\GitHub\doctr\doctr\datasets\datasets\base.py", line 49, in __getitem__
    img, target = self._read_sample(index)
  File "F:\GitHub\doctr\doctr\datasets\datasets\tensorflow.py", line 37, in _read_sample
    assert isinstance(target, str) or isinstance(
AssertionError: Target should be a string or a numpy array
WARNING:tensorflow:Detecting that an object or model or tf.train.Checkpoint is being deleted with unrestored values. See the following logs for the specific values in question. To silence these warnings, use `status.expect_partial()`. See https://www.tensorflow.org/api_docs/python/tf/train/Checkpoint#restorefor details about the status object returned by the restore function.
WARNING:tensorflow:Detecting that an object or model or tf.train.Checkpoint is being deleted with unrestored values. See the following logs for the specific values in question. To silence these warnings, use `status.expect_partial()`. See https://www.tensorflow.org/api_docs/python/tf/train/Checkpoint#restorefor details about the status object returned by the restore function.
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).layer_with_weights-26.kernel
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).layer_with_weights-26.kernel
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).layer_with_weights-26.bias
WARNING:tensorflow:Value in checkpoint could not be found in the restored object: (root).layer_with_weights-26.bias
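The AssertionError in the CRNN run says a target was neither a string nor a numpy array. One possible cause, assuming the labels are stored in a JSON file that maps image file names to text, is that purely numeric labels (for example 294 or 130.00) were written without quotes and therefore load as numbers. The sketch below looks for such entries; the file names and the inline JSON are invented for illustration, and `find_non_string_labels` is a hypothetical helper.

```python
import json


def find_non_string_labels(labels):
    """Map each image name to the type name of its label when that label is not a str."""
    return {
        name: type(value).__name__
        for name, value in labels.items()
        if not isinstance(value, str)
    }


# Hypothetical labels file content, for illustration only
raw = '{"img_1.jpg": "Wiscasset", "img_2.jpg": 294, "img_3.jpg": null, "img_4.jpg": "130.00"}'
labels = json.loads(raw)

bad = find_non_string_labels(labels)
print(bad)
```

Any entries reported this way could be converted with `str(value)` when the labels file is generated, or dropped if the label is missing altogether.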
Environment
Conda env
Deep Learning backend
is_tf_available: True
is_torch_available: False