Closed iGV-dev closed 4 years ago
Hi there!
I would recommend using the out-of-the-box pipeline for this text. You might want to train a custom recognizer to include some of the symbols that appear to be common here (e.g., "~").
For the logos, I don't think an OCR pipeline will be a great fit because of how the characters are offset from each other. But I could be wrong! Instead, I might recommend building an object detector. I wrote another package called mira that provides some abstractions for that, if you're interested. But there are lots of other options out there.
Thanks a lot for the reply and suggestions: much appreciated!
Since I'm new to DNN and CV, I've started to study and teach myself the subject. To get my feet wet, I've also started some hands-on tests based on the verbatim code from the keras_ocr docs: https://keras-ocr.readthedocs.io/en/latest/examples/using_pretrained_models.html
The results are very promising: detection and recognition performance are superior to tools like tesseract or opencv-based approaches.
I'll work on a custom recognizer using specific training images, following https://keras-ocr.readthedocs.io/en/latest/examples/end_to_end_training.html#train-the-recognizer (it's still advanced material for me at the moment, so I guess it'll take a while).
I've also looked at mira for the object detection ("logos/marks"), and it's indeed another great tool I'll be trying. If you have any advice that could help in the learning process, please feel free to let me know.
All in all, I want to thank you for the great work and support.
Ciao
Giovanni
Hi Fausto, as per your suggestion I'm trying to train a custom recognizer for this alphabet:
alphabet = string.digits + string.ascii_letters + '!?.^&*()[]{}|;:/\<>+-=~"\' '
I'm following along with "Complete end-to-end training" (https://keras-ocr.readthedocs.io/en/latest/examples/end_to_end_training.html). The first thing I did was to generate synthetic data using scripts/create_fonts_and_backgrounds.py, which generated the background.zip and fonts.zip archives. I then ran the sample code from the end-to-end training example (I've attached the actual code I'm running as a txt file, as py extensions are not supported here). When I run the 'recognizer_00' file I get this error [related to the detector csv file called by keras\callbacks.py", line 2323, in on_train_begin **self._open_args]:
Looking for C:\Users\absol\.keras-ocr\craft_mlt_25k.h5
WARNING:tensorflow:From C:\Python\lib\site-packages\tensorflow\python\keras\backend.py:5871: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
Provided alphabet does not match pretrained alphabet. Using backbone weights only.
Looking for C:\Users\absol\.keras-ocr\crnn_kurapan_notop.h5
WARNING:tensorflow:From G:\Coding_WS\OCR_ML_DL_CV\Tests\Keras_OCR\CustomRecogn\recognizer_00.py:101: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.
C:\Python\lib\site-packages\keras_ocr\tools.py:505: RuntimeWarning: invalid value encountered in float_scalars
  rotation = np.arctan((tl[0] - bl[0]) / (tl[1] - bl[1]))
Traceback (most recent call last):
  File "G:\Coding_WS\OCR_ML_DL_CV\Tests\Keras_OCR\CustomRecogn\recognizer_00.py", line 101, in <module>
    validation_steps=math.ceil(len(background_splits[1]) / detector_batch_size)
  File "C:\Python\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "C:\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1479, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 66, in _method_wrapper
    return method(self, *args, **kwargs)
  File "C:\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 830, in fit
    callbacks.on_train_begin()
  File "C:\Python\lib\site-packages\tensorflow\python\keras\callbacks.py", line 447, in on_train_begin
    callback.on_train_begin(logs)
  File "C:\Python\lib\site-packages\tensorflow\python\keras\callbacks.py", line 2323, in on_train_begin
    **self._open_args)
OSError: [Errno 22] Invalid argument: '.\\detector_2020-06-02T12:15:47.706838.csv'
Do you know how this can be fixed? Thanks Regards Giovanni
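A side note on the alphabet quoted above: the standard-library constant is spelled string.ascii_letters. Below is a minimal sketch of building that alphabet and checking whether a label uses only characters from it. The names SYMBOLS and unsupported_chars are hypothetical helpers for illustration and are not part of keras_ocr.

```python
import string

# Symbols from the post above. The backslash and the single quote are
# escaped explicitly so the literal is unambiguous to Python.
SYMBOLS = '!?.^&*()[]{}|;:/\\<>+-=~"\' '
alphabet = string.digits + string.ascii_letters + SYMBOLS


def unsupported_chars(label, alphabet=alphabet):
    """Return the sorted characters of `label` that are missing from `alphabet`."""
    return sorted(set(label) - set(alphabet))
```

Running a check like this over the training labels before training may help spot characters that a recognizer with this alphabet could never output.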
Try using just one image to detect and recognize.
This is a problem with the path you supplied to the callbacks. From your traceback, the actual error is Invalid argument: '.\\detector_2020-06-02T12:15:47.706838.csv'.
It seems that this line does not yield a reasonable filepath on your platform:
recognizer_basepath = os.path.join(data_dir, f'recognizer_{datetime.datetime.now().isoformat()}')
I think if you change the data_dir to something that doesn't offend the OS, you should be okay. Maybe change data_dir = '.' to data_dir = 'my_data' (though you'll have to make sure it exists using something like os.makedirs()).
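One more thing that may be worth checking: the filename in the traceback contains colons, which come from the time part of isoformat(), and Windows does not allow colons in filenames. The sketch below shows one way to build the basepath without them while also making sure data_dir exists. The helper name make_run_basepath and its now parameter are hypothetical and not part of keras_ocr or its example script.

```python
import datetime
import os


def make_run_basepath(data_dir, prefix, now=None):
    """Return <data_dir>/<prefix>_<timestamp>, using a timestamp without colons."""
    now = now or datetime.datetime.now()
    # isoformat() would give e.g. 2020-06-02T12:15:47.706838; the colons are
    # invalid in Windows filenames, so the time is formatted with dashes.
    stamp = now.strftime('%Y-%m-%dT%H-%M-%S')
    # Create the target directory if it is missing (no error if it exists).
    os.makedirs(data_dir, exist_ok=True)
    return os.path.join(data_dir, f'{prefix}_{stamp}')
```

With this, something like detector_basepath = make_run_basepath('my_data', 'detector') could stand in for the os.path.join(...) line, assuming the rest of the script only appends extensions such as .csv to the basepath.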
Closing due to inactivity.
Hi, this is not an issue but a question about custom pipelines. I apologize if this is not the place for such a matter; if there's a specific place to post, please let me know. I'm trying to set up a keras_ocr workflow to recognize and extract text from images of electrical product labels (see the attachment to get an idea). I'm building the training dataset with several hundred images, which I will tag/label using VoTT, and I will then follow the example in the keras_ocr documentation. The goal is to recognize and extract both the text and the safety approval marks ("logos"). Since I'm fairly new to this, I'd like to ask a few questions and get suggestions: