ghost opened this issue 1 year ago
Hi,
one approach to detecting individual letters in an image would be to train an object detection model on a custom dataset of labeled images, where each letter is treated as a separate class. Once the model is trained, it can be used to detect the presence and location of each letter in an input image. The output of this step would be a sequence of detected letters. To match this sequence of letters to individual words, a dictionary lookup could be performed to identify possible word matches.
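The dictionary-lookup step above can be sketched roughly like this. The `detections` format (letter, left-edge x-coordinate) and the function names are assumptions for illustration, not output from any specific detector:

```python
# Hypothetical sketch: map per-letter detections to a dictionary word.
# `detections` is assumed detector output: (letter, x_min) pairs, unordered.
def letters_to_word(detections, dictionary):
    # Sort left-to-right so the letters appear in reading order
    ordered = sorted(detections, key=lambda d: d[1])
    candidate = "".join(letter for letter, _ in ordered)
    # Exact lookup; fuzzy matching could be added to tolerate OCR noise
    return candidate if candidate in dictionary else None

detections = [("A", 34), ("C", 5), ("T", 61)]  # detector output, unordered
print(letters_to_word(detections, {"CAT", "DOG"}))  # -> "CAT"
```

A real pipeline would likely need fuzzy matching (e.g. edit distance) since the per-letter classifier will occasionally misread characters.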
Hope it helps.
Text from where? Everything, or car number plates? If it's something like plates, you can train a model to detect plates; then, for each detection, take a screenshot using the bounding-box location on screen as a reference, so you only capture the plate rather than the whole image. After that, run whatever OCR algorithm you use on that image or bitmap to get your text.
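The crop-then-OCR step described above can be sketched as follows. This is a minimal illustration assuming the detector returns pixel-coordinate boxes; the `crop_detection` name and the zero-filled stand-in frame are made up for the example:

```python
import numpy as np

# Hypothetical sketch: crop a detection out of the full frame using its
# bounding box, so only the plate region is handed to the OCR step.
def crop_detection(frame, box):
    # box = (x1, y1, x2, y2) in pixel coordinates, as a detector might return
    x1, y1, x2, y2 = map(int, box)
    return frame[y1:y2, x1:x2]

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a camera frame
plate = crop_detection(frame, (100, 200, 300, 260))
print(plate.shape)  # (60, 200, 3) -- only this crop goes to the OCR engine
```

In a live pipeline the crop would then be passed straight to the OCR call (e.g. `pytesseract.image_to_string(plate)`), avoiding any screenshot step.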
@aezakmi99 you can use YOLOv7 for localizing objects, so yes, if you train it on a digits or alphabet dataset it will detect them. You can then crop those digits for further processing.
@pauliustumas @faizan1234567 @StefanCiobanu1989 I want to detect text from car number plates. I built a YOLOv7 model with high precision, and it works fine. Now I'm wondering how to combine OCR with the plate detection, so that when I run the detection command it also shows the text from the plates. I want it all to be one process, not taking screenshots or anything first. I just don't know how to do it.
Here is a pipeline you can try: digits detector -> digits classifier -> label text from predictions.
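The three-stage pipeline above could be wired together roughly like this. The stages here are stand-ins (a real version would wrap a trained detector and classifier); every function name and the toy "image" format are assumptions for illustration:

```python
# Hypothetical sketch of: digits detector -> digits classifier -> label text.
def detect_digits(image):
    # stand-in detector: returns (crop, x_position) pairs
    return [(crop, x) for x, crop in enumerate(image)]

def classify_digit(crop):
    # stand-in classifier: maps a crop to a character
    return str(crop)

def plate_text(image):
    # Sort detections left-to-right, classify each crop, join into the label
    boxes = sorted(detect_digits(image), key=lambda b: b[1])
    return "".join(classify_digit(crop) for crop, _ in boxes)

print(plate_text([4, 2, 7]))  # -> "427"
```

The left-to-right sort is the glue step: detectors return boxes in arbitrary order, so the text has to be assembled from box positions.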
Maybe you can train a detector on a large digits dataset to detect and label digits. As for YOLOv7, you can just use it to detect an object, for instance a number plate. You'll have to read the literature to form your solution. I hope it helps.
Uhm, it can all be done in the same process by passing along the inferred image, frame, or image section if you're doing it in real time. Doing it with CV alone might get tricky, because the detections won't happen in the order the characters are written on the plate. For instance, with a plate like XSD234, the detector will tell you it found all of the letters and numbers, but not the order in which to put them, so you'd have to find a way around that (though I might be wrong). However, if you use an OCR library and pass it the image/frame so it runs its own image-to-text conversion, it would be much easier. Give something like https://pypi.org/project/pytesseract/ a look.
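The ordering workaround mentioned above (recovering "XSD234" from unordered per-character detections) can be sketched by sorting boxes on their left edge. The detection tuple layout here is an assumption, not any particular model's output format:

```python
# Hypothetical sketch: recover the plate string from unordered
# per-character detections by sorting on each box's left edge (x1).
def order_plate_chars(detections):
    # detections: list of (char, x1, y1, x2, y2) from a character detector
    return "".join(c for c, *_ in sorted(detections, key=lambda d: d[1]))

dets = [("D", 120, 5, 160, 40), ("X", 0, 5, 40, 40),
        ("S", 80, 5, 120, 40), ("2", 160, 5, 200, 40)]
print(order_plate_chars(dets))  # -> "XSD2"
```

This works for single-line plates; two-line plates would additionally need grouping by the boxes' vertical positions before sorting each row.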
I made a YOLOv7 model that detects one class only. I also added OCR, so when it detects the object, it crops the detection from the bounding box and sends the crop to OCR for text recognition. But when plotting the results, the detected text is too long and runs beyond the image. I tried editing the `plot_one_box` function, but with no success.
Here is how my function looks now:
```python
def plot_one_box(x, img, color=None, label=None, text=None, line_thickness=3):
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255],
                    thickness=tf, lineType=cv2.LINE_AA)
    if text:
        t_size = cv2.getTextSize(text, 0, fontScale=tl / 3, thickness=1)[0]
        c3 = c1[0], c1[1] - t_size[1] - 3
        cv2.rectangle(img, c1, c3, color, -1, cv2.LINE_AA)  # filled
        cv2.putText(img, text, (c1[0], c1[1] + t_size[1] + 2), 0, tl / 3,
                    [255, 255, 255], thickness=1, lineType=cv2.LINE_AA)
```
Anyway, I want to print the detected label above the bounding box as it does now, followed by the detected text; but once the text reaches the width of the bounding box, it should wrap onto a new line with a new filled rectangle. Can somebody help me?
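One way to approach the wrapping asked about above is to split the OCR string into chunks that fit the box width and draw each chunk on its own line. This is a sketch of just the text-splitting part; `char_px` is an assumed average character width (with OpenCV you could measure it exactly via `cv2.getTextSize` instead), and the function name is made up:

```python
import textwrap

# Hypothetical sketch: wrap OCR text so each drawn line fits the box width.
# char_px approximates one character's pixel width at the chosen font scale.
def wrap_text_to_box(text, box_width_px, char_px=12):
    max_chars = max(1, box_width_px // char_px)
    return textwrap.wrap(text, width=max_chars)

lines = wrap_text_to_box("ZG1234AB DETECTED ON LANE 2", box_width_px=120)
for i, line in enumerate(lines):
    print(i, line)  # each line would get its own filled rect + cv2.putText
```

Inside `plot_one_box`, each returned line would then be drawn at a y-offset of roughly `c1[1] + (i + 1) * (t_size[1] + 3)`, with its own filled background rectangle.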
Hi, everyone.
Is there a way to combine a trained YOLOv7 model with OCR so that it recognizes text automatically, in one flow? And how?