Closed: anindya7 closed this issue 4 years ago.
Your code runs fine for me. The output of the SegLink model should be of shape (1, 5461, 31). Did you change the model? Which TF version are you using?
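For reference, a quick sanity check of the detector output shape could look like the sketch below; the 512×512 input resolution and the `det_model` name are assumptions here, adjust them to whatever the notebook actually uses.

```python
import numpy as np

# Feed a dummy image through the detector and inspect the output shape.
# The 512x512 input resolution is an assumption; use the notebook's image_size.
dummy = np.zeros((1, 512, 512, 3), dtype=np.float32)
print(det_model.predict(dummy).shape)  # expected: (1, 5461, 31)
```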
The model is untouched. Could you please post your Python and TF versions? I shall replicate your environment.
Thank you for answering my question with another question... 3.7.5, 2.3.0 ;)
With Python 3 the model output shape is indeed (1, 5461, 31). I am also using TensorFlow 2.3.0. All the confidences are lower than 0.1, which is unusual.
Can you provide the example image? It is quite usual that most of the image is non-text.
I tested it: it works for certain images but not for others. Of the attached images, it works for penny_drop.png but not for PANcardmasked.png.
Is it an option to fine-tune the detector with annotated real world data?
Unfortunately annotated data is not available. It seems that the difference in the text between the two images is that the latter does not have a strict edge. The colour of the text 'bleeds' or 'diffuses' into the neighbouring pixels. I shall try preprocessing: normalizing and/or thresholding. That should accentuate the characters.
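A minimal preprocessing sketch, assuming OpenCV and a 3-channel BGR input image; the normalization range and thresholding parameters are guesses that would need tuning:

```python
import cv2

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Stretch the contrast to the full 0-255 range.
norm = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)
# Adaptive thresholding to sharpen the diffuse character edges
# (block size and offset are guesses).
binary = cv2.adaptiveThreshold(norm, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 31, 10)
# Back to 3 channels so the image can be fed to the detector as before.
img_prep = cv2.cvtColor(binary, cv2.COLOR_GRAY2BGR)
```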
I noticed a similar issue with motion blur on webcam images. Adding Gaussian blur to the data augmentation should fix it, but it requires retraining the models.
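A minimal sketch of such an augmentation step, assuming an OpenCV-based input pipeline; the probability and kernel sizes are illustrative, not the repo's actual settings:

```python
import random
import cv2

def random_gaussian_blur(img, p=0.5, max_kernel=7):
    """Randomly blur a training image to simulate motion/defocus blur."""
    if random.random() < p:
        k = random.choice(range(3, max_kernel + 1, 2))  # odd kernel sizes only
        img = cv2.GaussianBlur(img, (k, k), 0)
    return img
```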
You could also try to pad the images. Most of the text instances in the SynthText dataset are smaller.
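A padding sketch, assuming OpenCV; the border size and fill colour are guesses to tune so that the relative text size better matches SynthText:

```python
import cv2

pad = 200  # pixels of border on each side
padded = cv2.copyMakeBorder(img, pad, pad, pad, pad,
                            cv2.BORDER_CONSTANT, value=(255, 255, 255))
```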
Hi, I have ported the SL_end2end_predict.ipynb to a .py file that loads only user images and gets predictions from them.
I am getting an output tensor shape of (1, 5461, 18) for a single image. These are the summarized values:
```
[[ 0.9778447   0.02215527  0.05647869 ...  0.39081636  0.2786811   0.7213189 ]
 [ 0.9876583   0.01234164  0.17384888 ...  0.3960406   0.4638915   0.53610843]
 [ 0.9857997   0.01420039  0.1859522  ...  0.35515028  0.5405665   0.4594335 ]
 ...
 [ 0.99148715  0.00851282  0.8232149  ...  0.02318294  0.9777579   0.02224212]
 [ 0.99127495  0.00872501 -0.4267429  ...  0.0168435   0.98193043  0.01806954]
 [ 0.98890346  0.01109659 -0.55322266 ...  0.02168869  0.97793293  0.02206702]]

[[0.9778447  0.02215527]
 [0.9876583  0.01234164]
 [0.9857997  0.01420039]
 ...
 [0.99148715 0.00851282]
 [0.99127495 0.00872501]
 [0.98890346 0.01109659]]

[[ 0.05647869  0.25386062 -0.7133617   0.8498816   0.20088424]
 [ 0.17384888  0.65766203 -0.7351028   0.9332367   0.14610018]
 [ 0.1859522   1.1206706  -0.6635226   0.9163737   0.17239925]
 ...
 [ 0.8232149  -1.1342822  -3.839357   -1.5365617   0.0083948 ]
 [-0.4267429  -0.5418013  -3.4509156  -1.672849   -0.05473372]
 [-0.55322266 -0.27144217 -0.29643777 -0.03384027  0.6844612 ]]

[[0.9789397  0.02106032 0.9673921  ... 0.03583498 0.9666423  0.03335765]
 [0.98277885 0.01722118 0.9819172  ... 0.02573826 0.97589934 0.02410063]
 [0.97954214 0.0204578  0.97924906 ... 0.02678417 0.975162   0.02483795]
 ...
 [0.97945476 0.02054522 0.9785146  ... 0.01981203 0.97710925 0.02289074]
 [0.9828745  0.01712552 0.9792094  ... 0.01742494 0.97953975 0.02046019]
 [0.9779886  0.02201145 0.9780286  ... 0.02184003 0.9782518  0.0217482 ]]

[[0.33596185 0.6640381  0.30375123 ... 0.39081636 0.2786811  0.7213189 ]
 [0.6212505  0.37874946 0.12344692 ... 0.3960406  0.4638915  0.53610843]
 [0.53112626 0.46887374 0.17611167 ... 0.35515028 0.5405665  0.4594335 ]
 ...
 [0.9766356  0.02336443 0.9767829  ... 0.02318294 0.9777579  0.02224212]
 [0.98007303 0.019927   0.97990566 ... 0.0168435  0.98193043 0.01806954]
 [0.9779488  0.02205116 0.9778461  ... 0.02168869 0.97793293 0.02206702]]
```
The issue is that in sl_utils.py:304, `confs = segment_labels[:,1]` extracts
[0.02215527 0.01234164 0.01420039 ... 0.00851282 0.00872501 0.01109659]
which do not look like confidence values. Is my model output incorrect because of the input image? My input is:

```python
for img_path in glob.glob('./examples_images/*'):
    img = cv2.imread(img_path)
    images_orig.append(np.copy(img))
    h, w = image_size
    resized_img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)
    resized_img = resized_img[:, :, (2, 1, 0)] / 255  # BGR to RGB
    images.append(resized_img)

images = np.asarray(images)
preds = det_model.predict(images, batch_size=1, verbose=1)
```

Attached is my Python file: sl_crnn.py.txt
Thank you once again for a very helpful repo. Would appreciate your kind help on this.