Closed: michath closed this issue 8 years ago
Hi,
On 08/16/2016 04:58 PM, michath wrote:
hi, running tools/segmentation.py creates /tmp/base_chars.png:
- i was not able to determine where exactly in code it is created
src/Python/pyFastText.cpp : get_char_segmentations
- i guess it contains the resulting segmentation, which is bad
the "chars" visualization is rubbish:
- it only works on simple images: segmentations from all scales are drawn together, so bigger segmentations hide smaller ones, etc.
- i wonder if the bad result is due to an unfit cvBoostChar.xml
probably not; you can try just drawing the bounding boxes of the segmentations (the first 4 columns of the segmentation array).
- which API should i use to recreate cvBoostChar.xml myself?
no API provided.
All the best, Michal
- any code example for such call? thanks!!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/MichalBusta/FASText/issues/10, or mute the thread https://github.com/notifications/unsubscribe-auth/AD6jsA6wEZLg18jvGtetkc0t-PHmAz5Sks5qgc_5gaJpZM4JlgNv.
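For the suggestion above to draw just the bounding boxes, here is a minimal sketch using only NumPy. It assumes the first four columns of the segmentation array are [x, y, width, height] in pixel coordinates; the helper name `draw_boxes` and the sample rows are made up for illustration and are not part of FASText.

```python
import numpy as np

def draw_boxes(image, segmentations, value=255):
    """Draw 1-pixel rectangle outlines for each segmentation row.

    Assumes columns 0-3 are [x, y, width, height] in pixel coordinates.
    """
    canvas = image.copy()
    h, w = canvas.shape[:2]
    for x, y, bw, bh in segmentations[:, :4].astype(int):
        # Clip to the image so boxes touching the border do not raise.
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + bw, w) - 1, min(y + bh, h) - 1
        if x1 < x0 or y1 < y0:
            continue  # box lies entirely outside the image
        canvas[y0, x0:x1 + 1] = value   # top edge
        canvas[y1, x0:x1 + 1] = value   # bottom edge
        canvas[y0:y1 + 1, x0] = value   # left edge
        canvas[y0:y1 + 1, x1] = value   # right edge
    return canvas

# Synthetic stand-in for the real segmentations array (values are made up).
segmentations = np.array([
    [2, 1, 4, 3, 3, 2, 0, 0, 0, 0.9],
    [0, 0, 2, 2, 1, 1, 1, 0, 0, 0.1],
], dtype=np.float32)
image = np.zeros((6, 8), dtype=np.uint8)
boxed = draw_boxes(image, segmentations)
```

Because only outlines are drawn, large boxes no longer cover the small ones inside them. If OpenCV is available, `cv2.rectangle` could replace the manual edge drawing.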
thanks, but - array (char) segmentations has ~2000 rows per source image given in figure 11 (top row) of the paper. how can this number be narrowed down to produce the resulting text in figure 11 (bottom row)? [i guess this large number explains /tmp/base_chars.png] thanks again :)
On 08/17/2016 10:26 AM, michath wrote:
thanks, but - array (char) segmentations has ~2000 rows per source image given in figure 11 (top row) of the paper.
correct, see Table 2 for the expected numbers. The FASText keypoint detector is a very low-level preprocessing stage, and it does its job well: it samples the character segmentations, i.e. it is a low-cost approximation of CSER.
how can this number be narrowed down to produce the resulting text in figure 11 (bottom row)? [i guess this large number explains /tmp/base_chars.png]
text clustering and classification.
- nowadays the best you can do is feed the text-line proposals to some powerful classifier (such as the one from VGG: http://www.robots.ox.ac.uk/~vgg/research/text/)
thanks again :)
On 08/17/2016 04:50 PM, michath wrote:
thanks, but:
- where in code do i get the features for such classifier?
CharClassifier.cpp -> extractCharFeatures
- the paper mentions four features, but cvBoostChar.xml has six.
yes, the code evolved between the publication deadline and the release. If desired, I can publish a change list (there are also small changes in pattern checking, etc.).
- where in code do i get the output of AdaBoost? is it applied? thanks a lot!
the output of the classifier is in the segmentations array, see the quality column. The call returns an np array where each row is: [bbox.x, bbox.y, bbox.width, bbox.height, keyPoint.pt.x, keyPoint.pt.y, octave, ?, duplicate, quality, [keypointsIds]]
there is no filtering of the segmentations, you can just query for quality: segmentations[:, 9] > q
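The quality query can be sketched as follows. Column 9 is taken from the row layout described above; the threshold `q=0.5`, the helper name `filter_by_quality`, and the sample rows are assumptions for illustration only, since the thread does not state the range of the quality score.

```python
import numpy as np

QUALITY_COL = 9  # position of "quality" in the row layout given in the thread

def filter_by_quality(segmentations, q):
    """Return only the rows whose quality score exceeds q."""
    mask = segmentations[:, QUALITY_COL] > q
    return segmentations[mask]

# Synthetic stand-in for the real segmentations array (values are made up).
segmentations = np.array([
    [10, 12, 8, 14, 14, 19, 0, 0, 0, 0.92],
    [40, 11, 9, 15, 44, 18, 1, 0, 0, 0.15],
    [70, 13, 7, 13, 73, 20, 0, 0, 1, 0.61],
], dtype=np.float32)

kept = filter_by_quality(segmentations, q=0.5)  # q is a hypothetical threshold
boxes = kept[:, :4].astype(int)                 # [x, y, width, height] of survivors
```

A suitable `q` would have to be chosen empirically, for example by raising it until the number of remaining rows becomes manageable for the later clustering and classification steps.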