Closed timprepscius closed 11 years ago
Actually, that's pretty much it, if you want to do interactive grounding :) Unfortunately there's no formal documentation on anything at the moment, but the UserGrounder docstring should get you going. If you're sure the text segments are correctly detected, you can also supply the ground text as a string to grounding.TextGrounder.
As you may have noticed, grounding.Grounder.ground() calls files.ImageFile.set_ground. By default, that method sets the ground labeling only in memory (it doesn't write it to disk). For that, you may want to try this:
```python
grounder.ground(test_image, test_segments)
test_image.ground.write()
```
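To make the in-memory vs. on-disk distinction concrete, here is a minimal illustrative mock (these are not the project's actual classes): set_ground only updates state in memory, and ground.write() is the explicit step that persists it.

```python
# Illustrative mock of the behavior described above (hypothetical classes,
# not the project's real ImageFile/GroundFile implementations).
class Ground:
    def __init__(self):
        self.labels = None
        self.on_disk = False

    def write(self):
        # in the real project this serializes the labeling next to the image
        self.on_disk = True

class MockImageFile:
    def __init__(self):
        self.ground = Ground()

    def set_ground(self, segments, labels):
        self.ground.labels = labels  # memory only; nothing written yet

img = MockImageFile()
img.set_ground(segments=[(0, 0, 10, 10)], labels=["a"])
in_memory_only = not img.ground.on_disk  # True: not persisted yet
img.ground.write()
persisted = img.ground.on_disk           # True: now written
```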
I hope that clears things up a bit.
Hello. I don't understand how to ground a file. I have tried tweaking the code to create segments, but haven't been able to. I tried to use UserGrounder because I thought it would be easier. My code looks like this:
```python
testpicture = ImageFile(r"\Users\user\Desktop\words.jpg")
testclass, testsegment = ocr.ocr(testpicture, show_steps=True)
Jake = UserGrounder()
Jake.ground(testpicture, testsegment)
```
I continue to get this error:
```
compactness, classified_points, means = cv2.kmeans(
    data=ys, K=k, bestLabels=None,
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_MAX_ITER, 1, 10),
    attempts=2, flags=cv2.KMEANS_PP_CENTERS)
error: ..\..\..\..\opencv\modules\core\src\matrix.cpp:2702: error: (-215) N >= K in function cv::kmeans
```
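For what it's worth, the `(-215) N >= K` assertion means cv2.kmeans received fewer samples (N) than the requested number of clusters (K); a likely cause here is that segmentation detected no (or very few) characters. A minimal numpy-only sketch of the failing condition (the variable names `ys` and `k` follow the traceback; the guard itself is not part of the project):

```python
import numpy as np

# kmeans needs at least K samples; if segmentation found nothing, ys is empty
ys = np.array([], dtype=np.float32)  # e.g. y-coordinates of detected segments
k = 3                                # illustrative cluster count
if len(ys) < k:
    message = "too few segments for k-means; check that segmentation found characters"
else:
    message = "enough samples to cluster"
```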
Do you have any advice on how to go forward? I'm a beginner at coding, but I feel I've been trying the right things.
@jakeboydston That's interesting. I think you're on the right track. I'm not sure ImageFile takes a path like that, but if it didn't complain, it should be OK. Did the segmentation succeed? (You should see rectangles around your characters, IIRC.)
Can you post the full stack trace and the words.jpg file (or something equivalent that triggers it)?
Hi there,
@goncalopp thanks for sharing this project. I've been playing around with it and just ran into something that's not really clear to me.
Could you please explain what the arguments (passed to ContourSegmenter() and SimpleFeatureExtractor()) stand for?
```python
segmenter = ContourSegmenter(blur_y=5, blur_x=5, block_size=11, c=10)
extractor = SimpleFeatureExtractor(feature_size=10, stretch=False)
```
Any help would be highly appreciated.
Hi JoeTurtle,
The pipeline architecture makes the parameters harder to follow than they should be, but it's actually quite simple.
As you can see in segmentation.py:

```python
class ContourSegmenter(FullSegmenter):
    def __init__(self, **args):
        filters = create_default_filter_stack()
        stack = [BlurProcessor(), RawContourSegmenter()] + filters + [SegmentOrderer()]
```

a segmenter is a pipeline composed of multiple steps.
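The stack idea can be sketched with toy steps (the class names here are illustrative, not the project's actual processors): each element transforms the output of the previous one.

```python
# Toy pipeline: each step processes the previous step's output,
# mirroring how the segmenter stack chains its processors.
class Add:
    def __init__(self, n):
        self.n = n

    def process(self, x):
        return x + self.n

class Mul:
    def __init__(self, n):
        self.n = n

    def process(self, x):
        return x * self.n

def run_stack(stack, value):
    for step in stack:
        value = step.process(value)
    return value

result = run_stack([Add(1), Mul(2)], 3)  # (3 + 1) * 2 = 8
```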
blur_y=5, blur_x=5 are parameters for BlurProcessor, which applies a Gaussian blur to the image as a pre-processing step. These parameters basically define the size of the Gaussian kernel (informally, the "blur amount").
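To illustrate how a kernel size like blur_x=5, blur_y=5 translates into a Gaussian kernel, here is a numpy-only sketch (the project itself delegates the blur to OpenCV; the sigma value here is arbitrary and purely illustrative):

```python
import numpy as np

def gaussian_kernel_1d(size, sigma=1.0):
    # symmetric 1-D Gaussian, normalized to sum to 1
    xs = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-xs ** 2 / (2 * sigma ** 2))
    return k / k.sum()

# blur_x=5, blur_y=5 -> a 5x5 separable kernel; a larger size blurs more
kernel2d = np.outer(gaussian_kernel_1d(5), gaussian_kernel_1d(5))
```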
block_size=11, c=10 are parameters for RawContourSegmenter, which feeds them straight into cv2.adaptiveThreshold, so you can read the documentation there. Basically, they control the thresholding.
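As a rough illustration of what those two parameters do, here is a simplified mean-based version of adaptive thresholding in plain numpy (OpenCV's implementation also offers a Gaussian-weighted variant and is far faster; this sketch is only for intuition):

```python
import numpy as np

def adaptive_threshold(img, block_size=11, c=10):
    # each pixel is compared against the mean of its block_size x block_size
    # neighborhood, shifted down by the constant c
    pad = block_size // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            block = padded[i:i + block_size, j:j + block_size]
            out[i, j] = 255 if img[i, j] > block.mean() - c else 0
    return out

img = np.full((20, 20), 200, dtype=np.uint8)  # light background
img[5:15, 5:15] = 50                          # dark "character"
binary = adaptive_threshold(img, block_size=11, c=10)
```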
feature_size is simply the (square root of the) size of the feature vector used as input to the classification algorithm. In other words, each potential character is resized to a feature_size x feature_size (square) image before being fed into the actual learning mechanisms.

stretch controls whether the original character image is stretched or cropped in order to be square.
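That resize-and-flatten step can be sketched in plain numpy (nearest-neighbour resize shown only for illustration; the real logic lives in SimpleFeatureExtractor):

```python
import numpy as np

def extract_features(char_img, feature_size=10):
    # nearest-neighbour resize to feature_size x feature_size, then flatten
    h, w = char_img.shape
    rows = np.arange(feature_size) * h // feature_size
    cols = np.arange(feature_size) * w // feature_size
    resized = char_img[np.ix_(rows, cols)]
    return resized.flatten()

char = np.full((30, 17), 128, dtype=np.uint8)       # a segmented character image
features = extract_features(char, feature_size=10)  # 10 * 10 = 100 values
```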
I hope that helps :) If you can contribute documentation or docstrings while going through the code, I'll gladly accept it.
This is a really interesting project.
Is there a best way to create the grounding for an image file?
I did this:
But somehow I don't think I should have :-)
How did you create the groundings?