apsexton / bateman-ocr

Tools and experiments in the OCR of the Bateman Manuscripts
ISC License

Provide manual selection groups of neighbouring CCs #14

Open apsexton opened 8 years ago

apsexton commented 8 years ago

This feature is to support manual selection of groups of neighbouring CCs so that they can (in a future task) be labelled.

Efficient ground truth labelling of connected components requires minimal switches between keyboard and mouse use. CCs on the screen image need to be selected so that they can be labelled with the characters they represent. However, characters are often broken so that the character really appears as two or more connected components. Sometimes two or more different characters touch in the image, so that there is only a single CC for more than one character.

Implement a way to have a "current selection group": this should be a list of the selected CCs. All selected CCs should be drawn on the image in a different colour to distinguish them from the non-selected CCs. Selecting a CC or a group of CCs simply means adding them to the current selection group.
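A minimal sketch of such a selection group, in Python for illustration only. The `CC` type, the `SelectionGroup` class and the colour names are hypothetical and not taken from the repository; a CC is reduced to an id and a bounding box.

```python
from dataclasses import dataclass, field

# Hypothetical colour names; the real drawing code may use anything.
SELECTED_COLOUR = "red"
UNSELECTED_COLOUR = "black"


@dataclass(frozen=True)
class CC:
    """A connected component reduced to an id and its bounding box (assumed shape)."""
    id: int
    x0: int
    y0: int
    x1: int
    y1: int


@dataclass
class SelectionGroup:
    """The current selection group: an ordered list of selected CCs."""
    members: list = field(default_factory=list)

    def add(self, ccs):
        # Selecting simply means adding to the group; duplicates are skipped.
        for cc in ccs:
            if cc not in self.members:
                self.members.append(cc)

    def clear(self):
        self.members.clear()

    def colour_for(self, cc):
        # Selected CCs are drawn in a different colour from the rest.
        return SELECTED_COLOUR if cc in self.members else UNSELECTED_COLOUR
```

A list keeps the order in which CCs were added, which may be useful when a broken character is later labelled as one unit.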

The following is a suggestion but alternative approaches that provide similar functionality can be considered:

Using the keyboard:

If the selection group is empty, pressing an arrow key selects a CC in the corresponding corner (right = top right, up = top left, left = bottom left, down = bottom right). If the selection group is not empty, the arrow key selects the next possible CC in the corresponding direction. Shift-arrow extends the selection group by adding the next possible CC in the corresponding direction. Control-arrow reduces the selection by removing the CC in the selection group that is furthest in the corresponding direction.
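Two of the keyboard rules above can be sketched as follows: picking a corner CC when nothing is selected, and the control-arrow reduction. This is an illustration under assumptions: boxes are `(x0, y0, x1, y1)` tuples in image coordinates (y grows downward), the down arrow is assumed to map to the bottom-right corner, "in a corner" is approximated by a simple distance score, and the function names are hypothetical.

```python
# Score a box by how close it lies to a given corner; smaller is closer.
CORNER_KEYS = {
    "up": lambda b: b[0] + b[1],      # top left
    "right": lambda b: -b[2] + b[1],  # top right
    "left": lambda b: b[0] - b[3],    # bottom left
    "down": lambda b: -b[2] - b[3],   # bottom right (assumed)
}

# Score a box by how far it extends in a given direction; larger is further.
FURTHEST_KEYS = {
    "up": lambda b: -b[1],
    "down": lambda b: b[3],
    "left": lambda b: -b[0],
    "right": lambda b: b[2],
}


def corner_cc(boxes, direction):
    """With an empty selection, pick the CC nearest the arrow's corner."""
    return min(boxes, key=CORNER_KEYS[direction])


def reduce_selection(selection, direction):
    """Control-arrow: drop the selected CC that is furthest in `direction`."""
    if not selection:
        return []
    victim = max(selection, key=FURTHEST_KEYS[direction])
    remaining = list(selection)
    remaining.remove(victim)
    return remaining
```

Ties are resolved by list order here, so a caller that wants fully predictable behaviour would sort the boxes first.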

Using the mouse:

Dragging should select those CCs that intersect the dragged-out area. Clicking or dragging replaces the existing selection with the new selection. Shift-clicking or shift-dragging should add the new CCs to the existing selection. Control-clicking or control-dragging should toggle membership of the newly selected items in the existing selection (adding them if they were not already in it, removing them if they were).
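The three mouse modes map naturally onto set operations. The sketch below assumes CCs are stored as a dict from id to an `(x0, y0, x1, y1)` bounding box and that a click is passed as a zero-size area; the function names are hypothetical.

```python
def intersects(box, area):
    """True if two axis-aligned rectangles (x0, y0, x1, y1) overlap or touch."""
    ax0, ay0, ax1, ay1 = box
    bx0, by0, bx1, by1 = area
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def apply_mouse_selection(current, ccs, area, shift=False, control=False):
    """Return the new set of selected CC ids after a click or drag.

    current: set of currently selected CC ids
    ccs:     dict mapping CC id -> bounding box
    area:    dragged-out rectangle; a click is (x, y, x, y)
    """
    hit = {cc_id for cc_id, box in ccs.items() if intersects(box, area)}
    if control:
        return current ^ hit  # toggle membership of the hit CCs
    if shift:
        return current | hit  # add to the existing selection
    return hit                # plain click or drag replaces the selection
```

For example, with CCs 2 and 3 selected, control-dragging over CCs 1 and 2 leaves 1 and 3 selected.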

kevinshen100 commented 8 years ago

How would I distinguish between drawing a rectangle and selecting CCs? Or would both occur at the same time on mouse drag?

kevinshen100 commented 8 years ago

Also, it seems that the large grid in sample.tif page 1 itself is a connected component. Dragging anywhere inside the box will automatically select this component. How do you think I should go about this? Is it supposed to happen, or do you want the program to ignore connected components above a certain size?

apsexton commented 8 years ago

Yes. For the moment ignore CCs that are big in both horizontal AND vertical directions. This should exclude grids but still leave large braces. The proper solution would require analysing the actual image of the large CCs to determine if it is a table and deal with it accordingly. However, that is a refinement that we can deal with later.
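A sketch of that filter, assuming bounding boxes as `(x0, y0, x1, y1)` tuples. The threshold of half the page in each direction is an arbitrary placeholder and would need tuning on real pages.

```python
def is_selectable(box, page_width, page_height, max_fraction=0.5):
    """Reject CCs that are big in BOTH directions (e.g. table grids).

    A CC that is large in only one direction, such as a tall brace or a
    long horizontal rule, remains selectable. max_fraction is an assumed
    threshold, not a value from the project.
    """
    x0, y0, x1, y1 = box
    wide = (x1 - x0) > max_fraction * page_width
    tall = (y1 - y0) > max_fraction * page_height
    return not (wide and tall)
```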

kevinshen100 commented 8 years ago

Hi, for the keyboard selection, say I have a current selection group like the one in this screenshot: [screenshot: ground truth engine]. If I were to press the up arrow, how should I determine which connected component to select? Would it select "ECIAL" or would it select only one letter?

apsexton commented 8 years ago

It should only select one CC. Any of the E, C, I, A or L CCs are possible. The algorithm to choose it should be as predictable as possible: e.g. of all the possible next CCs, choose the leftmost for up-arrow or down-arrow, the topmost for left or right arrow.
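The tie-break described here could look like the sketch below. It assumes the set of "possible next CCs" has already been computed elsewhere and is passed in as `(x0, y0, x1, y1)` boxes in image coordinates, where a smaller y is higher on the page. The function name is hypothetical.

```python
def pick_next(candidates, direction):
    """Choose exactly one CC from the possible next CCs, predictably.

    Up/down arrows take the leftmost candidate; left/right arrows take
    the topmost. The other coordinate is used only to break exact ties.
    """
    if not candidates:
        return None
    if direction in ("up", "down"):
        return min(candidates, key=lambda b: (b[0], b[1]))  # leftmost
    return min(candidates, key=lambda b: (b[1], b[0]))      # topmost
```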

kevinshen100 commented 8 years ago

Hi, I have another question. Say my current selection looks like this: [screenshot: ground truth engine 1]

If I pressed up this time, would you prefer it selecting something in the middle or something at the top?

apsexton commented 8 years ago

Take the rectangle union of your whole selection: this gives the tightest possible rectangle that encloses all the selected CCs. The arrows should then work with respect to this enclosing rectangle.
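As a sketch, the enclosing rectangle is the min/max over the selected bounding boxes. The second function shows one possible reading of "with respect to this enclosing rectangle": a CC counts as a candidate if it lies entirely beyond the matching edge. That rule, the tuple layout `(x0, y0, x1, y1)` and both function names are assumptions.

```python
def bounding_union(boxes):
    """Tightest rectangle enclosing all given (x0, y0, x1, y1) boxes."""
    if not boxes:
        raise ValueError("selection is empty")
    x0s, y0s, x1s, y1s = zip(*boxes)
    return (min(x0s), min(y0s), max(x1s), max(y1s))


def candidates_beyond(enclosing, others, direction):
    """Unselected boxes lying wholly past the enclosing rectangle's edge.

    Image coordinates are assumed, so "up" means a smaller y value.
    """
    ex0, ey0, ex1, ey1 = enclosing
    tests = {
        "up": lambda b: b[3] < ey0,
        "down": lambda b: b[1] > ey1,
        "left": lambda b: b[2] < ex0,
        "right": lambda b: b[0] > ex1,
    }
    return [b for b in others if tests[direction](b)]
```

With this reading, a CC sitting inside a gap in an oddly shaped selection is never offered by the arrow keys, so the mouse remains the fallback for such cases.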

In practice, one would never normally use such a selection, but even if one did, the arrows are really just shortcuts for selecting a suitable next selection. The idea is that "next" is something reasonable and predictable; it doesn't have to be perfect. If it is good enough, it can save considerable time in ground-truthing a document, but one must expect ambiguities as to what should be next and allow the user to override it from time to time by, for example, switching to the mouse or using a sequence of arrow keys.