Consider the function calc_max_gap_dist in the coreimage.py file. The function is used in the imagetranslation.py file like this:

    ...
    from src.modules.coreimage import calc_max_gap_dist
    ...
    def translate():
        ...
        MAXIMUM_GAP_DIST = calc_max_gap_dist(image)
        ...
        for msection in merged_sections:
            if should_merge_sections(
                msection, section, MAXIMUM_GAP_DIST, image
            ):
                ...

In short, calc_max_gap_dist calculates the maximum gap that two text sections may have between them and still be considered for merging. If the gap between two text sections is greater than this value, we don't merge them and treat them as separate. If it is lower than this value, we consider merging them, as per the algorithm:

    def should_merge_sections():
        ...
        if parent_section.overlaps(child_section):
            return True
        elif parent_section.y_difference(child_section) < MAXIMUM_GAP_DIST:
            if parent_section.in_x_axis_range(child_section):
                return True
        elif parent_section.x_difference(child_section) < MAXIMUM_GAP_DIST:
            if parent_section.in_y_axis_range(child_section):
                return True
        return False

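
To make the role of the threshold concrete, here is a minimal sketch of the vertical branch of that check only (the y_difference / in_x_axis_range pair). The `Box` class and the helpers `y_gap`, `shares_x_range` and `merges_vertically` are hypothetical stand-ins, not the project's section class, and the gap definition used here (edge-to-edge distance, 0 when the extents overlap) is an assumption about what y_difference measures.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical text section: left, top, right, bottom in pixels."""
    left: float
    top: float
    right: float
    bottom: float


def y_gap(a: Box, b: Box) -> float:
    # Assumed meaning of y_difference: edge-to-edge vertical distance,
    # 0 when the vertical extents of the two boxes overlap.
    return max(b.top - a.bottom, a.top - b.bottom, 0.0)


def shares_x_range(a: Box, b: Box) -> bool:
    # Assumed meaning of in_x_axis_range: the horizontal extents intersect.
    return a.left <= b.right and b.left <= a.right


def merges_vertically(a: Box, b: Box, max_gap: float) -> bool:
    # Mirrors only the y_difference / in_x_axis_range branch of the check.
    return y_gap(a, b) < max_gap and shares_x_range(a, b)


line_1 = Box(0, 0, 100, 20)
line_2 = Box(0, 25, 100, 45)   # starts 5 px below line_1
line_3 = Box(0, 55, 100, 75)   # starts 10 px below line_2

print(merges_vertically(line_1, line_2, 7.5))  # True
print(merges_vertically(line_2, line_3, 7.5))  # False
```

With a threshold of 7.5, lines that are 5 px apart are merged while lines that are 10 px apart are kept separate, so the chosen value directly decides which neighbouring lines end up in the same section.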
However, our current calc_max_gap_dist uses 7.5 pixels as the maximum gap threshold for an image 1000 pixels in height. We need to determine whether this is really ideal and, if it isn't, how we should calculate this value, since we use it to decide whether two sections should be merged.
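
As a reference point for that discussion, the sketch below shows what the threshold would be at other image heights if the 7.5 px at 1000 px figure were scaled linearly. Linear scaling is an assumption made here for illustration; it is not confirmed to be how calc_max_gap_dist works, and `proportional_gap` is a hypothetical name.

```python
REFERENCE_HEIGHT = 1000  # image height in pixels, from the figure quoted above
REFERENCE_GAP = 7.5      # threshold in pixels at that height


def proportional_gap(image_height: int) -> float:
    """Hypothetical: scale the 7.5 px / 1000 px figure linearly with height."""
    return image_height * REFERENCE_GAP / REFERENCE_HEIGHT


for height in (500, 1000, 2000):
    print(height, proportional_gap(height))
# 500 3.75
# 1000 7.5
# 2000 15.0
```

Under this assumption the threshold is 0.75% of the image height and depends on nothing else, so two images of the same height get the same threshold regardless of how large their text is.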