liuliu / ccv

C-based/Cached/Core Computer Vision Library, A Modern Computer Vision Library
http://libccv.org

ccv_swt_default_params fields: any docs? I need to trim word detection... #190

Open marcolino opened 8 years ago

marcolino commented 8 years ago

I ran swtdetect (which uses ccv_swt_detect_words to detect words) on a scanned page from a book, then post-processed the result to draw a red box around each detected "word". With the default params I get this result:

[Image: lenin14img_0002-boxed]

Input is a 70 MB TIFF image (@ 600 x 600 dpi, 6992 x 4920 pixels).

As you can see, it is almost perfect, but it has some false negatives (missed words) and some false positives.

Since my goal is to detect the page's outer "margins" (the top, right, bottom, and left areas with no text), the result is spoiled by the two false positives at the bottom.
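To make the goal concrete, here is a minimal sketch of the margin computation I have in mind, with one possible workaround for stray boxes. It assumes the detected boxes are available as plain x/y/width/height rectangles; `word_box_t`, `margins_t`, `row_mates`, and `page_margins` are names made up for this sketch, not part of ccv. The filter (skip any box that shares its row with too few other boxes) is only a guess at what would suppress isolated false positives.

```c
#include <stddef.h>

/* Hypothetical box type for this sketch: top-left corner plus size, in pixels. */
typedef struct { int x, y, width, height; } word_box_t;

/* Widths of the empty bands around the text, in pixels. */
typedef struct { int top, right, bottom, left; } margins_t;

/* Count the other boxes whose vertical extent overlaps that of boxes[i]. */
static int row_mates(const word_box_t *boxes, size_t n, size_t i)
{
    int count = 0;
    int top = boxes[i].y, bottom = boxes[i].y + boxes[i].height;
    for (size_t j = 0; j < n; j++) {
        if (j == i) continue;
        int t = boxes[j].y, b = boxes[j].y + boxes[j].height;
        if (t < bottom && b > top) count++;
    }
    return count;
}

/* Compute margins from the boxes that have at least min_row_mates
 * neighbours on the same row. Returns the number of boxes used;
 * *out is written only when that number is non-zero. */
static size_t page_margins(const word_box_t *boxes, size_t n,
                           int page_w, int page_h,
                           int min_row_mates, margins_t *out)
{
    int x0 = page_w, y0 = page_h, x1 = 0, y1 = 0;
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (row_mates(boxes, n, i) < min_row_mates) continue; /* likely stray box */
        int r = boxes[i].x + boxes[i].width;
        int b = boxes[i].y + boxes[i].height;
        if (boxes[i].x < x0) x0 = boxes[i].x;
        if (boxes[i].y < y0) y0 = boxes[i].y;
        if (r > x1) x1 = r;
        if (b > y1) y1 = b;
        used++;
    }
    if (used == 0) return 0;
    out->left = x0;
    out->top = y0;
    out->right = page_w - x1;
    out->bottom = page_h - y1;
    return used;
}
```

The row check is quadratic in the number of boxes, which should be fine for a single page. It would also drop genuine one-word lines such as a page number, so I would still prefer to fix this at the detection stage if the params allow it.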

Is there any documentation on the ccv_swt_default_params fields that would help me avoid these false positives (and possibly reduce the false negatives, too)?

const ccv_swt_param_t ccv_swt_default_params = {
    .interval = 1,
    .same_word_thresh = { 0.1, 0.8 },
    .min_neighbors = 1,
    .scale_invariant = 0,
    .size = 3,
    .low_thresh = 124,
    .high_thresh = 204,
    .max_height = 300,
    .min_height = 8,
    .min_area = 38,
    .letter_occlude_thresh = 3,
    .aspect_ratio = 8,
    .std_ratio = 0.83,
    .thickness_ratio = 1.5,
    .height_ratio = 1.7,
    .intensity_thresh = 31,
    .distance_ratio = 2.9,
    .intersect_ratio = 1.3,
    .letter_thresh = 3,
    .elongate_ratio = 1.9,
    .breakdown = 1,
    .breakdown_ratio = 1.0,
};
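
For reference, this is roughly how I would expect to use the defaults once I know which fields matter: copy them and override a few. The sketch below uses a stand-in struct (`swt_params_sketch_t`, a hypothetical name) carrying only three of the fields listed above, so that it compiles without ccv. That min_height and min_area are the right fields to adjust is an assumption based on their names alone, and the values passed in are placeholders.

```c
/* Stand-in for a few ccv_swt_param_t fields (hypothetical type for this
 * sketch; the real struct is declared in ccv.h and has many more fields). */
typedef struct {
    int max_height;
    int min_height;
    int min_area;
} swt_params_sketch_t;

/* Same values as the corresponding fields of ccv_swt_default_params above. */
static const swt_params_sketch_t swt_defaults_sketch = {
    .max_height = 300,
    .min_height = 8,
    .min_area = 38,
};

/* Return a copy of the defaults with stricter lower bounds.
 * Assumption: larger min_height/min_area reject small noise components. */
static swt_params_sketch_t with_letter_limits(int min_height, int min_area)
{
    swt_params_sketch_t p = swt_defaults_sketch; /* struct copy, defaults untouched */
    p.min_height = min_height;
    p.min_area = min_area;
    return p;
}
```

With the real library the same pattern would presumably be `ccv_swt_param_t params = ccv_swt_default_params;` followed by field assignments, before passing `params` to ccv_swt_detect_words.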