mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0

which model to use for segmentation [blla]? #177

Closed Shreeshrii closed 4 years ago

Shreeshrii commented 4 years ago

kraken, version 3.0.0.0b4.dev9

Since blla.model is not available, I tested using offsplit.mlmodel and cbad.mlmodel that have been referred to in other posts, with an image with Devanagari script.

p0015.png
time -p kraken -i p0015.png p0015-offsplit.json binarize segment -i /home/ubuntu/kraken/deva/models/offsplit.mlmodel
[0.0030] Baseline model (/home/ubuntu/kraken/deva/models/offsplit.mlmodel) given but legacy segmenter selected. Forcing to -bl.
Binarizing      ✓
WARNING: Logging before flag parsing goes to stderr.
W0222 04:36:19.087795 137579673896576 __init__.py:74] TensorFlow version 1.15.0 detected. Last version known to be fully compatible is 1.14.0 .
Segmenting      ✓
real 32.20
user 31.84
sys 0.54

time -p kraken -i p0015.png p0015-cbad.json binarize segment -i /home/ubuntu/kraken/deva/models/cbad.mlmodel
[0.0031] Baseline model (/home/ubuntu/kraken/deva/models/cbad.mlmodel) given but legacy segmenter selected. Forcing to -bl.
Binarizing      ✓
WARNING: Logging before flag parsing goes to stderr.
W0222 04:34:34.412533 132354489861760 __init__.py:74] TensorFlow version 1.15.0 detected. Last version known to be fully compatible is 1.14.0 .
Segmenting      ✓
real 32.32
user 31.99
sys 0.61

Both models produce the same output with the above commands.

{"text_direction": "horizontal-lr", "boxes": [[865, 269, 1341, 358], [231, 397, 1200, 489], [233, 511, 1209, 591], [233, 656, 1240, 749], [233, 774, 1289, 866], [233, 914, 1251, 1007], [233, 1027, 1319, 1120], [234, 1171, 1241, 1268], [235, 1288, 1282, 1378], [234, 1430, 1197, 1522], [233, 1543, 1274, 1635], [233, 1687, 1300, 1779], [233, 1801, 1226, 1893], [233, 1945, 1307, 2037], [233, 2059, 1258, 2151], [233, 2172, 1187, 2264], [233, 2316, 1205, 2409], [232, 2430, 1316, 2527], [231, 2543, 1249, 2636], [234, 2688, 1328, 2780], [233, 2802, 1294, 2882], [235, 2946, 1294, 3044], [233, 3063, 1940, 3153]], "script_detection": false}

However, when I add -bl to the above commands, offsplit.mlmodel gives a different output while cbad.mlmodel gives an error.

{"text_direction": "horizontal-lr", "type": "baselines", "lines": [{"script": "default", "baseline": [[852, 326], [1222, 323], [1364, 332]], "boundary": [[852, 326], [858, 268], [892, 245], [976, 265], [1124, 265], [1150, 245], [1257, 265], [1317, 245], [1358, 280], [1364, 332], [1326, 349], [1167, 346], [1112, 392], [855, 392]]}, {"script": "default", "baseline": [[248, 343], [352, 337]], "boundary": [[248, 343], [248, 280], [343, 291], [352, 337], [335, 352], [254, 352]]}, {"script": "default", "baseline": [[1248, 387], [2002, 378]], "boundary": [[1248, 387], [1251, 349], [1583, 340], [1962, 346], [2002, 378], [1965, 401], [1892, 404], [1251, 398]]}, {"script": "default", "baseline": [[228, 447], [627, 462], [1231, 465]], "boundary": [[228, 447], [234, 398], [346, 398], [364, 387], [398, 398], [436, 384], [520, 398], [658, 395], [682, 413], [734, 387], [777, 398], [1063, 398], [1095, 387], [1176, 401], [1231, 465], [1196, 488], [1167, 473], [1138, 496], [1072, 482], [1054, 493], [1025, 476], [959, 488], [933, 476], [910, 488], [892, 473], [878, 485], [823, 476], [734, 488], [702, 465], [679, 485], [635, 493], [546, 470], [528, 488], [502, 473], [471, 488], [404, 488], [387, 476], [294, 499], [231, 465]]}, {"script": "default", "baseline": [[219, 563], [445, 577], [1222, 577]], "boundary": [[219, 563], [225, 514], [300, 496], [424, 511], [453, 493], [476, 511], [549, 508], [586, 531], [650, 491], [682, 511], [869, 508], [913, 491], [936, 508], [1028, 514], [1075, 491], [1147, 534], [1170, 511], [1213, 514], [1222, 577], [1205, 603], [1173, 603], [1150, 586], [1129, 600], [1083, 589], [1057, 603], [1037, 589], [1020, 603], [997, 592], [971, 603], [950, 589], [881, 606], [849, 592], [832, 606], [774, 603], [757, 589], [736, 603], [690, 592], [638, 600], [592, 580], [543, 609], [520, 594], [436, 603], [419, 592], [387, 606], [369, 592], [341, 603], [288, 589], [245, 606], [222, 583]]}, {"script": "default", "baseline": [[1785, 566], [1968, 583]], "boundary": [[1785, 
566], [1794, 528], [1820, 505], [1890, 517], [1921, 508], [1962, 540], [1968, 583], [1942, 609], [1916, 597], [1884, 606], [1863, 592], [1826, 609], [1791, 571]]}, {"script": "default", "baseline": [[236, 716], [682, 722], [887, 716], [1109, 722], [1239, 716]], "boundary": [[236, 716], [239, 655], [294, 655], [317, 635], [369, 655], [413, 635], [447, 655], [473, 638], [525, 655], [551, 632], [601, 655], [716, 655], [736, 672], [791, 638], [829, 655], [1173, 655], [1196, 675], [1231, 655], [1239, 716], [1231, 748], [1207, 730], [1181, 745], [1138, 730], [1095, 748], [1025, 742], [1014, 730], [994, 745], [971, 730], [953, 745], [930, 733], [907, 748], [892, 736], [777, 745], [754, 724], [728, 745], [679, 730], [661, 745], [589, 742], [525, 759], [499, 736], [407, 745], [384, 730], [294, 765], [265, 742], [239, 742]]}, {"script": "default", "baseline": [[1817, 834], [1968, 828]], "boundary": [[1817, 834], [1826, 762], [1918, 762], [1959, 791], [1968, 828], [1944, 863], [1910, 854], [1881, 866], [1863, 849], [1846, 863], [1823, 860]]}, {"script": "default", "baseline": [[239, 837], [1161, 834], [1291, 846]], "boundary": [[239, 837], [245, 768], [369, 768], [398, 791], [473, 750], [497, 768], [575, 756], [589, 771], [684, 771], [713, 794], [771, 753], [809, 771], [1274, 765], [1286, 768], [1291, 846], [1233, 880], [1202, 866], [1161, 880], [1095, 843], [1077, 857], [1031, 857], [1005, 840], [988, 857], [945, 846], [898, 860], [875, 883], [826, 872], [809, 854], [736, 854], [722, 840], [699, 860], [606, 860], [595, 849], [580, 860], [499, 849], [447, 857], [407, 837], [384, 857], [335, 846], [317, 857], [242, 852]]}, {"script": "default", "baseline": [[231, 973], [971, 973], [1265, 982]], "boundary": [[231, 973], [236, 912], [375, 912], [410, 892], [450, 915], [485, 915], [523, 892], [551, 915], [577, 895], [621, 915], [1176, 909], [1213, 935], [1248, 912], [1260, 921], [1265, 982], [1245, 1005], [1216, 987], [1179, 1010], [1147, 993], [1115, 1022], [1034, 993], [1008, 
1005], [953, 990], [907, 1010], [890, 996], [843, 1005], [823, 990], [803, 1002], [783, 987], [739, 1002], [713, 982], [690, 1002], [635, 987], [615, 1005], [485, 1005], [465, 990], [445, 1005], [372, 1002], [352, 987], [320, 1008], [286, 993], [234, 996]]}, {"script": "default", "baseline": [[1791, 1097], [1843, 1088], [1962, 1088]], "boundary": [[1791, 1097], [1794, 1048], [1814, 1025], [1936, 1025], [1953, 1036], [1962, 1088], [1924, 1120], [1890, 1123], [1863, 1103], [1840, 1126], [1820, 1123], [1794, 1097]]}, {"script": "default", "baseline": [[219, 1094], [427, 1088], [945, 1088], [1271, 1097], [1355, 1088]], "boundary": [[219, 1094], [222, 1028], [260, 1008], [332, 1025], [369, 1005], [410, 1025], [430, 1005], [453, 1005], [491, 1028], [525, 1008], [563, 1025], [667, 1025], [687, 1005], [716, 1002], [754, 1025], [780, 1008], [817, 1025], [924, 1025], [950, 1002], [991, 1005], [1020, 1025], [1075, 1028], [1095, 1045], [1118, 1025], [1309, 1025], [1346, 1060], [1355, 1088], [1277, 1135], [1228, 1117], [1144, 1135], [1106, 1106], [1057, 1117], [1037, 1103], [1017, 1117], [950, 1117], [927, 1100], [898, 1117], [875, 1103], [780, 1117], [745, 1097], [728, 1114], [684, 1114], [664, 1100], [641, 1117], [618, 1103], [523, 1117], [491, 1094], [465, 1117], [419, 1103], [398, 1120], [260, 1117], [225, 1094]]}, {"script": "default", "baseline": [[1783, 1351], [1950, 1351]], "boundary": [[1783, 1351], [1788, 1314], [1820, 1282], [1918, 1285], [1944, 1305], [1942, 1380], [1918, 1366], [1887, 1380], [1863, 1363], [1832, 1383], [1788, 1351]]}, {"script": "default", "baseline": [[1803, 1605], [1861, 1611], [1970, 1603]], "boundary": [[1803, 1605], [1806, 1548], [1823, 1536], [1890, 1548], [1927, 1539], [1962, 1574], [1970, 1603], [1947, 1643], [1884, 1637], [1861, 1620], [1829, 1637], [1806, 1620]]}, {"script": "default", "baseline": [[1780, 2587], [1985, 2587]], "boundary": [[1780, 2587], [1817, 2538], [1918, 2538], [1985, 2587], [1942, 2639], [1916, 2628], [1887, 2639], 
[1863, 2622], [1835, 2642], [1785, 2599]]}, {"script": "default", "baseline": [[216, 3116], [352, 3125], [1306, 3122], [1447, 3128]], "boundary": [[216, 3116], [219, 3052], [242, 3038], [286, 3058], [485, 3061], [687, 3058], [716, 3035], [739, 3035], [762, 3055], [783, 3038], [812, 3038], [878, 3076], [898, 3058], [1144, 3058], [1216, 3035], [1260, 3058], [1424, 3058], [1439, 3073], [1447, 3128], [1424, 3156], [1372, 3145], [1138, 3156], [1112, 3148], [1060, 3177], [1008, 3154], [916, 3156], [892, 3139], [858, 3159], [696, 3145], [595, 3159], [511, 3142], [494, 3154], [447, 3156], [427, 3145], [291, 3159], [262, 3145], [242, 3159], [219, 3142]]}, {"script": "default", "baseline": [[1817, 1865], [1907, 1860], [1968, 1862]], "boundary": [[1817, 1865], [1820, 1793], [1939, 1805], [1959, 1822], [1968, 1862], [1942, 1886], [1890, 1897], [1863, 1877], [1823, 1894]]}, {"script": "default", "baseline": [[222, 2498], [445, 2486], [788, 2489], [1066, 2483], [1176, 2489], [1335, 2489]], "boundary": [[222, 2498], [225, 2429], [283, 2437], [346, 2405], [387, 2426], [421, 2405], [459, 2431], [569, 2429], [586, 2411], [635, 2431], [658, 2411], [684, 2411], [705, 2429], [748, 2411], [783, 2429], [884, 2429], [916, 2405], [965, 2429], [1002, 2408], [1034, 2420], [1080, 2408], [1112, 2431], [1135, 2411], [1158, 2411], [1181, 2429], [1312, 2429], [1335, 2489], [1288, 2538], [1242, 2512], [1184, 2521], [1161, 2504], [1141, 2521], [1115, 2509], [1051, 2524], [1034, 2509], [1014, 2521], [988, 2509], [971, 2521], [936, 2509], [881, 2518], [864, 2501], [826, 2518], [797, 2518], [777, 2501], [760, 2518], [693, 2527], [641, 2509], [624, 2521], [577, 2524], [557, 2509], [540, 2524], [499, 2521], [465, 2498], [445, 2518], [393, 2507], [326, 2521], [306, 2507], [265, 2541], [228, 2530]]}, {"script": "default", "baseline": [[1785, 3119], [1962, 3119]], "boundary": [[1785, 3119], [1791, 3084], [1823, 3055], [1927, 3055], [1956, 3078], [1956, 3145], [1936, 3159], [1861, 3145], [1829, 3159], 
[1791, 3125]]}, {"script": "default", "baseline": [[236, 2746], [693, 2752], [829, 2746], [1213, 2743], [1338, 2755]], "boundary": [[236, 2746], [242, 2686], [364, 2689], [384, 2668], [407, 2668], [436, 2691], [734, 2691], [852, 2686], [872, 2668], [916, 2686], [1060, 2689], [1132, 2665], [1161, 2686], [1257, 2686], [1288, 2709], [1317, 2683], [1332, 2691], [1338, 2755], [1326, 2778], [1294, 2764], [1236, 2775], [1219, 2758], [1190, 2778], [1150, 2758], [1121, 2775], [1086, 2775], [1054, 2758], [1037, 2775], [939, 2775], [924, 2764], [913, 2775], [797, 2767], [786, 2778], [745, 2778], [728, 2764], [713, 2778], [615, 2775], [586, 2798], [508, 2767], [459, 2778], [442, 2761], [421, 2778], [384, 2778], [369, 2764], [355, 2778], [291, 2764], [277, 2775], [239, 2769]]}, {"script": "default", "baseline": [[216, 1487], [315, 1496], [499, 1490], [1109, 1490], [1164, 1499]], "boundary": [[216, 1487], [222, 1432], [239, 1426], [433, 1426], [473, 1406], [497, 1426], [690, 1435], [745, 1429], [765, 1412], [838, 1429], [875, 1406], [910, 1409], [933, 1426], [1132, 1429], [1158, 1449], [1164, 1499], [1127, 1519], [1095, 1504], [1080, 1516], [1051, 1504], [968, 1519], [780, 1516], [731, 1539], [690, 1507], [676, 1519], [632, 1510], [609, 1527], [508, 1516], [488, 1499], [459, 1519], [413, 1501], [398, 1516], [355, 1519], [338, 1507], [317, 1519], [291, 1504], [268, 1519], [245, 1516], [219, 1490]]}, {"script": "default", "baseline": [[213, 1868], [338, 1862], [1184, 1862], [1265, 1868]], "boundary": [[213, 1868], [216, 1805], [239, 1796], [387, 1799], [407, 1779], [430, 1779], [456, 1802], [586, 1799], [621, 1782], [696, 1819], [716, 1799], [968, 1799], [991, 1779], [1051, 1799], [1077, 1776], [1161, 1822], [1187, 1799], [1219, 1799], [1257, 1836], [1265, 1868], [1210, 1894], [1170, 1877], [1115, 1888], [1086, 1909], [1037, 1888], [962, 1888], [947, 1877], [933, 1888], [864, 1888], [849, 1877], [832, 1888], [742, 1888], [719, 1865], [690, 1888], [598, 1888], [583, 1877], [549, 
1900], [462, 1871], [445, 1888], [361, 1888], [343, 1877], [309, 1900], [222, 1868]]}, {"script": "default", "baseline": [[1817, 2235], [1956, 2229]], "boundary": [[1817, 2235], [1820, 2169], [1872, 2169], [1904, 2186], [1927, 2169], [1947, 2174], [1947, 2255], [1878, 2267], [1863, 2252], [1849, 2267], [1823, 2264]]}, {"script": "default", "baseline": [[1794, 2850], [1849, 2868], [1944, 2850]], "boundary": [[1794, 2850], [1797, 2816], [1820, 2795], [1936, 2813], [1944, 2850], [1930, 2888], [1910, 2882], [1884, 2899], [1858, 2882], [1835, 2896], [1800, 2871]]}, {"script": "default", "baseline": [[210, 2605], [638, 2602], [999, 2605], [1124, 2611], [1265, 2605]], "boundary": [[210, 2605], [219, 2544], [312, 2541], [341, 2524], [361, 2544], [445, 2544], [471, 2524], [517, 2521], [540, 2535], [563, 2527], [583, 2544], [739, 2547], [820, 2541], [846, 2518], [887, 2541], [927, 2544], [1002, 2544], [1049, 2524], [1069, 2541], [1150, 2541], [1179, 2561], [1210, 2541], [1239, 2541], [1257, 2553], [1265, 2605], [1242, 2637], [1207, 2634], [1190, 2616], [1176, 2631], [1118, 2619], [1106, 2631], [1023, 2634], [1002, 2616], [968, 2631], [942, 2611], [921, 2631], [892, 2622], [875, 2634], [849, 2616], [832, 2631], [768, 2634], [719, 2619], [696, 2637], [676, 2622], [656, 2637], [598, 2634], [580, 2619], [566, 2634], [534, 2622], [511, 2637], [491, 2622], [456, 2654], [427, 2651], [367, 2611], [335, 2637], [315, 2619], [300, 2634], [260, 2634], [216, 2605]]}, {"script": "default", "baseline": [[231, 1343], [473, 1351], [976, 1348], [1158, 1357], [1317, 1351]], "boundary": [[231, 1343], [236, 1288], [271, 1267], [312, 1288], [606, 1288], [635, 1265], [722, 1305], [739, 1288], [861, 1285], [881, 1265], [904, 1265], [933, 1267], [953, 1288], [1193, 1285], [1219, 1308], [1242, 1285], [1280, 1285], [1312, 1317], [1317, 1351], [1277, 1377], [1245, 1377], [1222, 1360], [1129, 1395], [1034, 1363], [988, 1395], [959, 1395], [907, 1366], [757, 1377], [734, 1357], [708, 1377], [603, 1377], 
[586, 1395], [557, 1392], [508, 1360], [488, 1374], [416, 1360], [398, 1377], [369, 1363], [346, 1377], [234, 1369]]}, {"script": "default", "baseline": [[222, 2114], [447, 2125], [1271, 2125]], "boundary": [[222, 2114], [231, 2059], [410, 2056], [439, 2033], [476, 2056], [595, 2062], [641, 2042], [676, 2062], [1046, 2059], [1112, 2042], [1135, 2056], [1190, 2059], [1216, 2082], [1254, 2059], [1271, 2125], [1257, 2148], [1222, 2131], [1207, 2146], [1141, 2154], [1092, 2128], [1072, 2148], [1020, 2143], [976, 2169], [898, 2143], [835, 2146], [823, 2134], [777, 2148], [754, 2128], [722, 2148], [693, 2125], [664, 2151], [627, 2157], [479, 2137], [349, 2151], [286, 2137], [271, 2151], [245, 2151], [225, 2134]]}, {"script": "default", "baseline": [[210, 2238], [364, 2232], [1150, 2232], [1225, 2238]], "boundary": [[210, 2238], [213, 2180], [262, 2154], [309, 2172], [398, 2172], [421, 2148], [517, 2177], [566, 2151], [606, 2172], [783, 2172], [806, 2148], [858, 2172], [1098, 2172], [1121, 2192], [1144, 2172], [1181, 2172], [1216, 2206], [1225, 2238], [1207, 2238], [1181, 2264], [1150, 2264], [1129, 2247], [1109, 2261], [1049, 2261], [1011, 2241], [959, 2278], [861, 2250], [800, 2267], [768, 2252], [722, 2261], [705, 2250], [682, 2261], [664, 2247], [644, 2261], [557, 2261], [520, 2244], [497, 2261], [447, 2261], [427, 2247], [410, 2261], [271, 2264], [242, 2261], [219, 2238]]}, {"script": "default", "baseline": [[208, 3003], [468, 3003], [618, 3009], [1312, 3009]], "boundary": [[208, 3003], [213, 2966], [234, 2946], [465, 2948], [488, 2925], [514, 2925], [537, 2946], [673, 2943], [687, 2928], [725, 2946], [786, 2946], [829, 2925], [890, 2948], [913, 2925], [933, 2925], [982, 2928], [1002, 2946], [1069, 2943], [1083, 2928], [1118, 2946], [1291, 2946], [1306, 2957], [1312, 3009], [1274, 3052], [1216, 3026], [1196, 3035], [1173, 3021], [1153, 3035], [1115, 3015], [1031, 3058], [982, 3029], [939, 3038], [907, 3018], [887, 3035], [780, 3038], [719, 3012], [693, 3035], [627, 
3035], [609, 3024], [580, 3032], [546, 3015], [502, 3058], [485, 3058], [439, 3029], [413, 3035], [401, 3024], [378, 3035], [329, 3026], [309, 3041], [274, 3038], [213, 3003]]}, {"script": "default", "baseline": [[222, 2873], [349, 2865], [719, 2871], [1326, 2871]], "boundary": [[222, 2873], [225, 2793], [239, 2778], [288, 2801], [384, 2801], [419, 2784], [439, 2801], [710, 2804], [843, 2801], [869, 2778], [913, 2801], [982, 2801], [1025, 2781], [1057, 2801], [1202, 2801], [1231, 2824], [1254, 2801], [1288, 2801], [1317, 2830], [1326, 2871], [1288, 2894], [1257, 2894], [1233, 2876], [1216, 2891], [1161, 2885], [1135, 2896], [1075, 2879], [1060, 2891], [1037, 2879], [962, 2891], [936, 2871], [901, 2891], [858, 2873], [829, 2891], [806, 2879], [788, 2894], [736, 2891], [719, 2876], [690, 2891], [673, 2876], [630, 2896], [595, 2882], [551, 2891], [534, 2876], [468, 2891], [442, 2868], [410, 2891], [225, 2882]]}, {"script": "default", "baseline": [[222, 1750], [1164, 1747], [1312, 1756]], "boundary": [[222, 1750], [225, 1689], [260, 1666], [309, 1689], [462, 1689], [485, 1672], [525, 1689], [569, 1689], [606, 1666], [647, 1689], [806, 1689], [864, 1686], [907, 1663], [933, 1686], [1231, 1686], [1260, 1709], [1294, 1683], [1312, 1756], [1294, 1779], [1265, 1761], [1248, 1776], [1213, 1776], [1193, 1758], [1173, 1776], [1095, 1776], [1075, 1764], [1051, 1776], [1031, 1758], [1011, 1776], [930, 1776], [913, 1764], [898, 1776], [875, 1764], [829, 1776], [809, 1758], [788, 1776], [731, 1770], [710, 1782], [615, 1767], [583, 1796], [557, 1796], [525, 1773], [482, 1779], [465, 1764], [445, 1779], [395, 1764], [381, 1779], [326, 1779], [309, 1767], [294, 1779], [245, 1779], [225, 1764]]}, {"script": "default", "baseline": [[213, 2377], [338, 2385], [1106, 2382], [1225, 2388]], "boundary": [[213, 2377], [216, 2322], [260, 2319], [283, 2302], [317, 2319], [482, 2319], [572, 2299], [592, 2316], [644, 2319], [696, 2293], [734, 2322], [832, 2296], [881, 2322], [999, 2316], [1040, 
2299], [1057, 2316], [1135, 2316], [1161, 2336], [1196, 2313], [1216, 2330], [1225, 2388], [1205, 2405], [1170, 2388], [1127, 2405], [1083, 2403], [1066, 2388], [1037, 2414], [968, 2391], [950, 2405], [910, 2408], [884, 2397], [864, 2408], [794, 2408], [774, 2394], [736, 2408], [667, 2391], [630, 2423], [580, 2400], [508, 2414], [488, 2403], [413, 2405], [401, 2394], [375, 2408], [343, 2405], [329, 2391], [300, 2414], [274, 2403], [254, 2417], [216, 2385]]}, {"script": "default", "baseline": [[234, 1236], [872, 1233], [1260, 1241]], "boundary": [[234, 1236], [236, 1172], [369, 1169], [398, 1146], [450, 1184], [502, 1149], [549, 1169], [976, 1172], [1028, 1169], [1060, 1146], [1098, 1169], [1170, 1169], [1199, 1192], [1219, 1172], [1239, 1172], [1260, 1241], [1239, 1262], [1205, 1244], [1167, 1262], [1138, 1247], [1127, 1259], [1054, 1259], [1037, 1247], [1011, 1259], [979, 1241], [942, 1262], [907, 1250], [786, 1262], [760, 1244], [708, 1282], [670, 1273], [647, 1250], [635, 1262], [606, 1247], [575, 1262], [554, 1250], [482, 1259], [459, 1236], [361, 1282], [297, 1250], [277, 1262], [236, 1253]]}, {"script": "default", "baseline": [[222, 1603], [416, 1611], [1288, 1611]], "boundary": [[222, 1603], [225, 1542], [358, 1545], [381, 1522], [427, 1545], [476, 1545], [494, 1562], [520, 1542], [728, 1542], [751, 1559], [806, 1522], [852, 1545], [887, 1545], [910, 1525], [959, 1545], [1095, 1542], [1138, 1522], [1210, 1565], [1233, 1542], [1274, 1545], [1288, 1611], [1271, 1634], [1236, 1634], [1216, 1617], [1199, 1631], [1164, 1631], [1144, 1617], [1127, 1631], [1083, 1623], [991, 1634], [973, 1620], [956, 1634], [797, 1634], [765, 1611], [713, 1637], [682, 1614], [653, 1631], [627, 1620], [609, 1631], [546, 1631], [517, 1611], [491, 1634], [427, 1623], [413, 1634], [329, 1634], [306, 1617], [268, 1655], [225, 1640]]}, {"script": "default", "baseline": [[216, 2016], [401, 2010], [1312, 2010]], "boundary": [[216, 2016], [219, 1946], [338, 1943], [361, 1920], [407, 1946], 
[543, 1943], [563, 1926], [612, 1946], [783, 1946], [809, 1926], [835, 1926], [875, 1946], [1236, 1943], [1265, 1966], [1297, 1943], [1312, 2010], [1306, 2036], [1274, 2018], [1257, 2033], [1115, 2030], [1092, 2044], [1054, 2044], [1031, 2024], [1020, 2036], [962, 2030], [921, 2042], [907, 2030], [780, 2036], [765, 2024], [734, 2036], [713, 2018], [696, 2036], [537, 2033], [525, 2021], [497, 2033], [479, 2018], [450, 2033], [419, 2016], [398, 2033], [303, 2024], [280, 2036], [225, 2016]]}]}
 time -p kraken -i p0015.png p0015-cbad-bl.json binarize segment -i /home/ubuntu/kraken/deva/models/cbad.mlmodel -bl
WARNING: Logging before flag parsing goes to stderr.
W0222 04:34:57.645057 138423110944384 __init__.py:74] TensorFlow version 1.15.0 detected. Last version known to be fully compatible is 1.14.0 .
Loading ANN /home/ubuntu/kraken/deva/models/cbad.mlmodel        ✓
Binarizing      ✓
Segmenting      ✗
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/py36/bin/kraken", line 8, in <module>
    sys.exit(cli())
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 1164, in invoke
    return _process_result(rv)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 1102, in _process_result
    **ctx.params)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/kraken/kraken.py", line 235, in process_pipeline
    task(base_image=base_image, input=input, output=output)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/kraken/kraken.py", line 99, in segmenter
    res = blla.segment(im, text_direction, mask=mask, model=model)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/kraken/blla.py", line 112, in segment
    baselines = vectorize_lines(o)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/kraken/lib/segmentation.py", line 310, in vectorize_lines
    skel, skel_dist_map = medial_axis(bin[1], return_distance=True)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/skimage/morphology/_skeletonize.py", line 402, in medial_axis
    corner_score = _table_lookup(masked_image, cornerness_table)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/skimage/morphology/_skeletonize.py", line 483, in _table_lookup
    if image.shape[0] < 3 or image.shape[1] < 3:
IndexError: tuple index out of range
real 72.60
user 325.55
sys 1.34
Shreeshrii commented 4 years ago

Questions:

  1. Which is the correct command to use for segmentation?

  2. If I want to train for segmentation for Devanagari images, what is the process to follow?

Shreeshrii commented 4 years ago

Is it possible to support tesseract-generated ALTO files as input for segmentation training? I am attaching a sample file.

p015.xml.txt

mittagessen commented 4 years ago
  1. Which is the correct command to use for segmentation?

There was an auto-switch regression where the new segmenter wasn't selected despite the warning to the contrary. I fixed it a couple of days ago in the trainable segmenter branch with region support, but that isn't ready for merging yet. Just using the -bl switch to explicitly select the new segmenter is correct.

The blla.mlmodel you used was trained on an older state of the code and isn't compatible anymore. There will be another deprecation of existing models with the merging of the regions branch, necessitated by the explicit coding of line direction in there.

I've got some models trained on a larger dataset with augmentation somewhere. Let me dig them up.

  2. If I want to train for segmentation for Devanagari images, what is the process to follow?

The simplest way is to have a bunch of PageXML files or ALTOs with baseline information (baselines are scheduled for inclusion in the standard with the next revision; kraken/escriptorium output is compatible already, everything else most likely not). Then you just point ketos segtrain at them and wait for a while (a long while without a GPU):

ketos segtrain -f page -N 100 -q dumb --augment -o seg_model *.xml

or for ALTO:

ketos segtrain -f alto -N 100 -q dumb --augment -o seg_model *.xml

There's also a legacy path format which is just a JSON file with a list of polylines.

MCC values of around 0.7 or higher on the validation set are decent. They do not translate directly into actual segmentation accuracy; the MCC figures are quite a bit lower.
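To make the MCC remark more concrete, here is a minimal sketch of a pixel-wise Matthews correlation coefficient between a predicted and a ground-truth label mask. How ketos computes its reported value is not stated in this thread, so treating it as a pixel-wise comparison is an assumption, and `mcc` and `band` are hypothetical helper names used only for illustration. The toy example shows a 3-pixel-thick baseline band predicted one pixel too low: the line is clearly found, yet the score drops well below 1.

```python
import math


def mcc(pred, truth):
    """Pixel-wise Matthews correlation coefficient of two binary masks.

    Assumption: both masks are lists of rows with truthy values for
    'baseline' pixels. This is an illustration, not kraken's own code.
    """
    tp = fp = tn = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def band(height, width, top, thickness):
    """Hypothetical helper: a mask with one horizontal band of 'ink'."""
    return [[1 if top <= y < top + thickness else 0 for _ in range(width)]
            for y in range(height)]


truth = band(10, 20, top=2, thickness=3)
pred = band(10, 20, top=3, thickness=3)   # same band, one pixel lower

print(round(mcc(truth, truth), 3))  # 1.0
print(round(mcc(pred, truth), 3))   # 0.524
```

A small vertical offset that would not matter for line detection already costs almost half of the score, which is one plausible reading of why the validation MCC sits below the perceived segmentation quality.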

You use the output models as you've used the previous ones.

For your tesseract output: The ALTO is not compatible (no baselines) but I did a short test with semi-manually harvested training data of random archive.org documents using tesseract's hocr output (its segmenter is overall worse than the old kraken segmenter but its noise level is lower so there are more 'perfect' pages). There's a script converting hocr to the path format at [0]. You might want to use it to quickly produce training data. Be aware though that tesseract's baseline estimations can be quite a bit off; they are frequently placed 3-5 pixels below the actual one which can cause polygonization errors.

[0] http://br.unchti.me/convert_hocr.py

Shreeshrii commented 4 years ago

I've got some models trained on a larger dataset with augmentation somewhere. Let me dig them up.

Thanks, that will be great.

There's a script converting hocr to the path format at [0]. You might want to use it to quickly produce training data. Be aware though that tesseract's baseline estimations can be quite a bit off; they are frequently placed 3-5 pixels below the actual one which can cause polygonization errors.

This will be helpful in trying to build a segmenter based on Devanagari. Is there a way to change the script to handle the error in tesseract's baseline calculation?

Questions: How many page images should I use for segtrain?

Can I use offsplit.mlmodel as base to continue from?

mittagessen commented 4 years ago

I haven't looked into how that error happens or if it is systematic and therefore fixable. That whole thing was an experiment of a few hours to see if I can bootstrap good enough training data by just rejecting all erroneous segmentation output of an existing OCR engine. The easiest way would be to simply adjust the baseline upwards/downwards by a few pixels, depending on whether tesseract places it on the actual top Devanagari baseline or at the bottom. From our experiments with Hebrew both work and kraken doesn't really care where exactly the 'baseline' is; being somewhere between the actual base- and mean line is sufficient to get robust polygonization.
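The "adjust the baseline by a few pixels" idea could be sketched as below. This assumes the data has the same shape as the baseline JSON printed earlier in this thread (a `lines` list whose entries carry a `baseline` polyline of `[x, y]` points); whether the legacy path format consumed by ketos segtrain uses exactly these keys is not confirmed here, and `shift_baselines` is a hypothetical helper name.

```python
import copy


def shift_baselines(segmentation, dy):
    """Return a copy of `segmentation` with every baseline moved by `dy` pixels.

    Image coordinates grow downwards, so a negative `dy` moves the
    baseline up and a positive `dy` moves it down. Boundaries are left
    untouched; only the 'baseline' polylines are changed.
    """
    shifted = copy.deepcopy(segmentation)
    for line in shifted.get("lines", []):
        line["baseline"] = [[x, y + dy] for x, y in line["baseline"]]
    return shifted


# Excerpt shaped like the output shown above (first line only, boundary trimmed).
seg = {
    "text_direction": "horizontal-lr",
    "type": "baselines",
    "lines": [
        {"script": "default",
         "baseline": [[852, 326], [1222, 323], [1364, 332]],
         "boundary": [[852, 326], [858, 268], [1364, 332], [855, 392]]},
    ],
}

# Move baselines 4 pixels up, e.g. to compensate for estimates placed too low.
corrected = shift_baselines(seg, dy=-4)
print(corrected["lines"][0]["baseline"])  # [[852, 322], [1222, 319], [1364, 328]]
```

The offset itself (here 4 pixels) is a guess inside the 3-5 pixel range mentioned above and would need checking against a few pages by eye.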

How many page images should I use for segtrain?

Anywhere between 50 and 400 seems to produce state of the art or slightly better results.

Can I use offsplit.mlmodel as base to continue from?

Yes. Same syntax as with ketos train, just input an existing model.

Shreeshrii commented 4 years ago

There will be another deprecation of existing models with the merging of the regions branch, necessitated by the explicit coding of line direction in there.

Does that mean that the Devanagari model I trained on line images cannot be used with the new development version of kraken?

The new segmenter fixes all that by being trainable but it works fundamentally differently and the ketos transcribe/ketos extract workflow won't be adapted for it.

Will there be another way to review/correct the page level ground truth?

mittagessen commented 4 years ago

Does that mean that the Devanagari model I trained on line images cannot be used with the new development version of kraken?

Recognition models trained on bounding box data do not work with baseline segmenter output. They are still supported but you're stuck with the old segmenter (and if you try anyway you get a scary warning and crap output). I've been looking into semi-supervised transfer learning and there might be an avenue to use that for adaptation without having to create new training data but it isn't a particularly high priority.

Segmentation models trained on blla won't work anymore once I merge the region detection code in blla_regions back. As we mostly know everybody who trained one up to now and the number is small there isn't a particular need to preserve backward compatibility with a development branch.

Will there be another way to review/correct the page level ground truth?

escriptorium is what we use for that but it is much more (digitization platform with annotation support). There probably won't be a replacement in kraken directly as nobody here wants to do it with a working alternative. If somebody else sends a pull request it can get merged though.

Shreeshrii commented 4 years ago

Thanks! I will wait for the new release.

wrznr commented 4 years ago

The simplest way is to have a bunch of PageXML files or ALTOs with baseline information

Many available GT sets do not contain baseline information but rather polygonal line shapes. Is it possible to somehow generate baseline information from polygons? I.e. the reversal of the baseline to polygon transformation which is performed after the baseline detection.

mittagessen commented 4 years ago

Many available GT sets do not contain baseline information but rather polygonal line shapes. Is it possible to somehow generate baseline information from polygons? I.e. the reversal of the baseline to polygon transformation which is performed after the baseline detection.

Unfortunately not yet. On relatively clean writing or print you could probably do a fairly reliable estimation with some filtering along the lines of what the CenterNormalizer in the old box processing pipeline does, but nobody has tried that as far as I know.

dstoekl commented 4 years ago

I did the following: Gaussian filter, find the best rotation angle (the one with the greatest horizontal projection maximum between two angles, e.g. -3 and +3), then take the two points at the coordinates of the profile maximum at both ends. Such baselines would always be straight, of course.
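A rough, dependency-free sketch of that procedure follows. It is one interpretation of the description, not dstoekl's actual code: the Gaussian smoothing is applied to the 1D projection profile instead of the 2D image (a simplification), the angle range and step are assumptions, and `estimate_straight_baseline` is a hypothetical name. The input is a binary mask of a single line, e.g. rasterized from a ground-truth polygon.

```python
import math


def smooth(profile, sigma=1.0):
    """Smooth a 1D profile with a normalized Gaussian kernel (zero padding)."""
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-(i * i) / (2 * sigma * sigma))
              for i in range(-radius, radius + 1)]
    total = sum(kernel)
    out = []
    for i in range(len(profile)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(profile):
                acc += w * profile[j]
        out.append(acc / total)
    return out


def estimate_straight_baseline(mask, angles=None, sigma=1.0):
    """Estimate a straight baseline ((x0, y0), (x1, y1)) for one line mask."""
    if angles is None:
        angles = [a / 2 for a in range(-6, 7)]  # -3.0 .. +3.0 degrees (assumed)
    ink = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not ink:
        raise ValueError("mask contains no ink pixels")
    best = None
    # Try small angles first so that ties favour the least rotation.
    for deg in sorted(angles, key=abs):
        a = math.radians(deg)
        # Row of each ink pixel after deskewing by angle a.
        rows = [round(y * math.cos(a) - x * math.sin(a)) for x, y in ink]
        lo = min(rows)
        profile = [0] * (max(rows) - lo + 1)
        for r in rows:
            profile[r - lo] += 1
        smoothed = smooth(profile, sigma)
        peak = max(range(len(smoothed)), key=smoothed.__getitem__)
        if best is None or smoothed[peak] > best[0]:
            best = (smoothed[peak], a, peak + lo)
    _, a, row = best
    x0 = min(x for x, _ in ink)
    x1 = max(x for x, _ in ink)

    def to_y(x):
        # Invert the deskewing rotation for the peak row.
        return (row + x * math.sin(a)) / math.cos(a)

    return (x0, to_y(x0)), (x1, to_y(x1))


# Toy input: an 11 x 200 mask with a single horizontal stroke at row 5.
mask = [[1 if y == 5 else 0 for _ in range(200)] for y in range(11)]
print(estimate_straight_baseline(mask))  # ((0, 5.0), (199, 5.0))
```

Note that the profile maximum marks the densest row of the line, which for Devanagari would tend to be the headline, so a constant vertical offset may still be needed afterwards.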