Unexpected accuracy -100.0%

stweil commented 1 year ago

Several newly trained models show an accuracy of -100.0% in eScriptorium. The value appears to come from unexpected user metadata in the model file. Manual test:

(venv3.11) stweil@notebook11 kraken % python
Python 3.11.3 (main, Apr  7 2023, 20:13:31) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from kraken.lib import vgsl
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Torch version 2.1.0.dev20230511 has not been tested with coremltools. You may run into unexpected errors. Torch 2.0.0 is the most recent version that has been tested.
>>> model = vgsl.TorchVGSLModel.load_model('german_handwriting_5.mlmodel')
>>> print(model.user_metadata.get('accuracy'))
[[53882, 0.9601333737373352], [107764, 0.9726133346557617], [161646, 0.9768959879875183], [215528, 0.9798051714897156], [269410, 0.9799874424934387], [323292, 0.9827965497970581], [377174, 0.9840622544288635], [431056, 0.983633816242218], [484938, 0.9844568371772766], [538820, 0.9841548800468445], [592702, 0.9763985872268677], [646584, 0.9846234321594238], [700466, 0.984947919845581], [754348, 0.9845662713050842], [808230, 0.984584391117096], [862112, 0.9842845797538757], [915994, 0.9849886298179626], [969876, 0.9855098724365234], [1023758, 0.9843036532402039], [1077640, 0.9851313233375549], [1131522, 0.9844676852226257], [1185404, 0.985547661781311], [1239286, 0.9851428270339966], [1293168, 0.9853638410568237], [1347050, 0.9848154187202454], [1400932, 0.9842344522476196], [1454814, 0.9849626421928406], [1508696, 0.9854357838630676], [1562578, 0.9855362772941589], [1616460, 0.985099196434021], [1670342, 0.985232412815094], [1724224, 0.9847443103790283], [1778106, 0.9846210479736328], [1831988, 0.9856811165809631], [1885870, 0.9852061867713928], [1939752, 0.9852434992790222], [1993634, 0.9850150346755981], [2047516, 0.984273374080658], [2101398, 0.9845644235610962], [2155280, 0.9852415323257446], [2209162, 0.9855249524116516], [2263044, 0.9852496385574341], [2316926, 0.9856259822845459], [2370808, 0.9850746989250183], [2424690, 0.9851294755935669], [2478572, 0.9850216507911682], [2532454, 0.9849865436553955], [2586336, 0.9848904013633728], [2640218, 0.985207736492157], [2694100, 0.9849650859832764], [2747982, 0.9851065874099731], [2801864, 0.9843791723251343], [2855746, 0.9858119487762451], [2630, 0.0], [5260, 0.0], [7890, 0.945189356803894], [10520, 0.9733895659446716], [13150, 0.9820293188095093], [15780, 0.9845866560935974], [18410, 0.9870749115943909], [21040, 0.987489640712738], [23670, 0.9883190393447876], [26300, 0.9888028502464294], [28930, 0.9891484379768372], [31560, 0.9899778962135315], [34190, 0.9899778962135315], [36820, 0.9901161193847656], [39450, 
0.9894249439239502], [42080, 0.9899778962135315], [44710, 0.9901161193847656], [47340, 0.9910838007926941], [61426, 0.9903409481048584], [122852, 0.9919184446334839], [184278, 0.992790699005127], [245704, 0.9931191205978394], [307130, 0.9933667778968811], [368556, 0.9934529066085815], [429982, 0.9938082695007324], [491408, 0.9938352108001709], [552834, 0.9935013651847839], [614260, 0.9940074682235718], [675686, 0.9941797852516174], [737112, 0.9941205382347107], [798538, 0.9942767024040222], [859964, 0.9944705367088318], [921390, 0.9943789839744568], [982816, 0.9941205382347107], [1044242, 0.9941797852516174], [1105668, 0.9943466782569885], [1167094, 0.9944866895675659], [1228520, 0.9944005608558655], [1289946, 0.9943897724151611], [1351372, 0.9939051866531372], [1412798, 0.994524359703064], [1474224, 0.9945674538612366], [1535650, 0.9945351481437683], [1597076, 0.994562029838562], [1658502, 0.9945674538612366], [1719928, 0.9945458769798279], [1781354, 0.9945351481437683], [1842780, 0.9945189952850342], [1904206, 0.9945135712623596], [1965632, 0.9945082068443298], [2027058, 0.9941366910934448], [2088484, 0.9942067265510559], [2149910, 0.9944543838500977], [2211336, 0.994615912437439], [2272762, 0.9945135712623596], [2334188, 0.994653582572937], [2395614, 0.9943413138389587], [2457040, 0.994745135307312], [6363, 0.9858980774879456], [12726, 0.9946515560150146], [19089, 0.9957094192504883], [25452, 0.9964703321456909], [31815, 0.996807336807251], [38178, 0.9973708987236023], [44541, 0.9972986578941345], [50904, 0.9973924160003662], [57267, 0.99732506275177], [63630, 0.9974349141120911], [69993, 0.9973911046981812], [76356, 0.9974782466888428], [82719, 0.9975510835647583], [89082, 0.9977064728736877], [95445, 0.9978642463684082], [101808, 0.9979320168495178], [108171, 0.9977737069129944], [114534, 0.9977526068687439], [120897, 0.9976613521575928], [127260, 0.9977280497550964], [133623, 0.9976180195808411], [139986, 0.997887372970581], [146349, 0.9976814985275269], 
[152712, 0.9978420734405518], [159075, 0.9978851079940796], [165438, 0.9980180859565735], [171801, 0.9980669021606445], [178164, 0.997996985912323], [184527, 0.9982700943946838], [22331, -1.0], [44662, -1.0], [66993, -1.0], [89324, -1.0], [111655, -1.0], [133986, -1.0]]
>>> print(model.user_metadata.get('accuracy')[-1])
[133986, -1.0]
>>> print(model.user_metadata.get('accuracy')[-1][1]*100)
-100.0
>>> print(model.user_metadata.get('metrics'))
[[22331, {'train_loss': 166.64639282226562, 'val_accuracy': 0.7586325407028198, 'val_word_accuracy': 0.37506699562072754}], [44662, {'train_loss': 41.28871154785156, 'val_accuracy': 0.8102050423622131, 'val_word_accuracy': 0.4907914996147156}], [66993, {'train_loss': 82.823486328125, 'val_accuracy': 0.8399950861930847, 'val_word_accuracy': 0.5554498434066772}], [89324, {'train_loss': 34.19955062866211, 'val_accuracy': 0.8479354977607727, 'val_word_accuracy': 0.5946530103683472}], [111655, {'train_loss': 66.23241424560547, 'val_accuracy': 0.8676846027374268, 'val_word_accuracy': 0.6040084958076477}], [133986, {'train_loss': 19.33416175842285, 'val_accuracy': 0.8747826814651489, 'val_word_accuracy': 0.6380583047866821}]]
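
The session above suggests that any consumer which takes the last `accuracy` entry and multiplies it by 100 will display -100.0%. Here is a minimal sketch of a more defensive display, assuming -1.0 is a "no value" placeholder and not a real measurement. `format_accuracy` is a hypothetical helper and not part of kraken or eScriptorium:

```python
from typing import Sequence, Tuple


def format_accuracy(history: Sequence[Tuple[int, float]]) -> str:
    """Format the most recent accuracy entry as a percentage.

    Returns "n/a" when the history is empty or the last value lies
    outside [0, 1], e.g. the -1.0 placeholder seen in the session above.
    """
    if not history:
        return "n/a"
    _step, value = history[-1]
    if not 0.0 <= value <= 1.0:
        return "n/a"
    return f"{value * 100:.1f}%"


# Tail of the history printed above (values copied from the session).
history = [[184527, 0.9982700943946838], [22331, -1.0], [133986, -1.0]]

print(history[-1][1] * 100)      # the unguarded computation from the session
print(format_accuracy(history))  # the guarded variant
```

The helper deliberately does not fall back to an earlier valid entry, because the older entries in this list seem to belong to previous training runs and would describe a different model.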

stweil commented 1 year ago

Sorry, I just noticed that issue #440 already reported the same problem.

stweil commented 1 year ago

Related kraken code:

            metric = float(trainer.logged_metrics['val_metric']) if 'val_metric' in trainer.logged_metrics else -1.0
            trainer.model.nn.user_metadata['accuracy'].append((trainer.global_step, metric))
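
The `metrics` entries printed earlier contain `val_accuracy` and `val_word_accuracy` but no `val_metric`, which would explain why the fallback of -1.0 is taken. The following sketch reproduces that lookup with a plain dict standing in for `trainer.logged_metrics`. It assumes the logged keys match those stored in `metrics`, and `record_metric` is a hypothetical name used only for illustration:

```python
# Stand-in for trainer.logged_metrics; values copied from the last
# 'metrics' entry in the session above. Assumption: the trainer logs
# exactly these keys.
logged_metrics = {
    'train_loss': 19.33416175842285,
    'val_accuracy': 0.8747826814651489,
    'val_word_accuracy': 0.6380583047866821,
}


def record_metric(logged: dict, key: str = 'val_metric') -> float:
    """Same lookup pattern as the quoted kraken lines: -1.0 if key is absent."""
    return float(logged[key]) if key in logged else -1.0


print(record_metric(logged_metrics))                      # key missing, falls back
print(record_metric(logged_metrics, key='val_accuracy'))  # key present
```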

stweil commented 1 year ago

I fixed my models using this script: https://ub-backup.bib.uni-mannheim.de/~stweil/tesstrain/kraken/mlmodel.py.
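
The linked script is not reproduced here. As a rough idea of how such a repair could work, this sketch replaces negative `accuracy` entries with the `val_accuracy` recorded for the same step in `metrics`. `repair_accuracy` is a hypothetical function, and loading and saving the model file is left out on purpose:

```python
def repair_accuracy(accuracy, metrics):
    """Return a copy of `accuracy` with negative placeholders replaced.

    accuracy: list of [step, value] pairs (user_metadata['accuracy'])
    metrics:  list of [step, dict] pairs (user_metadata['metrics'])
    Assumption: a negative value means "not recorded".
    """
    # Map each step to the validation accuracy logged for it, if any.
    by_step = {step: m['val_accuracy'] for step, m in metrics
               if 'val_accuracy' in m}
    repaired = []
    for step, value in accuracy:
        if value < 0 and step in by_step:
            value = by_step[step]
        repaired.append([step, value])
    return repaired


# Values copied from the session in the first comment (shortened).
accuracy = [[184527, 0.9982700943946838], [22331, -1.0], [133986, -1.0]]
metrics = [
    [22331, {'val_accuracy': 0.7586325407028198}],
    [133986, {'val_accuracy': 0.8747826814651489}],
]

print(repair_accuracy(accuracy, metrics)[-1])
```

Only negative entries are touched, so valid values from earlier training runs stay as they are.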

mittagessen / kraken

Unexpected accuracy -100.0% #500