jasag / Phytoliths-recognition-system

Phytoliths recognition system and labeling tool built in python
BSD 3-Clause "New" or "Revised" License

Train the classifier with the relabeled phytoliths #68

Closed (jasag closed 7 years ago)

alvarag commented 7 years ago

It has been training all weekend, about 55 hours by my estimate. I have stopped it now; tell me which files you need in order to check whether the network has converged.

Finish 219 epoch(es)
step 658 - loss 1.20035982131958 - moving ave loss 1.2626463216835104
step 659 - loss 1.4644320011138916 - moving ave loss 1.2828248896265486
step 660 - loss 1.2036800384521484 - moving ave loss 1.2749104045091086
Finish 220 epoch(es)
step 661 - loss 1.1344131231307983 - moving ave loss 1.2608606763712775
step 662 - loss 1.3718047142028809 - moving ave loss 1.2719550801544377

darkflow.txt

jasag commented 7 years ago

I don't know whether it has converged yet, since the error may not have stabilized and could still be going down. But the progression of the error looks quite good. On the other hand, the last checkpoint, that is, the network state we can resume from, is 12 epochs back:

Finish 207 epoch(es)
step 622 - loss 1.8270580768585205 - moving ave loss 1.7482966814331986
step 623 - loss 2.1752519607543945 - moving ave loss 1.790992209365318
step 624 - loss 1.4568427801132202 - moving ave loss 1.7575772664401084
Finish 208 epoch(es)
step 625 - loss 1.6025184392929077 - moving ave loss 1.7420713837253883
Checkpoint at step 625
step 626 - loss 1.7589023113250732 - moving ave loss 1.743754476485357
step 627 - loss 1.3427557945251465 - moving ave loss 1.7036546082893358
Finish 209 epoch(es)
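
Figures like these can be pulled out of the training output instead of being read by hand. The following is a minimal sketch, assuming the log keeps the three line formats shown above (`step N - loss X - moving ave loss Y`, `Finish N epoch(es)` and `Checkpoint at step N`). The function name `summarize_log` is hypothetical and is not part of darkflow.

```python
import re

# Line formats as they appear in the training output quoted in this thread.
STEP_RE = re.compile(r"step (\d+) - loss ([\d.]+) - moving ave loss ([\d.]+)")
EPOCH_RE = re.compile(r"Finish (\d+) epoch\(es\)")
CKPT_RE = re.compile(r"Checkpoint at step (\d+)")


def summarize_log(text):
    """Report the last step, its moving-average loss, the last checkpoint
    step, and how many epochs finished after that checkpoint."""
    last_step = None
    last_mva = None
    last_ckpt = None
    epochs_since_ckpt = 0
    for line in text.splitlines():
        line = line.strip()
        step = STEP_RE.match(line)
        if step:
            last_step = int(step.group(1))
            last_mva = float(step.group(3))
            continue
        ckpt = CKPT_RE.match(line)
        if ckpt:
            last_ckpt = int(ckpt.group(1))
            epochs_since_ckpt = 0  # restart the count at every checkpoint
            continue
        if EPOCH_RE.match(line):
            epochs_since_ckpt += 1
    return {
        "last_step": last_step,
        "last_moving_ave_loss": last_mva,
        "last_checkpoint": last_ckpt,
        # Only meaningful when a checkpoint was actually seen.
        "epochs_since_checkpoint": epochs_since_ckpt if last_ckpt is not None else None,
    }


# Excerpt copied from the log above.
SAMPLE = """\
Finish 208 epoch(es)
step 625 - loss 1.6025184392929077 - moving ave loss 1.7420713837253883
Checkpoint at step 625
step 626 - loss 1.7589023113250732 - moving ave loss 1.743754476485357
step 627 - loss 1.3427557945251465 - moving ave loss 1.7036546082893358
Finish 209 epoch(es)
"""

summary = summarize_log(SAMPLE)
print(summary)
```

Run on the full darkflow.txt, the same function should report how far the last checkpoint lags behind the end of training.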

Even so, the only way to check whether it has converged is to try predicting on an image. To do that, you can open a Python session and type:

from darkflow.net.build import TFNet
import cv2

options = {"model": "cfg/yolo-1c.cfg", "load": -1, "threshold": 0.0}

tfnet = TFNet(options)

imgcv = cv2.imread("./sample_img/dog.jpg")  # Replace with the path to your image
result = tfnet.return_predict(imgcv)
print(result)
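
With `threshold` set to 0.0 the call returns every candidate box, so it can help to sort and filter the result by confidence before inspecting it. A minimal sketch, assuming `return_predict` yields a list of dictionaries with `label`, `confidence`, `topleft` and `bottomright` keys, as described in the darkflow README. The helper `filter_predictions`, the `phytolith` label and the sample values are hypothetical.

```python
def filter_predictions(predictions, min_confidence=0.3):
    """Keep boxes at or above min_confidence, highest confidence first."""
    kept = [p for p in predictions if p["confidence"] >= min_confidence]
    return sorted(kept, key=lambda p: p["confidence"], reverse=True)


# Made-up predictions in the assumed format (not real output from the network).
sample = [
    {"label": "phytolith", "confidence": 0.12,
     "topleft": {"x": 10, "y": 20}, "bottomright": {"x": 60, "y": 80}},
    {"label": "phytolith", "confidence": 0.71,
     "topleft": {"x": 100, "y": 40}, "bottomright": {"x": 180, "y": 130}},
]

for box in filter_predictions(sample):
    print(box["label"], box["confidence"], box["topleft"], box["bottomright"])
```

In a real session, `sample` would be replaced by the `result` list from the call above.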

alvarag commented 7 years ago

The error has gone down after last night's training, but I don't know whether it is enough for the network to have converged.

Dataset of 55 instance(s)
Training statistics: 
    Learning rate : 0.001
    Batch size    : 16
    Epoch number  : 1000
    Backup every  : 2000
step 626 - loss 1.6676709651947021 - moving ave loss 1.6676709651947024
step 627 - loss 1.1484426259994507 - moving ave loss 1.6157481312751771
step 628 - loss 1.7458970546722412 - moving ave loss 1.6287630236148836
Finish 1 epoch(es)
step 629 - loss 1.1736420392990112 - moving ave loss 1.5832509251832965
step 630 - loss 1.2588083744049072 - moving ave loss 1.5508066701054575
step 631 - loss 1.138228178024292 - moving ave loss 1.5095488208973409
Finish 2 epoch(es)
step 632 - loss 1.2860959768295288 - moving ave loss 1.4872035364905598
step 633 - loss 1.176522135734558 - moving ave loss 1.4561353964149597
step 634 - loss 1.5030412673950195 - moving ave loss 1.4608259835129658
Finish 3 epoch(es)
step 635 - loss 1.0499980449676514 - moving ave loss 1.4197431896584343
step 636 - loss 1.3721592426300049 - moving ave loss 1.4149847949555914
step 637 - loss 1.463003158569336 - moving ave loss 1.4197866313169658
Finish 4 epoch(es)
step 638 - loss 1.4143834114074707 - moving ave loss 1.4192463093260164
step 639 - loss 1.5967729091644287 - moving ave loss 1.4369989693098577
step 640 - loss 1.020228385925293 - moving ave loss 1.3953219109714012
Finish 5 epoch(es)
step 641 - loss 0.9868143796920776 - moving ave loss 1.354471157843469
step 642 - loss 1.3339840173721313 - moving ave loss 1.3524224437963352
step 643 - loss 1.3491945266723633 - moving ave loss 1.3520996520839381
Finish 6 epoch(es)
step 644 - loss 1.2952287197113037 - moving ave loss 1.3464125588466747
step 645 - loss 1.3464546203613281 - moving ave loss 1.34641676499814
step 646 - loss 0.988682746887207 - moving ave loss 1.3106433631870467
Finish 7 epoch(es)
step 647 - loss 1.2891584634780884 - moving ave loss 1.3084948732161508
step 648 - loss 1.3570483922958374 - moving ave loss 1.3133502251241194
step 649 - loss 1.4354681968688965 - moving ave loss 1.325562022298597
Finish 8 epoch(es)
step 650 - loss 1.1406971216201782 - moving ave loss 1.3070755322307552
step 651 - loss 0.9414606690406799 - moving ave loss 1.2705140459117477
step 652 - loss 1.1589103937149048 - moving ave loss 1.2593536806920633
Finish 9 epoch(es)
step 653 - loss 1.2031090259552002 - moving ave loss 1.253729215218377
step 654 - loss 0.778154730796814 - moving ave loss 1.2061717667762206
step 655 - loss 1.436328649520874 - moving ave loss 1.229187455050686
Finish 10 epoch(es)
step 656 - loss 1.1868335008621216 - moving ave loss 1.2249520596318295
step 657 - loss 0.8978499174118042 - moving ave loss 1.192241845409827
step 658 - loss 1.1208245754241943 - moving ave loss 1.1851001184112637
Finish 11 epoch(es)
step 659 - loss 1.4982023239135742 - moving ave loss 1.2164103389614946
step 660 - loss 1.100494384765625 - moving ave loss 1.2048187435419078
step 661 - loss 0.8753302097320557 - moving ave loss 1.1718698901609226
Finish 12 epoch(es)
step 662 - loss 1.0556315183639526 - moving ave loss 1.1602460529812257
step 663 - loss 1.1884920597076416 - moving ave loss 1.1630706536538673
step 664 - loss 1.7889864444732666 - moving ave loss 1.2256622327358073
Finish 13 epoch(es)
step 665 - loss 0.9728714227676392 - moving ave loss 1.2003831517389905
step 666 - loss 0.8856911659240723 - moving ave loss 1.1689139531574986
step 667 - loss 1.8094885349273682 - moving ave loss 1.2329714113344856
Finish 14 epoch(es)
step 668 - loss 1.1515730619430542 - moving ave loss 1.2248315763953426
step 669 - loss 1.2051117420196533 - moving ave loss 1.2228595929577737
step 670 - loss 1.1600005626678467 - moving ave loss 1.2165736899287811
Finish 15 epoch(es)
step 671 - loss 1.1125149726867676 - moving ave loss 1.2061678182045796
step 672 - loss 0.7482483386993408 - moving ave loss 1.1603758702540556
step 673 - loss 1.2837648391723633 - moving ave loss 1.1727147671458864
Finish 16 epoch(es)
step 674 - loss 1.4482853412628174 - moving ave loss 1.2002718245575796
step 675 - loss 1.4650962352752686 - moving ave loss 1.2267542656293486
step 676 - loss 0.8582463264465332 - moving ave loss 1.189903471711067
Finish 17 epoch(es)
step 677 - loss 0.9937052726745605 - moving ave loss 1.1702836518074164
step 678 - loss 1.317679762840271 - moving ave loss 1.185023262910702
step 679 - loss 1.4518811702728271 - moving ave loss 1.2117090536469144
Finish 18 epoch(es)
step 680 - loss 1.2353205680847168 - moving ave loss 1.2140702050906946
step 681 - loss 1.1084346771240234 - moving ave loss 1.2035066522940276
step 682 - loss 0.6354346871376038 - moving ave loss 1.1466994557783854
Finish 19 epoch(es)
step 683 - loss 1.0811960697174072 - moving ave loss 1.1401491171722875
step 684 - loss 1.1103581190109253 - moving ave loss 1.1371700173561514
step 685 - loss 1.0451881885528564 - moving ave loss 1.127971834475822
Finish 20 epoch(es)
step 686 - loss 1.0672621726989746 - moving ave loss 1.1219008682981373
step 687 - loss 0.8446641564369202 - moving ave loss 1.0941771971120156
step 688 - loss 1.2623698711395264 - moving ave loss 1.1109964645147667
Finish 21 epoch(es)
step 689 - loss 1.1555347442626953 - moving ave loss 1.1154502924895597
step 690 - loss 0.9396370053291321 - moving ave loss 1.097868963773517
step 691 - loss 0.9744800329208374 - moving ave loss 1.0855300706882491
Finish 22 epoch(es)
step 692 - loss 0.918732762336731 - moving ave loss 1.0688503398530973
step 693 - loss 1.8871550559997559 - moving ave loss 1.1506808114677631
step 694 - loss 1.3419896364212036 - moving ave loss 1.1698116939631071
Finish 23 epoch(es)
step 695 - loss 1.569258689880371 - moving ave loss 1.2097563935548334
step 696 - loss 1.5636464357376099 - moving ave loss 1.2451453977731113
step 697 - loss 1.0076872110366821 - moving ave loss 1.2213995790994685
Finish 24 epoch(es)
step 698 - loss 0.6653344035148621 - moving ave loss 1.1657930615410077
step 699 - loss 0.9416245818138123 - moving ave loss 1.1433762135682881
step 700 - loss 1.1725890636444092 - moving ave loss 1.1462974985759002
Finish 25 epoch(es)
step 701 - loss 1.0734314918518066 - moving ave loss 1.139010897903491
step 702 - loss 1.3130245208740234 - moving ave loss 1.1564122602005444
step 703 - loss 1.3894544839859009 - moving ave loss 1.1797164825790802
Finish 26 epoch(es)
step 704 - loss 0.9773670434951782 - moving ave loss 1.15948153867069
step 705 - loss 1.2913107872009277 - moving ave loss 1.1726644635237138
step 706 - loss 0.8727515935897827 - moving ave loss 1.1426731765303206
Finish 27 epoch(es)
step 707 - loss 1.3229126930236816 - moving ave loss 1.1606971281796568
step 708 - loss 0.7686212062835693 - moving ave loss 1.1214895359900479
step 709 - loss 1.1714439392089844 - moving ave loss 1.1264849763119416
Finish 28 epoch(es)
step 710 - loss 0.8261069059371948 - moving ave loss 1.0964471692744668
step 711 - loss 1.3161640167236328 - moving ave loss 1.1184188540193833
step 712 - loss 0.9435366988182068 - moving ave loss 1.1009306384992656
Finish 29 epoch(es)
step 713 - loss 1.3426152467727661 - moving ave loss 1.1250990993266157
step 714 - loss 0.6713830232620239 - moving ave loss 1.0797274917201565
step 715 - loss 1.419126033782959 - moving ave loss 1.1136673459264368
Finish 30 epoch(es)
step 716 - loss 1.1171205043792725 - moving ave loss 1.1140126617717203
step 717 - loss 1.0077308416366577 - moving ave loss 1.1033844797582142
step 718 - loss 0.9660260677337646 - moving ave loss 1.0896486385557693
Finish 31 epoch(es)
step 719 - loss 1.3473656177520752 - moving ave loss 1.1154203364753998
step 720 - loss 1.0774872303009033 - moving ave loss 1.1116270258579504
step 721 - loss 0.9902434945106506 - moving ave loss 1.0994886727232203
Finish 32 epoch(es)
step 722 - loss 1.0409586429595947 - moving ave loss 1.0936356697468577
step 723 - loss 1.0470255613327026 - moving ave loss 1.088974658905442
step 724 - loss 1.011224389076233 - moving ave loss 1.0811996319225212
Finish 33 epoch(es)
step 725 - loss 0.9826191067695618 - moving ave loss 1.0713415794072252
step 726 - loss 0.8237358331680298 - moving ave loss 1.0465810047833057
step 727 - loss 1.0767254829406738 - moving ave loss 1.0495954525990425
Finish 34 epoch(es)
step 728 - loss 1.0129140615463257 - moving ave loss 1.0459273134937708
step 729 - loss 0.8685472011566162 - moving ave loss 1.0281893022600552
step 730 - loss 1.2821836471557617 - moving ave loss 1.053588736749626
Finish 35 epoch(es)
step 731 - loss 1.1503524780273438 - moving ave loss 1.0632651108773978
step 732 - loss 1.0340855121612549 - moving ave loss 1.0603471510057836
step 733 - loss 0.9380618333816528 - moving ave loss 1.0481186192433705
Finish 36 epoch(es)
step 734 - loss 1.0509109497070312 - moving ave loss 1.0483978522897366
step 735 - loss 0.9354916214942932 - moving ave loss 1.0371072292101924
step 736 - loss 0.8736528158187866 - moving ave loss 1.0207617878710518
Finish 37 epoch(es)
step 737 - loss 1.2723932266235352 - moving ave loss 1.0459249317463002
step 738 - loss 1.0912797451019287 - moving ave loss 1.050460413081863
step 739 - loss 0.5038421154022217 - moving ave loss 0.995798583313899
Finish 38 epoch(es)
step 740 - loss 0.9432724118232727 - moving ave loss 0.9905459661648364
step 741 - loss 0.8977385759353638 - moving ave loss 0.9812652271418891
step 742 - loss 1.8708975315093994 - moving ave loss 1.0702284575786403
Finish 39 epoch(es)
step 743 - loss 0.6632291078567505 - moving ave loss 1.0295285226064512
step 744 - loss 1.280716896057129 - moving ave loss 1.054647359951519
step 745 - loss 1.2218983173370361 - moving ave loss 1.0713724556900708
Finish 40 epoch(es)
step 746 - loss 0.8472650647163391 - moving ave loss 1.0489617165926977
step 747 - loss 1.1524229049682617 - moving ave loss 1.059307835430254
step 748 - loss 1.3751821517944336 - moving ave loss 1.090895267066672
Finish 41 epoch(es)
step 749 - loss 1.1916924715042114 - moving ave loss 1.100974987510426
step 750 - loss 0.9863177537918091 - moving ave loss 1.0895092641385644
Checkpoint at step 750
step 751 - loss 0.9178691506385803 - moving ave loss 1.072345252788566
Finish 42 epoch(es)
step 752 - loss 0.8660867214202881 - moving ave loss 1.0517193996517382
step 753 - loss 0.9307570457458496 - moving ave loss 1.0396231642611493
step 754 - loss 0.7467097640037537 - moving ave loss 1.0103318242354098
Finish 43 epoch(es)
step 755 - loss 1.4414421319961548 - moving ave loss 1.0534428550114843
step 756 - loss 1.073330044746399 - moving ave loss 1.0554315739849758
step 757 - loss 0.9791052937507629 - moving ave loss 1.0477989459615547
Finish 44 epoch(es)
step 758 - loss 1.067305326461792 - moving ave loss 1.0497495840115785
step 759 - loss 1.0882422924041748 - moving ave loss 1.0535988548508382
step 760 - loss 1.1130471229553223 - moving ave loss 1.0595436816612867
Finish 45 epoch(es)
step 761 - loss 0.9314154982566833 - moving ave loss 1.0467308633208263
step 762 - loss 1.0868018865585327 - moving ave loss 1.050737965644597
step 763 - loss 0.7653579711914062 - moving ave loss 1.022199966199278
Finish 46 epoch(es)
step 764 - loss 1.1119511127471924 - moving ave loss 1.0311750808540694
step 765 - loss 1.0932735204696655 - moving ave loss 1.037384924815629
step 766 - loss 0.8972927331924438 - moving ave loss 1.0233757056533106
Finish 47 epoch(es)
Terminado

I have attached the latest checkpoint in Dropbox.
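
The `moving ave loss` column in this log appears to be an exponential moving average with weights 0.9 and 0.1: the first value equals the first loss, and each later value matches 0.9 times the previous average plus 0.1 times the current loss. This is inferred from the numbers above, not taken from the darkflow documentation. A short sketch that reproduces the first three values of the run (steps 626 to 628), with the hypothetical helper `moving_average`:

```python
def moving_average(losses, decay=0.9):
    """Exponential moving average seeded with the first loss.

    The 0.9 decay is an assumption that happens to match the logged values.
    """
    averages = []
    current = None
    for loss in losses:
        if current is None:
            current = loss
        else:
            current = decay * current + (1 - decay) * loss
        averages.append(current)
    return averages


# Losses and moving averages copied from steps 626, 627 and 628 above.
losses = [1.6676709651947021, 1.1484426259994507, 1.7458970546722412]
logged = [1.6676709651947024, 1.6157481312751771, 1.6287630236148836]

for computed, expected in zip(moving_average(losses), logged):
    print(f"{computed:.6f}  (log: {expected:.6f})")
```

Because each new loss only contributes a tenth of its value, a single noisy step such as 693 (loss 1.887) moves the average by less than 0.1, which is why the averaged column is the more useful one for judging the trend.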

alvarag commented 7 years ago

Since it only saved one checkpoint in about 12 hours, I think we should probably back up more often. Which option controls that? Backing up every 1000 might be more advisable.

jasag commented 7 years ago

Yes, or even more often. It depends on how much free storage you have on your computer and on how often an epoch completes during training, or at least that is how I work it out.

Say you want one backup per hour and your computer can complete 60 epochs per hour. Since we are training with 55 images, each epoch uses all 55 images. The --save option specifies how many processed images elapse between backups. So you would set it to 60*55 = 3300.
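
The arithmetic above can be written out as a small helper. This is a sketch, not darkflow code: the function names are hypothetical, and the conversion from `--save` to training steps (`save // batch`) is an assumption inferred from the logs in this thread, where a backup setting of 2000 with batch size 16 produced checkpoints at steps 625 and 750, which are 125 steps apart.

```python
def save_interval(epochs_per_hour, dataset_size, hours_between_backups=1.0):
    """Value for --save: number of training images processed between backups."""
    return int(epochs_per_hour * dataset_size * hours_between_backups)


def steps_between_checkpoints(save, batch_size):
    """Assumed conversion from processed images to training steps."""
    return save // batch_size


print(save_interval(60, 55))                # 60 * 55 = 3300
print(steps_between_checkpoints(2000, 16))  # 125, matching steps 625 and 750
```

The epochs-per-hour figure has to be measured on the machine doing the training; 60 is only the example rate used above.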

alvarag commented 7 years ago

I'll launch it again later. Was there any improvement from last night's training?

jasag commented 7 years ago

Yes. I am trying to train the tiny version, and in a single night it reached an error of around 0.1. But it still cannot produce labels. Bear in mind that I have run about 700 epochs, so I will keep retraining that model over the next few days to try to bring the error down to a value close to zero.

I don't have much confidence in the tiny version, but it is the only way I can reach a more significant number of epochs.

jasag commented 7 years ago

Tomorrow I'll try to attach part of the training output here, to make it a bit more visible.

alvarag commented 7 years ago

I waited until it completed the checkpoint:

Finish 49 epoch(es)
step 898 - loss 1.3402414321899414 - moving ave loss 0.9211810019262008
step 899 - loss 0.7466658353805542 - moving ave loss 0.9037294852716362
step 900 - loss 0.4975850284099579 - moving ave loss 0.8631150395854684
Finish 50 epoch(es)
step 901 - loss 0.7326613664627075 - moving ave loss 0.8500696722731924
step 902 - loss 1.0721814632415771 - moving ave loss 0.8722808513700309
step 903 - loss 0.7741212844848633 - moving ave loss 0.8624648946815141
Finish 51 epoch(es)
step 904 - loss 1.0873507261276245 - moving ave loss 0.8849534778261252
step 905 - loss 0.959801971912384 - moving ave loss 0.8924383272347511
Checkpoint at step 905

I'm leaving it in Dropbox.