nerfstudio-project / nerfstudio

A collaboration friendly studio for NeRFs
https://docs.nerf.studio
Apache License 2.0

Periodically, the training status output columns are disrupted. #1798

Open machenmusik opened 1 year ago

machenmusik commented 1 year ago

Periodically, the training status output columns are disrupted.

Previously, IIRC, the Train Iter and ETA columns would be blank when a single value was emitted. Now we get five columns when there should be only four (see step 8611 below).

Step (% Done)       Train Iter (time)    ETA (time)           Train Rays / Sec
-----------------------------------------------------------------------------------
8590 (28.63%)       60.803 ms            21 m, 41 s           67.81 K
8600 (28.67%)       62.190 ms            22 m, 10 s           66.40 K
8610 (28.70%)       63.692 ms            22 m, 42 s           64.76 K
8611 (28.70%)       1.08 M               64.086 ms            22 m, 50 s           64.40 K
8620 (28.73%)       63.850 ms            22 m, 45 s           64.47 K
8630 (28.77%)       62.291 ms            22 m, 11 s           66.11 K
8640 (28.80%)       62.869 ms            22 m, 22 s           65.59 K
8650 (28.83%)       63.392 ms            22 m, 33 s           64.97 K
8660 (28.87%)       62.935 ms            22 m, 23 s           65.34 K

machenmusik commented 1 year ago

This may be due to the specified order of stats_to_track https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/configs/base_config.py#L86

        writer.EventName.ITER_TRAIN_TIME,
        writer.EventName.TRAIN_RAYS_PER_SEC,
        writer.EventName.CURR_TEST_PSNR,
        writer.EventName.VIS_RAYS_PER_SEC,
        writer.EventName.TEST_RAYS_PER_SEC,
        writer.EventName.ETA,

not matching the displayed column order.
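If the displayed order is meant to follow stats_to_track, one way to keep it deterministic is to sort whatever stats have been seen by their position in that list. The sketch below is only an illustration of that idea: `order_columns` is a hypothetical helper, not part of nerfstudio, and the enum members are written as plain strings.

```python
# Hypothetical sketch; the names here are illustrative, not nerfstudio's API.
from typing import Iterable, List, Sequence


def order_columns(seen: Iterable[str], preferred: Sequence[str]) -> List[str]:
    """Return the stats in `seen`, sorted by their position in `preferred`.

    Stats missing from `preferred` go last, in alphabetical order.
    """
    rank = {name: i for i, name in enumerate(preferred)}
    return sorted(seen, key=lambda name: (rank.get(name, len(rank)), name))


# Configured order, copied from the snippet above (enum members as strings).
STATS_TO_TRACK = [
    "ITER_TRAIN_TIME",
    "TRAIN_RAYS_PER_SEC",
    "CURR_TEST_PSNR",
    "VIS_RAYS_PER_SEC",
    "TEST_RAYS_PER_SEC",
    "ETA",
]

# Events can arrive in any order; the column order stays tied to the config.
seen_stats = {"ETA", "VIS_RAYS_PER_SEC", "ITER_TRAIN_TIME", "TRAIN_RAYS_PER_SEC"}
columns = order_columns(seen_stats, STATS_TO_TRACK)
print(columns)
```

With this approach the header would no longer depend on which event happened to be logged first.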

Here is what I am seeing today:

Step (% Done)       Vis Rays / Sec       Train Iter (time)    ETA (time)           Train Rays / Sec
--------------------------------------------------------------------------------------------------------
17611 (17.61%)      356.89 K             100.850 ms           2 h, 18 m, 28 s      40.75 K

But more often it looks like this:

Step (% Done)       Vis Rays / Sec       Train Iter (time)    ETA (time)           Train Rays / Sec
--------------------------------------------------------------------------------------------------------
17820 (17.82%)      100.164 ms           2 h, 17 m, 11 s      41.07 K

Presumably VIS_RAYS_PER_SEC is not populated on most iterations, but no blank column is printed in its place, so the remaining values shift left under the wrong headers. In my case, it appears that CURR_TEST_PSNR and TEST_RAYS_PER_SEC are never emitted either.
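To illustrate the behavior I would expect, here is a minimal sketch that prints a blank cell for any column with no value on the current step, so the other values stay under their headers. Everything in it is an assumption for illustration: `format_row`, `COLUMN_WIDTH`, and the string column names are hypothetical and do not reflect nerfstudio's actual writer code.

```python
# Hypothetical sketch; not nerfstudio's actual writer implementation.
from typing import Dict, Sequence

COLUMN_WIDTH = 20  # assumed fixed width for every column


def format_row(step_text: str, columns: Sequence[str], latest: Dict[str, str]) -> str:
    """Render one status row, leaving a blank cell for any missing stat."""
    cells = [step_text] + [latest.get(name, "") for name in columns]
    # Left-align every cell to the same width, then drop trailing padding.
    return "".join(f"{cell:<{COLUMN_WIDTH}}" for cell in cells).rstrip()


# Column order as shown in the header above.
columns = ["VIS_RAYS_PER_SEC", "ITER_TRAIN_TIME", "ETA", "TRAIN_RAYS_PER_SEC"]

# A step where VIS_RAYS_PER_SEC was not emitted.
latest = {
    "ITER_TRAIN_TIME": "100.164 ms",
    "ETA": "2 h, 17 m, 11 s",
    "TRAIN_RAYS_PER_SEC": "41.07 K",
}
row = format_row("17820 (17.82%)", columns, latest)
print(row)
```

Here the Vis Rays / Sec cell is left empty and the three remaining values keep their positions, instead of sliding one column to the left.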