Hey all! There is a glaring bug with the data_proccess pipeline.
Currently, for each clip, the video generator sorts the data by filename:
ep_frame_paths = (glob(os.path.join(ep_dir, '*')))
If your files are written as
filename1
filename2
...
filenameN
and N >= 10, this will interlace out-of-order frames into the clips used for training, because plain string comparison puts filename10 before filename2. With a large history window this adds some robustness to the model, but it really invalidates the modeling assumption!
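To make the reordering concrete, here is a tiny sketch. The names `filename1` through `filename12` are placeholders I picked for illustration, not the real dataset files:

```python
# Hypothetical frame names, for illustration only.
names = [f"filename{i}" for i in range(1, 13)]

# Plain string sorting compares character by character, so every name
# starting with "filename1" comes before "filename2".
lexicographic = sorted(names)
print(lexicographic[:5])
# ['filename1', 'filename10', 'filename11', 'filename12', 'filename2']
```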
Sorting by the last integer in the filename fixes this (change line 80 to this)
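A minimal sketch of that kind of sort, assuming the frame index is the last integer in the file name before the extension. `last_int_key` and `sorted_frame_paths` are names I made up for illustration, not functions from the repo:

```python
import os
import re
from glob import glob


def last_int_key(path):
    """Sort key: the last run of digits in the file name (hypothetical helper)."""
    # Drop the directory and the extension so digits in e.g. ".mp4" are ignored.
    stem = os.path.splitext(os.path.basename(path))[0]
    numbers = re.findall(r"\d+", stem)
    # Assumption: names without any digits should sort first.
    return int(numbers[-1]) if numbers else -1


def sorted_frame_paths(ep_dir):
    # Same glob as in the post, but ordered numerically instead of as strings.
    return sorted(glob(os.path.join(ep_dir, "*")), key=last_int_key)
```

With this, `ep_frame_paths = sorted_frame_paths(ep_dir)` would return filename1, filename2, ..., filename10, ... in numeric order. Zero-padding the frame numbers when writing the files would also avoid the problem.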