Open fisheggg opened 1 year ago
Very good suggestions! I agree that the file-based workflow can generate a lot of clutter if you don't manage it proactively, and that can become tedious with large batches of input files. Originally, one of my concerns was that you might not have enough RAM to buffer entire videos as uncompressed matrices – one example @alexarje often brought up was the Bergensbanen video, an 8-hour dash-cam recording of a train going from Bergen to Oslo, which he used for some examples. But that one is probably an exception, in which case the user could specify to render files instead; in most "everyday" cases, the memory-based workflow would have much less inertia and would connect well to other CV-related workflows. The only caveat is that with the current FFmpeg backend it might take some experimentation to get the piping right – I haven't used it like this before, but I know it's possible.
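To make the RAM concern concrete, here is a quick back-of-the-envelope calculation (the specs are assumptions for illustration – 1080p, 30 fps, 8-bit RGB – the actual Bergensbanen recording may differ):

```python
# Estimate the memory needed to hold an 8-hour video fully decoded in RAM.
# Assumed specs (hypothetical): 1920x1080, 30 fps, 8-bit RGB (3 bytes/pixel).
width, height, channels = 1920, 1080, 3
fps = 30
duration_s = 8 * 3600  # 8 hours

bytes_per_frame = width * height * channels       # ~6.2 MB per frame
total_bytes = bytes_per_frame * fps * duration_s  # all frames, uncompressed

print(f"{bytes_per_frame / 1e6:.1f} MB per frame")   # 6.2 MB per frame
print(f"{total_bytes / 1e12:.2f} TB total")          # 5.37 TB total
```

So an uncompressed 8-hour HD video runs into terabytes – far beyond typical RAM, which is why a file-based fallback (or frame-by-frame streaming) remains useful for such outliers.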
Yes, there are some good points here. Personally, I find it very practical to work with files written to disk. It works well for large files that don't fit in memory (I often work with hour-long files), and it saves me if/when things crash along the way, since I don't need to start everything again. But I agree it would be practical to have an option to choose whether to write files to disk. How much work would it entail to add such an option, @balintlaczko?
I can imagine that `_utils.ffmpeg_cmd` could be modified to pipe FFmpeg's output so that we read it in as a numpy array, either frame by frame or the whole video at once, depending on the use case and memory constraints. There is a promising thread about this here. If we can implement it right in that function, it might not take that long, actually.
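A minimal sketch of that piping approach (helper names are hypothetical and not the actual `ffmpeg_cmd` implementation; assumes FFmpeg is on the PATH): decode to rawvideo on stdout and reshape the byte stream into frames:

```python
import subprocess
import numpy as np

def rawvideo_pipe_cmd(filename, width, height):
    """Build an FFmpeg command that decodes a video to raw RGB on stdout."""
    return [
        "ffmpeg", "-i", filename,
        "-f", "rawvideo",                # no container, just raw pixel data
        "-pix_fmt", "rgb24",             # 3 bytes per pixel
        "-vf", f"scale={width}:{height}",
        "pipe:1",                        # write to stdout
    ]

def read_frames(filename, width, height):
    """Yield frames one by one as (height, width, 3) uint8 arrays."""
    proc = subprocess.Popen(
        rawvideo_pipe_cmd(filename, width, height),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    frame_bytes = width * height * 3
    while True:
        chunk = proc.stdout.read(frame_bytes)
        if len(chunk) < frame_bytes:     # end of stream (or partial frame)
            break
        yield np.frombuffer(chunk, dtype=np.uint8).reshape(height, width, 3)
    proc.wait()
```

Reading frame by frame like this keeps memory bounded; loading the whole video at once would just be `np.stack(list(read_frames(...)))` when it fits in RAM.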
Thanks for the cool suggestions @fisheggg!
I have added a new `numpy()` function to the `MgVideo` class to load the video frames as an array with FFmpeg. For that, I have also updated the `ffmpeg_cmd` function so that it can either 'read' the video frame by frame or 'load' all the frames into memory. Now, it is possible to load the video as an array like this:
```python
import musicalgestures

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
video_array, fps = video.numpy()  # returns the video frames as a numpy array, plus the frame rate
```
I have also added the possibility to generate an `MgVideo` object from an array. For example, if you first read the original video file as an array:
```python
import musicalgestures

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
video_array, fps = video.numpy()  # returns the video frames as a numpy array, plus the frame rate
```
You can then generate a new `MgVideo` object from that array like this:
```python
# Generate a new MgVideo object and save the array as a video file
new_mg = musicalgestures.MgVideo(filename='your_filename.avi', array=video_array, fps=fps, path='your/custom/path')
```
Finally, I have also updated the `_grid.py` script with a memory-based processing flow. Now, it is possible to return an array instead of writing the video file to disk:
```python
import musicalgestures
from matplotlib import pyplot as plt

video = musicalgestures.MgVideo('/path/to/a/video.mp4')

# video_grid will be a numpy array, and no files will be created
video_grid = video.grid(height=300, rows=3, cols=3, return_array=True)

# Plot the grid image
plt.figure(figsize=(40, 5))
plt.imshow(video_grid)
plt.show()
```
## Read/write original files as arrays

For further processing and training purposes, I think it would be nice to be able to read the original files of `MgVideo` and `MgAudio` objects as arrays. I understand that this could be done with opencv or librosa, but a shortcut function would be handy.
On the other hand, we could also consider adding a function that initializes an `MgVideo` object from an array. The function name, style, parameters, and default values need to be discussed further.
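One possible shape for such an initializer (all names here are hypothetical, not the toolbox's actual API) would be to pipe the raw frames into FFmpeg's stdin and let it encode the file:

```python
import subprocess
import numpy as np

def encode_cmd(width, height, fps, filename):
    """Build an FFmpeg command that encodes raw RGB frames read from stdin."""
    return [
        "ffmpeg", "-y",
        "-f", "rawvideo",
        "-pix_fmt", "rgb24",
        "-s", f"{width}x{height}",   # frame size of the incoming raw data
        "-r", str(fps),
        "-i", "pipe:0",              # read raw frames from stdin
        filename,
    ]

def write_array_as_video(array, filename, fps=25):
    """Encode a (frames, height, width, 3) uint8 array to a video file."""
    n_frames, height, width, channels = array.shape
    assert channels == 3 and array.dtype == np.uint8
    proc = subprocess.Popen(
        encode_cmd(width, height, fps, filename),
        stdin=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    proc.communicate(array.tobytes())
    return filename
```

This is just a sketch under the assumption of uint8 RGB input; pixel format, codec choice, and error handling would need to follow whatever conventions the toolbox settles on.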
## Memory-based processing flow
The current workflow creates a new file after each processing step. This might become an issue if we run a batch process on a large dataset, say 100,000 video files (although I guess other performance issues would be more critical in such a situation...).
So I suggest we could have a flag that tells the program not to save a new file, but to return an array instead. In fact, I would recommend making the memory-based flow the default, since it's safer in terms of storage management; new files would only be created when the user explicitly requests them.