fourMs / MGT-python

Musical Gestures Toolbox for Python
https://www.uio.no/ritmo/english/research/labs/fourms/downloads/software/musicalgesturestoolbox/mgt-python/index.html
GNU General Public License v3.0

Read/write original files as arrays, and a completely memory-based processing flow #294

Open fisheggg opened 1 year ago

fisheggg commented 1 year ago

Read/write original files as arrays

For further processing and training purposes, it would be nice to be able to read the original files of MgVideo and MgAudio objects as arrays.

I understand that this can already be done with OpenCV or librosa, but a shortcut function would be handy, for example:

video = musicalgestures.MgVideo('/path/to/a/video.mp4')

video_array, fps = video.numpy() # returns a numpy array of the video file

Conversely, we could also consider adding a way to initialize an MgVideo object from an array, for example:

video_array, fps = video.numpy()

# this inits a new MgVideo object and saves the array as a file; maybe fall back to a default path and filename if not specified
new_object = musicalgestures.MgVideo(array=video_array, fps=fps, path='your/custom/path', filename='your_filename.avi')

The function name, style, parameters and default values need to be further discussed.

Memory-based processing flow

The current workflow creates a new file after each processing step. This might become an issue when batch processing a large dataset, say 100,000 video files (although I guess other performance issues are more critical in such a situation...)

So I suggest adding a flag that tells the program not to save a new file, but to return an array instead. For example:

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
video_grid = video.grid(height=300, rows=1, cols=9, return_array=True) # video_grid will be a numpy array, and no files will be created

In fact, I would recommend making the memory-based flow the default, since it's safer in terms of storage management. I would suggest only creating new files when the user explicitly asks for them.

balintlaczko commented 1 year ago

Very good suggestions. I agree that the file-based workflow can generate a lot of clutter if you don't manage it proactively, and that can become tedious with large batches of input files. One of the issues I originally thought of is that you might not have enough RAM to buffer entire videos as uncompressed matrices. One example @alexarje often brought up was the Bergensbanen 8h video of a train's dash cam going from Bergen to Oslo that he used for some examples. But I think that one is more of an exception, in which case the user could specify to render files instead, and in most "everyday" cases the memory-based workflow would have much less inertia and would connect well to other CV-related workflows. The only thing is that with the current FFmpeg backend it might take some experimentation to get the piping right. I haven't really used it like this before, but I know it's possible.
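
To put rough numbers on the RAM concern, here is a back-of-the-envelope sketch. The resolution and frame rate are assumptions chosen for illustration (1080p at 25 fps), not measured properties of the Bergensbanen recording, and `uncompressed_video_bytes` is a hypothetical helper, not part of the toolbox.

```python
def uncompressed_video_bytes(duration_s, fps, width, height, channels=3, bytes_per_sample=1):
    """Estimate the size of a video held as one uint8 array
    of shape (frames, height, width, channels)."""
    n_frames = int(round(duration_s * fps))
    return n_frames * height * width * channels * bytes_per_sample

# Assumed figures: both clips are 1920x1080 RGB at 25 fps.
short_clip = uncompressed_video_bytes(10, 25, 1920, 1080)        # 10-second clip
long_clip = uncompressed_video_bytes(8 * 3600, 25, 1920, 1080)   # 8-hour recording

print(f"10 s clip: {short_clip / 1e9:.2f} GB")   # 10 s clip: 1.56 GB
print(f"8 h video: {long_clip / 1e12:.2f} TB")   # 8 h video: 4.48 TB
```

Under these assumptions a short clip fits comfortably in memory, while an hours-long recording needs either file output or frame-by-frame streaming.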

alexarje commented 1 year ago

Yes, there are some good points here. Personally, I find it very practical to work with files written to disk. It works well for large files that don't fit in memory (I often work with hour-long files), and it saves me if/when things crash along the way, since I don't need to start everything again. But I see that it would be practical to have an option to choose whether to write files to disk. How much work would it entail to add such an option @balintlaczko?

balintlaczko commented 1 year ago

I can imagine that _utils.ffmpeg_cmd could be modified to pipe output that we in turn read in as a numpy array, either frame-by-frame or the whole video at once, depending on the use case and memory constraints. There is a promising thread about this here. If we could implement it right in that function, then it might not take that long, actually.

joachimpoutaraud commented 10 months ago

Thanks for the cool suggestions @fisheggg!

I have added a new numpy() function to the MgVideo class to load the video frames as an array with FFmpeg. For that, I have also updated the ffmpeg_cmd function in order to be able to either 'read' the video frame by frame or 'load' all the frames in memory.

It is now possible to load the video as an array like this:

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
video_array, fps = video.numpy() # returns a numpy array of the video file

joachimpoutaraud commented 9 months ago

I have added the possibility to generate an MgVideo object from an array. For example, if you first load the original video file as an array:

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
video_array, fps = video.numpy() # returns a numpy array of the video file 

You can then generate a new MgVideo object from that array:

# Generates a new MgVideo object and saves the array as a video file
new_mg = musicalgestures.MgVideo(filename='your_filename.avi', array=video_array, fps=fps, path='your/custom/path')

Finally, I have also updated the _grid.py script with a memory-based processing flow. Now, it is possible to return an array instead of writing the video file to disk.

from matplotlib import pyplot as plt

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
# video_grid will be a numpy array, and no files will be created
video_grid = video.grid(height=300, rows=3, cols=3, return_array=True) 

# Plot the grid image
plt.figure(figsize=(40, 5))
plt.imshow(video_grid) 
plt.show()
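
For readers who want to see what such a grid array looks like structurally, here is a NumPy-only sketch that tiles evenly spaced frames from an in-memory frames array. It is an illustration under assumptions, not the `_grid.py` implementation: `grid_from_frames` is a hypothetical helper, it does not resize tiles to a target height, and the input is synthetic noise standing in for a decoded video.

```python
import numpy as np

def grid_from_frames(frames, rows, cols):
    """Tile rows*cols evenly spaced frames into one image array."""
    n_frames, height, width, channels = frames.shape
    # Pick evenly spaced frame indices across the whole video.
    indices = np.linspace(0, n_frames - 1, rows * cols).astype(int)
    tiles = frames[indices].reshape(rows, cols, height, width, channels)
    # Reorder to (rows, height, cols, width, channels), then merge the axes
    # so each tile lands at its (row, col) position in the final image.
    return tiles.transpose(0, 2, 1, 3, 4).reshape(rows * height, cols * width, channels)

# Synthetic stand-in for a decoded video: 90 frames of 4x6 RGB noise.
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(90, 4, 6, 3), dtype=np.uint8)

grid = grid_from_frames(frames, rows=3, cols=3)
print(grid.shape)  # (12, 18, 3)
```

The result has shape (rows * height, cols * width, channels), so it can be passed straight to `plt.imshow`.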