fourMs / MGT-python

Musical Gestures Toolbox for Python
https://www.uio.no/ritmo/english/research/labs/fourms/downloads/software/musicalgesturestoolbox/mgt-python/index.html
GNU General Public License v3.0

Region of interest for motiondata() #310

Open finn42 opened 10 months ago

finn42 commented 10 months ago

I'd like to calculate motion data within a region of the frame, perhaps specified with the same parameters as cropping: [width, height, top_left_x, top_left_y]. I'm currently assessing movement across sections of an audience, and generating cropped videos of these regions before running the motion analysis is very slow and introduces undesirable artifacts.

Even better would be an option to specify a grid dividing the frame's height and width, with QoM calculated within each cell, but I can live with the efficiency loss of re-reading the video for each region if the grid version would require too many changes to the existing functions.
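To illustrate what I mean by the grid version, here is a minimal sketch of per-cell QoM computed in a single pass over the frames, using plain NumPy frame differencing (this is just an illustration of the idea, not how `motiondata()` actually works; the function name, threshold, and frame-handling are all assumptions, and in practice the frames would come from a video reader rather than a list):

```python
import numpy as np

def grid_qom(frames, rows, cols, threshold=10):
    """Hypothetical per-cell quantity of motion: thresholded absolute
    frame difference, summed within each cell of a rows x cols grid.
    `frames` is a sequence of grayscale frames as 2-D arrays."""
    frames = np.asarray(frames, dtype=np.int16)  # avoid uint8 wraparound
    h, w = frames.shape[1:]
    ys = np.linspace(0, h, rows + 1, dtype=int)  # cell boundaries
    xs = np.linspace(0, w, cols + 1, dtype=int)
    out = []
    for prev, cur in zip(frames[:-1], frames[1:]):
        diff = np.abs(cur - prev)
        diff[diff < threshold] = 0  # suppress sensor noise
        out.append([[diff[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].sum()
                     for j in range(cols)] for i in range(rows)])
    return np.array(out)  # shape: (n_frames - 1, rows, cols)
```

The point is that a single pass produces all cells at once, so the video only needs to be decoded one time regardless of how many regions are requested.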

joachimpoutaraud commented 9 months ago

So, if I understand correctly, so far you've cropped parts of your videos and then calculated the QoM on the cropped files. Could you tell me more about the artifacts you are getting in the cropped videos?

As for your suggestion, it could be quite demanding computationally, yes. But if you think such a function would be relevant for your work, we can discuss it further!

There is btw another possibility: extracting the motion vectors stored in MPEG files and running the motion analysis on those (much faster). I have worked a little on this issue but haven't managed to get exactly what I wanted so far. If we do get it working, it would then be straightforward to either specify a region of interest with `FFmpeg` before computing the QoM, or to compute a divided grid over the extracted motion vectors.
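For the region-of-interest part, FFmpeg's `crop=w:h:x:y` filter can carve out a region in one step, and re-encoding with a lossless setting avoids introducing new compression artifacts in the crop itself. A small sketch of building such a command from Python (the helper name, file paths, and the choice of `libx264 -crf 0` are assumptions, not part of the toolbox):

```python
import subprocess

def crop_cmd(src, dst, w, h, x, y):
    """Build an FFmpeg command that crops a w x h region at (x, y) and
    re-encodes losslessly (libx264 with -crf 0), so the crop step adds
    no new compression artifacts. Hypothetical helper for illustration."""
    return ["ffmpeg", "-y", "-i", src,
            "-vf", f"crop={w}:{h}:{x}:{y}",
            "-c:v", "libx264", "-preset", "veryfast", "-crf", "0",
            "-an", dst]

# Example invocation (commented out; requires ffmpeg and a source file):
# subprocess.run(crop_cmd("audience.mp4", "region.mp4", 960, 540, 0, 0),
#                check=True)
```

Lossless output files are large, but they are temporary, and the crop no longer perturbs the motion signal the way a lossy re-encode can.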

finn42 commented 9 months ago

The videos are in compressed formats because of their size (50 fps, UHD, and many minutes long). Cropping requires re-encoding, and so far every format I've tried introduces new colour corrections at quasi-regular intervals (maybe at keyframes, maybe not). So the QoM that motiondata() extracts from a compressed region looks like this:

[plot: QoM over time in ms, 60 s excerpt, showing large quasi-periodic spikes]

The timing of those spikes varies per cropped region, so it's not an artifact of the original video. I can threshold them out and replace them with NaNs, but it's not a perfect process.
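The thresholding I'm doing is roughly like this rolling-median despike (a sketch only; the window size and the multiplier `k` are illustrative values, not tuned ones, and this is not code from the toolbox):

```python
import numpy as np

def despike(qom, window=25, k=8.0):
    """Replace spike samples in a 1-D QoM series with NaN when they
    deviate from a rolling median by more than k robust spreads (MAD).
    Illustrative sketch of threshold-and-NaN cleaning."""
    qom = np.asarray(qom, dtype=float)
    pad = window // 2
    padded = np.pad(qom, pad, mode="edge")  # handle the series edges
    med = np.array([np.median(padded[i:i + window])
                    for i in range(len(qom))])
    mad = np.median(np.abs(qom - med)) + 1e-9  # robust spread estimate
    out = qom.copy()
    out[np.abs(qom - med) > k * mad] = np.nan  # flag spikes as missing
    return out
```

It catches the big keyframe-like spikes, but smaller colour-correction bumps that sit inside the normal motion range still slip through, which is why I'd prefer not to re-encode at all.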

As to the other issue, motion vectors could be another route to the same information, though the analysis challenge would be how to combine the vectors informatively over set regions, and it sounds like some of the needed information isn't very accessible yet.