mvcisback / SSLVC

Sound Source Localization using Visual Cues

Video 1 Re-sized #13

Closed ghost closed 9 years ago

ghost commented 9 years ago

Feel free to move it where it should be. I'm going to upload a bunch of stuff to this link later today.

https://www.dropbox.com/sh/erms26lcem5hv0o/AABYGDXmO5c9g-D-p5L4rCc5a?dl=0

mvcisback commented 9 years ago

Is this just going to be resized videos that are easier to work with?

ghost commented 9 years ago

Yeah, I think I will also put up .mat files that have all the frames and the associated audio for different color spaces, and you can just pick one and play with that.

ghost commented 9 years ago

@mvcisback and @ffaghri1 okay, so here is some information you need if you use the mat files. I haven't verified them yet. The size of the original video is 200(w)*300(h)*3(colors)*472(frames). The audio is divided into 472*1473 samples, at sampling rate 44100. If you use YUV, HSV or RGB, then all channels are put below each other and vectorized for each frame, which gives you 180000*472(frames) for all frames. Putting the audio below this, which is a mat file I put in the dropbox, gives you 181473*472(frames). So the first 180000 rows are video pixels, which you can reshape to 200*300*3 to see the frame, and the rest is the audio, which you can listen to at sampling rate 44100.

ghost commented 9 years ago

Oh, and gray is obviously 60000*472 video and 472*1473 audio, since there is only one channel.
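The sizes quoted above can be checked with a little arithmetic. This is a minimal sketch in Python rather than MATLAB; the constant and function names are made up for illustration, and it assumes each frame's vector is simply the pixel values followed by that frame's 1473 audio samples, as described in the thread.

```python
# Layout arithmetic for one stacked frame vector (video pixels, then audio).
# All names here are illustrative; the numbers come from the thread.
WIDTH, HEIGHT = 200, 300    # frame size quoted as 200(w)*300(h)
AUDIO_PER_FRAME = 1473      # audio samples stored per frame
N_FRAMES = 472              # number of frames in the clip


def stacked_length(channels):
    """Return (pixel_count, total_length) for one frame's stacked vector."""
    pixels = WIDTH * HEIGHT * channels
    return pixels, pixels + AUDIO_PER_FRAME


for name, channels in [("gray", 1), ("RGB/YUV/HSV", 3)]:
    pixels, total = stacked_length(channels)
    print(f"{name}: {pixels} pixel values + {AUDIO_PER_FRAME} audio samples "
          f"= {total} values per frame, for {N_FRAMES} frames")
```

The gray case gives 60000 + 1473 = 61473 values per frame and the three-channel case gives 180000 + 1473 = 181473, matching the numbers above.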

ffaghri1 commented 9 years ago

@ramili could you tell us how to access the data? I loaded one of the .mat files as the video_sound variable, with size 472x61473. How do I look at a particular frame or audio part? How is it shaped?

ghost commented 9 years ago

So, there are 472 frames. Each is a 200*300*1 grayscale image, and each frame has 1473 audio samples; they're the energy of the audio signal. So to reshape, you say:

reshape(video_sound(i, 1:200*300), 200, 300); % video
plot(video_sound(i, 60000+1:end)); % audio

If you use the three channels, then it's 200*300*3.
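For anyone not working in MATLAB, the same split can be sketched in plain Python. This is only an illustration under stated assumptions: `split_row` and its parameter names are hypothetical, the row is assumed to hold the pixel values first and the audio samples after them, and the pixels are read in column-major order to mirror what MATLAB's `reshape(x, 200, 300)` does. Stacking the three channels one after another is also an assumption, based on the description of the channels being "put below each other".

```python
def split_row(row, n_rows=200, n_cols=300, channels=1):
    """Split one stacked row into (image, audio).

    image[c][r][k] is the pixel at row r, column k of channel c.
    Pixels are read in column-major order, mirroring MATLAB's reshape.
    """
    plane = n_rows * n_cols          # values in one channel
    n_pixels = plane * channels      # values in the whole frame
    pixels, audio = row[:n_pixels], row[n_pixels:]
    image = [
        [[pixels[c * plane + k * n_rows + r] for k in range(n_cols)]
         for r in range(n_rows)]
        for c in range(channels)
    ]
    return image, audio


# Synthetic stand-in for one gray row of video_sound (61473 values).
row = list(range(200 * 300 + 1473))
image, audio = split_row(row)
print(len(image[0]), len(image[0][0]), len(audio))
```

With the real data, `row` would be one row of the loaded `video_sound` matrix converted to a list, and `channels=3` would apply to the YUV, HSV and RGB files.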

On Wednesday, October 29, 2014, Faraz Faghri notifications@github.com wrote:

@ramili https://github.com/ramili could you tell us how to access the data. I loaded one of the .mat files as video_sound variable with size 472x61473. How do I look into a particular frame or audio part? how is it shaped?


Thanks, Best Regards, Ramin

mvcisback commented 9 years ago

We're going to redo this for new videos.