Read files with ffmpeg (or similar)

simon-r / dr14_t.meter

Compute the DR14 of a given audio file according to the procedure described by the Pleasurize Music Foundation

http://dr14tmeter.sourceforge.net

GNU General Public License v3.0

Read files with ffmpeg (or similar) #8

Closed: Hawke closed this issue 9 years ago

Hawke commented 12 years ago

Most of the meter’s time seems to be spent reading/writing wav files.


strace -c python ./dr14_tmeter -p -1 ~/Music/example/
[snip]
Success! 
Elapsed time: 17.66
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 42.54    0.448211       29881        15           wait4 ← waiting for subprocess?
 32.17    0.338954         240      1413           read ← reading resulting wav file?
 16.82    0.177198         503       352           munmap ← removing the mapped memory for the wav file?

It would be better if it just decoded the files into memory using ffmpeg or a similar library.

You may find this code helpful: https://github.com/musicbrainz/picard/tree/master/picard/musicdns

ianmcorvidae commented 12 years ago

An additional note is that it may be possible to use the various options for the file-reading programs to have them print their output to stdout, and then read the wav data over that pipe; avoiding opening a subprocess would probably be the most efficient, though.
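
A minimal sketch of the pipe idea described above, assuming an `ffmpeg` binary is on the PATH. The helper names `ffmpeg_pcm_command` and `read_pcm_from_pipe` are hypothetical and not part of dr14_t.meter. The flags used (`-f s16le`, `-acodec pcm_s16le`, and `-` as the output) ask ffmpeg to write headerless 16-bit little-endian PCM to stdout; sample rate and channel count are left at whatever the input file has.

```python
import array
import subprocess
import sys


def ffmpeg_pcm_command(path):
    """Build an ffmpeg command that writes raw 16-bit little-endian PCM to stdout."""
    return [
        "ffmpeg", "-nostdin", "-loglevel", "error",
        "-i", path,
        "-f", "s16le", "-acodec", "pcm_s16le",
        "-",  # "-" sends the output to stdout instead of a file
    ]


def read_pcm_from_pipe(command):
    """Run `command` and return its stdout as an array of signed 16-bit samples."""
    proc = subprocess.run(command, stdout=subprocess.PIPE, check=True)
    data = proc.stdout
    data = data[: len(data) - len(data) % 2]  # drop a trailing odd byte, if any
    samples = array.array("h")
    samples.frombytes(data)
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian
    return samples
```

Something like `read_pcm_from_pipe(ffmpeg_pcm_command("track.flac"))` would then return the decoded samples without a temporary wav file (the file name is only an example). Multi-channel audio arrives interleaved, and this still starts one subprocess per file.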

simon-r commented 12 years ago

I know about the problem. But ffmpeg and similar libraries are not available for Python, and Python itself currently supports only the WAV format.

For the moment there is only one ffmpeg binding for Python, and it doesn't seem to be popular or widely packaged in the Linux distros. http://code.google.com/p/pyffmpeg/

Maybe I'll try some other method ...

Hawke commented 12 years ago

Hence the suggestion to use the code from Picard -- Picard is written in Python and uses ffmpeg

simon-r commented 12 years ago

I've tested the code by clocking the various functions.

The file conversion and read procedure doesn't weigh so heavily on the total time. Converting and reading a file takes about 0.8 sec.
Computing the DR takes about ... 8 sec per track. So that part seems to be OK.
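
A minimal sketch of how this kind of per-stage clocking could be done in Python. The names `clock`, `timings`, `fake_decode` and `fake_compute_dr` are hypothetical stand-ins for illustration, not functions from dr14_t.meter.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)  # stage name -> accumulated seconds


@contextmanager
def clock(stage):
    """Add the wall-clock time spent inside the `with` block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] += time.perf_counter() - start


def fake_decode():
    time.sleep(0.01)  # stand-in for converting and reading the file
    return [0.0] * 1000


def fake_compute_dr(samples):
    return sum(x * x for x in samples)  # stand-in for the DR computation


with clock("decode"):
    samples = fake_decode()
with clock("compute"):
    fake_compute_dr(samples)

for stage, seconds in sorted(timings.items()):
    print(f"{stage}: {seconds:.3f} s")
```

Wrapping each stage separately makes it easy to see whether decoding or the computation dominates the total.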

I'll also try some tests with my MATLAB function, but keep in mind that MATLAB has a very good multi-core engine.

Or we should try to compile NumPy with the Intel MKL.

simon-r commented 12 years ago

Fixed. The problem was with Python threading and not with ffmpeg.

simon-r commented 12 years ago

Picard generates the audio fingerprint with this external application (written in C++): https://github.com/lalinsky/chromaprint

It doesn't call the ffmpeg libs directly.

Hawke commented 12 years ago

That’s for AcoustID generation. For the older fingerprint method (PUID/MusicDNS), it uses avcodec (ffmpeg) to decode the file (at least on Linux; on Windows or Mac it uses DirectShow or QuickTime, respectively). https://github.com/musicbrainz/picard/blob/master/picard/musicdns/avcodec.c

simon-r commented 12 years ago

Those are still external libs, so the big problem is binding Python to one of them. ffmpeg has a rather complex and cryptic structure ...

I've now drastically improved the performance of the computation (try the git version). The application now spends a lot of its time waiting on the file system, not decoding and reading the files.

Hawke commented 12 years ago

It still seems very slow to me (at least compared to https://github.com/adiblol/dr_meter ), but you’re right that the time isn’t spent decoding/reading.

simon-r commented 12 years ago

I've revised the code of the DR14 computation ... and now it's really fast! I've had a look at the program in C.

It's very fast, but C is also the fastest programming language on the 'market'.

Thanks for your provocation ... ;)

Hawke commented 12 years ago

Impressive, this is much better!

I did expect the C to be faster, but now it’s about the difference I would expect to see.

Great work!

simon-r commented 12 years ago

Bad news: pyffmpeg seems to be a dead project and it doesn't compile; its code is also written as a single 2500-line file.

Good news: My code is faster than the official DR meter with WAV files and under Wine.

simon-r commented 12 years ago

@ reading via pipe

I've done some tests with the pipe, and there's no difference in performance: writing to stdout and writing to a file take the same time.

The problem has more to do with Python's subprocess.call than with ffmpeg.
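
One way to check how much of the time is the fixed cost of launching a process is to time `subprocess.call` on a child that does nothing. This is only a rough baseline under stated assumptions: `mean_spawn_seconds` is a hypothetical helper, and the child here is a Python interpreter rather than ffmpeg, so the absolute number will differ from the decoder case.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(runs=5):
    """Average wall-clock time of launching a child process that does nothing."""
    command = [sys.executable, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.call(command)  # blocks until the child exits
    return (time.perf_counter() - start) / runs


print(f"mean cost per subprocess.call: {mean_spawn_seconds() * 1000:.1f} ms")
```

If this fixed cost is a noticeable share of the per-file time, switching between a pipe and a temporary file would not be expected to change much, which would be consistent with the result above.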