musepack support - Githubissues

ghost commented 8 years ago

I am considering writing support for musepack into python audio tools. It's a lossy audio codec that uses APEv2 tags. The latest version of it, SV8, does not have libraries in most Linux distributions that I can see. This means I would likely have to bundle it or expect the end user to compile it themselves if they wanted the support. Thoughts? If you are opposed, then I'll find another solution. But I thought it would make encoding and tagging them easier if I could use python audio tools.

tuffy commented 8 years ago

I remember having Musepack support at one point, but I think I took it out only because mpcenc and mpcdec were a hassle to compile. A not-so-recent version of Ubuntu I'm using has mpcdec 1.0 and mpcenc 1.3 in its repositories which should handle SV8 files. I was using those binaries to handle the audio codec before. Bundling the codec would be easier - especially since Musepack's shared library doesn't seem to expose itself through pkg-config which makes it harder to detect.

I wouldn't mind re-adding the codec if someone can get some use out of it.

ghost commented 8 years ago

I am currently hacking on the latest stable releases from musepack.net to figure out how it could be embedded. I'm also trying to make it build properly with cmake, since the upstream build scripts are broken as they stand. I'll see what is required to bundle the libraries with python audio tools. See here: https://github.com/ryuo

ghost commented 8 years ago

I've started work on it in a branch of my fork, here: https://github.com/ryuo/python-audio-tools/tree/musepack Would you mind giving me feedback as I work on it? I am very familiar with C, as I have about 8 years of experience with it, but I am not familiar with python's C API, as I don't use python much.

ghost commented 8 years ago

I have begun to write a standalone encoder for testing the port of the reference encoder to python audio tools. It is in the musepack branch of my fork. But, I have hit a brick wall. The musepack encoder wants the total number of PCM samples available in the WAV input, but the PCM reader struct does not appear to expose it. I know this information is exposed in the WAV file header. Any ideas where I can get this information?

tuffy commented 8 years ago

The readers don't have to supply a total number of PCM frames because the number can't be known in advance for all sources of input, like recording from a microphone or reading from some broken MP3 file where the size estimate is way off. But when the number of total PCM frames is known, encoders can take an optional argument with that amount and optimize their encoding. This is how True Audio operates; if the total size is known, it'll build a dummy seektable using that size, encode the file, verify the amount of PCM frames received matches what was promised, then go back and rewrite the seektable. If the total size isn't known, it'll dump its encoded data to temporary space, build the seektable, and then dump the temporary data back to the main file.
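As a rough illustration of the "dummy seektable, encode, verify, rewrite" sequence described above, here is a minimal Python sketch. It is not True Audio's real container layout or audiotools code: the function name, the 4-byte little-endian table entries, and the uncompressed "blocks" are all assumptions made up for the example.

```python
import io
import struct

def encode_with_seektable(out, pcm, frame_size, block_frames, promised_frames):
    """Toy encoder: reserve a seektable, write blocks, verify, rewrite the table.

    Hypothetical format: one 4-byte little-endian size per block,
    followed by the blocks themselves (stored uncompressed here).
    """
    if len(pcm) % frame_size:
        raise ValueError("PCM data is not a whole number of frames")
    blocks = -(-promised_frames // block_frames)  # ceiling division
    table_pos = out.tell()
    out.write(b"\x00" * (4 * blocks))             # dummy seektable
    sizes = []
    received = 0
    block_bytes = frame_size * block_frames
    for start in range(0, len(pcm), block_bytes):
        block = pcm[start:start + block_bytes]    # a real codec would compress here
        out.write(block)
        sizes.append(len(block))
        received += len(block) // frame_size
    if received != promised_frames:
        raise ValueError("PCM frame count does not match the promised total")
    end = out.tell()
    out.seek(table_pos)                           # go back and fill in the real table
    out.write(struct.pack("<%dI" % len(sizes), *sizes))
    out.seek(end)
    return sizes

# Usage: 5 frames of 4 bytes each, 2 frames per block.
demo_out = io.BytesIO()
demo_sizes = encode_with_seektable(demo_out, bytes(range(20)), 4, 2, 5)
```

The frame count is checked before the table is rewritten, so a caller that promised the wrong total gets an error instead of a file whose seektable disagrees with its contents.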

But if Musepack's encoder needs that info, I can dump the reader's input to a temporary file at the Python level if the total length isn't supplied and then supply a known total length to the C-based encoder from the temp file. Not knowing a length in advance is the less common case so it doesn't need to be optimal.
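The temporary-file fallback could look roughly like the sketch below. It assumes the reader's output can be treated as an iterable of byte chunks; `spool_pcm` and its parameters are hypothetical names for illustration, not part of audiotools.

```python
import tempfile

def spool_pcm(chunks, channels, bytes_per_sample):
    """Copy PCM byte chunks to a temporary file and count whole PCM frames.

    `chunks` is any iterable of bytes objects (a stand-in for reading
    a PCM reader until it is exhausted).
    """
    frame_size = channels * bytes_per_sample
    spool = tempfile.TemporaryFile()
    total_bytes = 0
    for chunk in chunks:
        spool.write(chunk)
        total_bytes += len(chunk)
    spool.seek(0)  # rewind so the encoder can read from the start
    return spool, total_bytes // frame_size
```

The caller would then hand the rewound file and the now-known frame count to the C-based encoder, and close the file afterwards.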

ghost commented 8 years ago

It appears I have finished the core encoding function for musepack. It works in two modes. If the total samples are unknown, pass 0 to it. If they are known, pass the total samples after dividing out channels and bytes per sample from the total size. It has passed both modes of operation in that it produces a file with the same checksum as the reference encoder. Would you care to advise on how I can integrate the standalone mode with your unit testing system? After that, I can look into making it accessible to python. Also, I am still waiting for you to merge my changes to your musepack branch. Thanks.
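The arithmetic for the value passed in each mode can be sketched as follows. The function name is hypothetical, and using `None` to mean "size unknown" is an assumption of this example; only the "pass 0 when unknown" convention comes from the comment above.

```python
def encoder_sample_count(data_bytes, channels, bits_per_sample):
    """Return the sample count to hand to the encoder, or 0 if the size is unknown."""
    if data_bytes is None:
        return 0  # unknown length: the encoder falls back to its default
    frame_size = channels * (bits_per_sample // 8)
    if data_bytes % frame_size:
        raise ValueError("data size is not a whole number of PCM frames")
    return data_bytes // frame_size
```

For example, one second of CD audio is 176400 bytes (44100 frames of 2 channels at 2 bytes each), so `encoder_sample_count(176400, 2, 16)` gives 44100.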

tuffy commented 8 years ago

The standalone C encoders/decoders are never actually called by Python's unit tests; I only use those builds as something I can easily call with valgrind to check for memory leaks, or to profile with gprof when looking for ways to speed them up.

I could put together a batch of Python-level tests to ensure encoded files match what the reference decoder outputs. The more exhaustive tests for internalized codecs are modeled after FLAC's unit tests, so I'd need to adjust them to work with a lossy codec instead.

And after some computer upgrades on my end, your changes have been merged in and pushed out on my musepack branch now.

ghost commented 8 years ago

I'd appreciate it. I'm not familiar with what constitutes good unit testing for lossy audio encoding. For lossless, it seems easier to do. Since it doesn't throw away data, you can test them by checksumming the output of an encode and decode pass. If they match the original input's checksum, it passes, obviously.
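A minimal sketch of that lossless round-trip check is below. `zlib` is used purely as a stand-in lossless codec so the example runs on its own; a real test would substitute the codec's own encode and decode calls.

```python
import hashlib
import zlib

def round_trip_is_lossless(pcm_bytes, encode=zlib.compress, decode=zlib.decompress):
    """Return True if decoding the encoded data reproduces the input exactly."""
    original = hashlib.sha256(pcm_bytes).hexdigest()
    restored = hashlib.sha256(decode(encode(pcm_bytes))).hexdigest()
    return original == restored
```

This check cannot pass for a lossy codec, because the decoded PCM is not expected to be bit-identical to the input.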

I will give you a heads up on what to expect with the encoder. I did not implement all of the advanced options of the reference encoder. I decided it best to only implement what the reference encoder does by default for everything except the quality level. The quality level controls the general bit rate, but I have observed the encoder will dip much lower if the audio is quieter. This seemed like the most realistic implementation path.

As for input files, please note that this encoder only matched the reference encoder when the input file was provided in the same way. What this means is this: if the reference encoder has different input parameters than the python encoder for the same input data, it will produce a different file. The bits per sample, sample rate, and number of samples must match. The reference encoder can only get all this information from the WAV header. If given raw input, it will assume CD audio: 16, 44100, 2, with 24 hours' worth of samples. If you are comparing encoding results using WAV input files, be sure to specify the total samples, or the results will not match the output of the reference encoder. In the case of raw input, it must have CD audio parameters and have the sample count specified as 0. I used this fall-back mode as a workaround for when the sample count cannot be known in advance. It uses the same amount the reference encoder does. If you wanted to test other encoding options where the sample count is unknown, I suppose you could hack the WAV file header to lie about the stream. It's the only way I can think of to get it to work with other input parameters for a raw stream.
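To make the two input cases concrete, here is a small sketch that gathers the parameters either from a WAV header, using Python's standard `wave` module, or from the CD-audio fallback with a sample count of 0. The function name and the dictionary keys are hypothetical and not part of either encoder.

```python
import io
import wave

CD_DEFAULTS = {"bits_per_sample": 16, "sample_rate": 44100, "channels": 2}

def input_parameters(wav_file=None):
    """Return encoder input parameters from a WAV header, or the raw-input fallback."""
    if wav_file is None:
        # Raw input: CD audio parameters, sample count given as 0 (unknown).
        return dict(CD_DEFAULTS, total_samples=0)
    with wave.open(wav_file, "rb") as reader:
        return {
            "bits_per_sample": reader.getsampwidth() * 8,
            "sample_rate": reader.getframerate(),
            "channels": reader.getnchannels(),
            "total_samples": reader.getnframes(),
        }

# Usage: build a tiny in-memory WAV (100 frames of silence) and read it back.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(2)
    writer.setsampwidth(2)
    writer.setframerate(44100)
    writer.writeframes(b"\x00" * 400)
buffer.seek(0)
wav_params = input_parameters(buffer)
```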

Thanks.

tuffy commented 8 years ago

I'm not going to attempt to match encoded files to what the reference encoder generates from the same input. If files encoded by audiotools are decoded the same (or very similar, since it's a lossy codec) by both the reference decoder and by audiotools' decoder, that's good enough for me. Since it's basically a port of the reference codec, there shouldn't be a need to hit it as hard as all the lossless codecs that I re-implemented from scratch.
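One way to express "the same, or very similar" is to compare the PCM produced by the two decoders with an error measure. The sketch below uses the RMS difference over 16-bit little-endian samples; the function names and the tolerance value are assumptions chosen for illustration, not thresholds used by audiotools.

```python
import math
import struct

def rms_difference(pcm_a, pcm_b):
    """RMS difference between two equal-length 16-bit little-endian PCM buffers."""
    if len(pcm_a) != len(pcm_b) or len(pcm_a) % 2:
        raise ValueError("buffers must be the same length and hold whole samples")
    count = len(pcm_a) // 2
    if count == 0:
        return 0.0
    a = struct.unpack("<%dh" % count, pcm_a)
    b = struct.unpack("<%dh" % count, pcm_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / count)

def decoders_agree(pcm_a, pcm_b, tolerance=1.0):
    """Treat two decodes as matching if their RMS difference is within tolerance."""
    return rms_difference(pcm_a, pcm_b) <= tolerance
```

A tolerance of 0 would demand bit-identical output; a small nonzero value allows for rounding differences between two decoders.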

And I don't bother exposing all the possible encoder options to users anyway, so keeping it to a few presets is a fine solution.

ghost commented 8 years ago

Any idea when you will be able to make the encoder available for use in python? I think I already finalized the C to python interface code.

tuffy commented 8 years ago

I've been delayed by the need for some updates for the Debian maintainer, but I should be able to tackle hooking up the Python encoder over the weekend.

tuffy commented 8 years ago

I've wired up the C-based encoder, fixed a few minor things, added some licensing text to files that were missing it, merged the musepack branch and pushed out all the changes.

Haven't had time to put together a set of Musepack-specific tests yet. But since it passes the standard tests, I don't expect any major problems.

tuffy / python-audio-tools

musepack support #59