ellington-project / ellington

Automated tempo estimation for swing dance DJs
GNU General Public License v3.0

One-shot tool for integration with external audio players #14

Closed: virtuald closed this issue 5 years ago

virtuald commented 6 years ago

As a developer of an audio player, I'd like to integrate Ellington into my application to do BPM autodetection.

The best implementation would be a single static binary with no external dependencies that could be executed by the audio player. However, a nonstatic binary would be ok if the dependencies were manageable.

Some requirements I can think of for a player/platform agnostic interface with an executable tool:

One thing that an external audio player definitely would not want is ellington modifying the audio file that was passed in. Any audio player is going to have its own way of managing audio file tags, and that should be left up to the application. Additionally, I imagine that the audio player would want to prompt the user whether the autodetected value is acceptable.

AdamBrouwersHarries commented 6 years ago

I think this could be broken down into a number of sub-goals that would help make ellington more usable from other applications:

Dependency management:

Removing ffmpeg should be the "simplest" solution, as one simply needs to write bindings to libffmpeg (or libav) and call them from Rust. It's definitely something I can do, but it will take a while, which is why I went for the easy option of just calling ffmpeg. This isn't actually that strange an idea (it's what Python's audioread library does), but that doesn't mean it's a good idea.
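To make the "just call ffmpeg" approach concrete, here is a minimal sketch in Python of spawning the ffmpeg executable and reading raw PCM from its stdout. It assumes ffmpeg is on the PATH. The function names, the mono downmix and the 44100 Hz default are illustrative choices, not a description of what ellington actually does.

```python
import subprocess
import sys
from array import array


def ffmpeg_decode_command(path, sample_rate=44100):
    """Build an ffmpeg argv that decodes `path` to mono float PCM on stdout."""
    return [
        "ffmpeg",
        "-v", "error",            # only log errors
        "-i", path,               # input file
        "-vn",                    # skip any video / cover-art stream
        "-ac", "1",               # downmix to mono (illustrative choice)
        "-ar", str(sample_rate),  # resample (illustrative choice)
        "-f", "f32le",            # raw little-endian 32-bit float samples
        "-",                      # write to stdout
    ]


def decode_samples(path, sample_rate=44100):
    """Run ffmpeg and return the decoded samples as an array of floats."""
    result = subprocess.run(
        ffmpeg_decode_command(path, sample_rate),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if ffmpeg fails
    )
    samples = array("f")
    samples.frombytes(result.stdout)
    if sys.byteorder == "big":
        samples.byteswap()  # ffmpeg wrote little-endian data
    return samples
```

The cost of this shortcut is exactly the dependency problem discussed here: the host machine has to provide a working ffmpeg binary.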

Removing the other two dependencies will be more difficult, as they're based on somewhat less well maintained libraries which will be harder to build and integrate. It should be possible, but as with ffmpeg, it will be a fairly involved effort.

JSON output:

This should be fairly easy, as an ellington library is JSON anyway. All that we need to do is print it to stdout. Additionally, streaming library data (as opposed to writing it to disk) is a feature target for 0.2.0, so it's in the works.
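From the integrating application's side, consuming JSON printed to stdout could look roughly like the sketch below. The schema is an assumption made purely for illustration: the "tracks", "path" and "bpm" keys are hypothetical and do not describe the real ellington library format.

```python
import json


def parse_bpm_report(stdout_text):
    """Map file path -> BPM from JSON text captured from the tool's stdout.

    The "tracks" / "path" / "bpm" keys are a hypothetical schema.
    """
    report = json.loads(stdout_text)
    # A track without a "bpm" key maps to None rather than raising.
    return {entry["path"]: entry.get("bpm") for entry in report["tracks"]}
```

A host application would pass in whatever it captured from the child process and then decide for itself whether to store or display the values.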

No-library operation:

This is (I think) the task that needs the most design work. As I explained in the other issue (#12), the program is designed to be invoked in stages, as that makes it flexible and extensible. It's going to be difficult to retain that flexibility in a single command line program, unless we have some internal pipeline description methodology that is easily configured from the command line. I can't think of any tools that have something similar (apart from maybe ffmpeg), and very few tools that have intuitive, easy-to-use interfaces.

One final point:

> One thing that an external audio player definitely would not want is ellington modifying the audio file that was passed in.

I have to disagree with this. I don't know how (say) Gstreamer does it, but iTunes and numerous other tools store information both in their own internal libraries and in the tags section of the audio file. In iTunes' case, the tags section of an audio file is the only way to pass that information to iTunes, as the library format is proprietary. I think it would be fine to have a report command, or similar, that simply prints the calculated information to stdout, but I'm of the opinion that writing the calculated data to the audio file is critical functionality for ellington.

> Additionally, I imagine that the audio player would want to prompt the user whether the autodetected value is acceptable.

Ellington currently does something very similar to this! It never writes (or overwrites) BPM tags directly, but writes its own metadata to the comment field of the audio file.

virtuald commented 6 years ago

> I have to disagree with this. I don't know how (say) Gstreamer does it, but iTunes and numerous other tools store information both in their own internal libraries and in the tags section of the audio file. In iTunes' case, the tags section of an audio file is the only way to pass that information to iTunes, as the library format is proprietary. I think it would be fine to have a report command, or similar, that simply prints the calculated information to stdout, but I'm of the opinion that writing the calculated data to the audio file is critical functionality for ellington.

I think you misunderstand my point. My point is that if I'm a plugin in an audio application, I can already write tags to the file in addition to storing them in the application's library.

Writing audio tag data is critically important to Ellington as a research tool. It is absolutely unnecessary for Ellington integrated with an external audio player, as it just duplicates functionality already done (and presumably hardened and tested) in the hosting audio player.

An imagined audio player plugin + Ellington workflow with the requirements listed above looks something like this:

I agree that designing a single tool to be both the experimental platform you want and the easy-to-integrate tool I want is difficult, and there are probably a number of bad ways to do it. Making that a single tool is probably not the right path. However, by making things modular in the right ways, I suspect it's not that bad:

Worst case scenario, an audio player could create a temporary directory, call ellington init, ellington bpm, extract the data, and delete the temporary folder. However, anyone wanting to integrate with ellington would have to do that same work (and it seems like it would be brittle), so it seems to me that it would be good to have a standalone tool that just does it.
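For illustration, that worst-case workaround might be sketched as follows in Python. The helper name, the "{library}" and "{audio}" placeholders and the library file name are all hypothetical, and the exact arguments to ellington init and ellington bpm are deliberately left to the caller rather than guessed at here.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def run_in_temp_library(audio_path, commands, library_name="library.json"):
    """Run `commands` against a throwaway library file and return its JSON.

    `commands` is a list of argv templates. In each argument, "{library}"
    is replaced with the temporary library path and "{audio}" with the
    audio file path. Both placeholders are hypothetical conventions.
    """
    with tempfile.TemporaryDirectory() as tmp:
        library = Path(tmp) / library_name
        for template in commands:
            argv = [
                arg.replace("{library}", str(library))
                   .replace("{audio}", str(audio_path))
                for arg in template
            ]
            subprocess.run(argv, check=True)  # stop at the first failing stage
        # Read the result before the temporary directory is deleted.
        return json.loads(library.read_text())
```

With ellington, `commands` would hold the init and bpm invocations in order. Every integrator would have to write and maintain something like this, which is the brittleness argument for a standalone tool.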

AdamBrouwersHarries commented 6 years ago

Hey @virtuald - I've had some more thoughts on this recently, and thrown together a quick oneshot tool implementation. I'd love to hear your thoughts on it.

I think I've come around to your way of thinking, as I've realised that scripting other tools and using ellington within those scripts will likely be more robust.

AdamBrouwersHarries commented 5 years ago

Given that this is fixed with #15, and further with #25, I think it's reasonable to close this, as Ellington now has the functionality that this requires.