flibitijibibo / flibitBounties

Pile of programming bounties for things flibit can't do right now

SIMD-ify FAudio resampler/volume functions #5

Closed flibitijibibo closed 6 years ago

flibitijibibo commented 6 years ago

Introductory Information:

XAudio2 is Microsoft's low-level audio library made primarily for game developers. XAudio2 is split into multiple parts, including XAudio2 itself (confusingly), X3DAudio, and XAPOFX.

FAudio is FNA's upcoming replacement for the Audio subsystem. It was originally designed as just an XACT reimplementation, but was eventually expanded to act as a DirectX Audio implementation with support for XAudio2, X3DAudio, and XACT3. The repository is currently here.

The Project:

Right now most of FAudio's processing work is done with scalar code. This isn't terrible, but on low-end machines it gets really bad really fast (especially on ARM CPUs). We have four really bad spots:

  1. The linear resampler
  2. Volume application (Part One, Part Two)
  3. Output mixing (Part One, Part Two)
  4. MSADPCM decoding

You'll be focusing on bullets 1 and 2. Bullet 3 isn't that bad for stereo setups, and bullet 4 is just too complicated to be SIMD-ifying right now (and all the other decoders are generally fine). I say "SIMD-ify" because we need both SSE and NEON versions, but if we have at least one of these, the other isn't that hard to fill in... so whichever one you're good with, I'll take.

Volume application should be reasonably straightforward, but note that the resampler is working with interleaved data and can have up to 8 channels(!). If it's easier, we can just make optimized paths for 1/2-channel audio (the most common cases by far) and just leave 3+ channel resampling alone.
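To make the interleaving point concrete, here is a minimal scalar sketch of linear resampling over interleaved float frames. It is not FAudio's actual resampler: the function name `resample_linear_interleaved`, the floating-point read position, and the `step` parameter (input frames advanced per output frame) are all assumptions made for illustration. A SIMD path for 1/2-channel audio would need to produce the same results as a baseline like this one.

```c
#include <stddef.h>

/* Hypothetical helper, not part of FAudio's API.
 * Linearly resamples interleaved float frames.
 *   step = input frames advanced per output frame (input rate / output rate)
 * Returns the number of output frames written. */
size_t resample_linear_interleaved(
    const float *in, size_t in_frames,
    float *out, size_t out_capacity,
    unsigned channels, double step)
{
    size_t written = 0;
    double pos = 0.0; /* fractional read position, in input frames */

    if (in_frames < 2 || channels == 0 || step <= 0.0)
        return 0;

    while (written < out_capacity) {
        size_t i = (size_t) pos;
        if (i + 1 >= in_frames)
            break; /* interpolation needs a frame on each side */

        float frac = (float) (pos - (double) i);
        const float *cur = in + i * channels; /* frame i      */
        const float *next = cur + channels;   /* frame i + 1  */
        float *dst = out + written * channels;

        /* One interpolation per channel: samples for the same frame
         * sit next to each other, so the stride between frames is
         * `channels`, not 1. */
        for (unsigned c = 0; c < channels; c++)
            dst[c] = cur[c] + (next[c] - cur[c]) * frac;

        written++;
        pos += step;
    }
    return written;
}
```

With mono input every four consecutive floats are four different frames, while with stereo they are only two frames, which is why dedicated 1/2-channel paths are easier to vectorize than a generic N-channel loop.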

The one ugly part to all this is that the memory is NOT guaranteed to be aligned, so you'll have to do some scalar busy work at the beginning and end of the routines, similar to what our PCM converters do.
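The head/body/tail pattern described above could look roughly like the following for volume application. This is a sketch under assumptions, not FAudio's code: `apply_volume` is a hypothetical name, it assumes float samples scaled in place by a single volume factor, and it only shows an SSE path with a scalar fallback (a NEON version would follow the same shape).

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Hypothetical helper, not part of FAudio's API.
 * Scales `count` float samples in place by `volume`. */
void apply_volume(float *samples, size_t count, float volume)
{
    size_t i = 0;

#ifdef HAVE_SSE
    /* Scalar head: process samples one at a time until the pointer
     * reaches a 16-byte boundary (or the buffer runs out). */
    while (i < count && ((uintptr_t) (samples + i) & 15) != 0) {
        samples[i] *= volume;
        i++;
    }

    /* Aligned body: four samples per iteration. */
    const __m128 vol = _mm_set1_ps(volume);
    for (; i + 4 <= count; i += 4) {
        __m128 v = _mm_load_ps(samples + i);
        _mm_store_ps(samples + i, _mm_mul_ps(v, vol));
    }
#endif

    /* Scalar tail: whatever is left over, or the whole buffer
     * when no SIMD path was compiled in. */
    for (; i < count; i++)
        samples[i] *= volume;
}
```

An alternative is to skip the head loop and use unaligned loads (`_mm_loadu_ps`/`_mm_storeu_ps`) throughout; which one is faster depends on the target CPU, so it is worth measuring both.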

Prerequisites:

Basic knowledge of either SSE or NEON is required. The algorithms are extremely simple but SIMD is no laughing matter!

Example Games:

Anything using FAudio should be a good case, but games with MSADPCM WaveBanks will be good to test since XACT provides some absolutely absurd sample rates for MSADPCM entries. Capsized, Escape Goat 1/2, Rogue Legacy, Dust: AET, Blueberry Garden, Skulls of the Shogun, TowerFall Ascension, Apotheon, Bleed 1/2, Rex Rocket, Wyv and Keep, Cryptark, Murder Miners, Shuggy, Salt & Sanctuary, Owlboy, Charlie Murder, and The Dishwasher: Vampire Smile are my own examples, and the XACT data should work fine in FACTTool.

How Much Can flibit Help?

If you're unsure about how to structure the SSE/NEON/Scalar versions of the functions I can help with that, but odds are if you've worked with SSE/NEON before you don't really need my help...

Budget/Timeline:

Measuring in weekends, I expect this to take about 1 weekend. The algorithms are pretty dirt simple and the volume stuff in particular is about the closest thing to busy work I've put on this page. I currently have $500 USD allocated for this project.

flibitijibibo commented 6 years ago

In case anyone's feeling ambitious, I ended up drafting possible optimizations for bullet 3 which can be SIMD-ified (other than the Generic function, probably):

https://gist.github.com/flibitijibibo/30e07cc8d1923ab041b01347bb99ca5f

flibitijibibo commented 6 years ago

This is currently being worked on. Pull request should show up in a bit.

flibitijibibo commented 6 years ago

Pull Request: https://github.com/FNA-XNA/FAudio/pull/25

flibitijibibo commented 6 years ago

This is now complete! Maybe some day we'll do this again for mixing.