snes9xgit / snes9x

Snes9x - Portable Super Nintendo Entertainment System (TM) emulator
http://www.snes9x.com

add .ogg support to msu1 #910

Closed Gutawer closed 7 months ago

Gutawer commented 7 months ago

This adds .ogg Vorbis support to MSU-1 as an alternative to the .pcm files that are currently used. In my testing, this kind of compression gives roughly 10:1 to 15:1, e.g. from 27MB for a track down to 2MB, which is a pretty big win for a project with many tracks.
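To put the quoted sizes in context, here is a rough sketch of the arithmetic. It assumes the commonly documented MSU-1 .pcm layout (an 8-byte header followed by 44.1 kHz, 16-bit, stereo samples); the constant and function names are made up for illustration and are not from the PR.

```cpp
#include <cstdint>

// Assumed MSU-1 .pcm layout: 8-byte header, then 44.1 kHz 16-bit stereo audio.
constexpr std::uint64_t kHeaderBytes   = 8;
constexpr std::uint64_t kSampleRate    = 44100;
constexpr std::uint64_t kBytesPerFrame = 4;  // 2 channels * 2 bytes per sample

// Size in bytes of an uncompressed track of the given length (hypothetical helper).
constexpr std::uint64_t pcm_file_bytes(std::uint64_t seconds) {
    return kHeaderBytes + seconds * kSampleRate * kBytesPerFrame;
}

// Ratio of raw size to compressed size (hypothetical helper).
constexpr double compression_ratio(std::uint64_t raw_bytes,
                                   std::uint64_t compressed_bytes) {
    return static_cast<double>(raw_bytes) / static_cast<double>(compressed_bytes);
}
```

Under these assumptions a 27MB track is roughly two and a half minutes of audio, and 27MB down to 2MB works out to 13.5:1, inside the range quoted above.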

This adds the stb_vorbis.c library to parse the files and extract samples, and supports a LOOPSTART comment in the .ogg files that specifies which sample to loop from - this seems to be somewhat of a de facto standard for use cases that need this?
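As an illustration of the LOOPSTART idea, here is a minimal sketch of pulling a loop point out of Vorbis-style `NAME=value` comment strings. It is not the code from this PR: `find_loop_start` is a hypothetical helper, and it assumes the comment strings have already been obtained from the decoder by whatever means. Field names are matched case-insensitively, which is how Vorbis comments are normally treated.

```cpp
#include <cctype>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Returns the sample index from the first well-formed "LOOPSTART=<n>" comment,
// or std::nullopt if there is none. Hypothetical helper for illustration only.
std::optional<std::uint64_t> find_loop_start(const std::vector<std::string>& comments) {
    constexpr std::string_view key = "LOOPSTART";
    for (const std::string& comment : comments) {
        std::string_view sv = comment;
        // Need at least "LOOPSTART=" with the '=' in the right place.
        if (sv.size() <= key.size() || sv[key.size()] != '=') continue;

        // Case-insensitive comparison of the field name.
        bool match = true;
        for (std::size_t i = 0; i < key.size(); ++i) {
            if (std::toupper(static_cast<unsigned char>(sv[i])) != key[i]) {
                match = false;
                break;
            }
        }
        if (!match) continue;

        // The whole value must be an unsigned decimal integer.
        std::string_view value = sv.substr(key.size() + 1);
        std::uint64_t sample = 0;
        auto [ptr, ec] = std::from_chars(value.data(), value.data() + value.size(), sample);
        if (ec == std::errc() && ptr == value.data() + value.size()) return sample;
    }
    return std::nullopt;
}
```

A caller would fall back to looping from sample 0 (or not looping) when the function returns `std::nullopt`; which fallback is right is a design decision this sketch does not settle.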

I'm pretty sure this won't currently compile on all platforms, as there are some I'm unfortunately not familiar with, and I could use some help there. It should just be a matter of adding the vorbis.h file to the include list and the vorbis.cpp file to the compilation list. I've done gtk and qt.

Happy to address any code quality concerns.

qwertymodo commented 7 months ago

Support for compressed file formats is something that has been discussed at great length for over a decade at this point, so please don't think this is a hasty dismissal; it's not. But the answer is no. The .pcm file format is an integral part of the MSU-1 specification, and it is essential to keeping the MSU-1 cross-compatible between emulators and hardware implementations. Compressed audio formats are fundamentally incompatible with that specification. If you absolutely can't afford the disk space, we already support zip compression via the .msu1 archive format, but that's the only option that we're going to support. Sorry.

Gutawer commented 7 months ago

For what it’s worth on the emulator cross-compatibility front, I’ve already made this same type of PR to Mesen2 and was planning to make one to ares today. I can’t change any hardware implementations due to not owning the correct stuff, but as far as I know the only actually existing way that MSU-1 has been implemented on a hardware device is to read the .pcm files in C or C++ just like an emulator would? The approach I’ve used here should be possible in that context too, I just wouldn’t be able to test it.

But fair enough. Could I ask for a summary of the arguments against providing compressed audio inputs? I wasn’t aware that this had ever been discussed before - it’s reasonably hard to google imo - and I thought that the most reasonable roadblock would be emulators not wanting to diverge from other emulators’ support, which is why I worked on this for all 3 major active software SNES emulators at the same time - only issue with my ares implementation is that I haven’t done save state support.

qwertymodo commented 7 months ago

A brief summary of the issues:

- Hardware compatibility. This is a hard dealbreaker. Any change to the MSU-1 spec/implementation that doesn't work on real hardware (currently that means the SD2SNES/FXPak Pro) is a non-starter. From day 1, the MSU-1 was always designed to be possible on real hardware, and any changes or extensions that can't be implemented on existing hardware will lead to fracturing of the compatibility. The hardware implementation is "just" reading .pcm files SPECIFICALLY because it's a raw uncompressed format, so it's possible to stream raw samples from a file directly into the FPGA. There is no way to do what you're suggesting, because there is no .ogg decoder available in the hardware. You would need to write one in Verilog or VHDL and run it on the FPGA, not in C/C++.
- Compatibility fracturing is something we've already dealt with in the past as the result of differing output volume levels. It was an absolute nightmare that took literal years to overcome, long after the actual root problem had been solved.
- Lack of any proper standards for looping metadata in existing compressed audio formats.
- The MSU-1 spec is a fixed thing at this point. There have been many proposed extensions over the years, such as additional channel support, arbitrary seeking of the audio track, flow control for the data port, etc., but all of them introduce breaking changes that come back to the fracturing issue. It was hard enough getting enough buy-in to make this a reality at all. Unlike "real" hardware that existed in the past, that you can point to and say "this is how it worked", the MSU-1 started as an idea, which means that the temptation to continually change and extend it never reaches a point where you can say "no, that's not how the original thing worked", so starting down that road is a slippery slope.
- The MSU-1 was Near's creation, and they were directly involved in exactly these discussions over the years. This is where those discussions landed, and out of respect for them, this is where they remain. I can tell you, it wasn't an easy decision to make. I know personally that Near really wished there was a way to make this happen and avoid all of these issues, but we never figured out a solution, and the ultimate decision was to not allow it.
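To make the "raw uncompressed format" point concrete, here is a small sketch of how little work a .pcm track requires before samples can be streamed. It assumes the commonly documented layout: the 4-byte magic `MSU1`, a 32-bit little-endian loop point counted in stereo sample frames, then raw 16-bit stereo data. The type and function names are hypothetical and are not taken from snes9x or the SD2SNES firmware.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Parsed .pcm header (hypothetical type for illustration).
struct PcmHeader {
    std::uint32_t loop_frame;  // loop point, in stereo sample frames
};

// Validates the assumed "MSU1" magic and reads the little-endian loop point.
std::optional<PcmHeader> parse_pcm_header(const std::uint8_t* data, std::size_t size) {
    if (size < 8) return std::nullopt;
    if (data[0] != 'M' || data[1] != 'S' || data[2] != 'U' || data[3] != '1') {
        return std::nullopt;
    }
    const std::uint32_t loop = static_cast<std::uint32_t>(data[4])
                             | static_cast<std::uint32_t>(data[5]) << 8
                             | static_cast<std::uint32_t>(data[6]) << 16
                             | static_cast<std::uint32_t>(data[7]) << 24;
    return PcmHeader{loop};
}

// Byte offset of a given frame: fixed header plus 4 bytes per stereo frame.
constexpr std::uint64_t frame_byte_offset(std::uint64_t frame) {
    return 8 + frame * 4;
}
```

Because every frame sits at a fixed offset, looping is just a seek to `frame_byte_offset(loop_frame)`; no decoder state is involved, which is the property the hardware argument above relies on.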

I'm sorry to have had this conversation after you've already gone and done the work to submit a PR. I hope this doesn't sour you on contributing in the future. This just happens to be a very specific thing with a long history that you weren't aware of. It's unfortunate, but it is what it is.