mmontag opened this issue 2 years ago
Device names are exposed using the SndEmu_GetDevName call (emu/SoundEmu.h).
For voice names I could add an additional function to the DEV_DEF::rwFuncs list, I guess.
Do you have any suggestions for how the API call for separate voice buffers should work / look? In the most straightforward design, you would call either the "stereo" renderer or the "separate voices" renderer, I assume? (Having a stereo renderer that simultaneously outputs per-voice data seems difficult to me.)
Yeah, choosing one or the other is not ideal now that I think of it. That would force the host to do the stereo mix, which would require panning information and doesn't seem practical.
Maybe you always provide stereo output in the first 2 channels, and the optional per-voice output in subsequent channels. I agree it wouldn't really work to call two separate methods.
Maybe something like:
Buffer layout: [----L----|----R----|--voice1--|--voice2--|...|--voiceN--]
I was trying to figure out how Modizer added multi voice output to vgmplay: https://github.com/yoyofr/modizer/commit/cf117601dd9909b1635b99650df6537f0450b92c#diff-7390462695712dbe8460edcaab01bd6597b9d920f20a15211b0172acc8ceb4df
I disagree with this method, as it would require modifications to the emulation cores that would have to be inconsistently applied and, at the very least, would bypass the emulated chip's mixer. That could cause accuracy issues with sound chips that perform certain effects during mixing (the YM2612 and QSound come to mind off the top of my head; possibly others).
A few more reasons that I can think of: as a worst case, you can think of the YMF278B (OPL4), which has up to 45 voices and 6 output channels.
Personally, I'd rather keep this out of the library and have the application deal with creating outputs for each channel, either by multiple chip instances or multiplexing (i.e. save state on the "main" instance and replay on a "sub" instance for each voice you want to capture).
Hmm...you raise some good points.
I would suggest, in principle:
I think some enhancement to the cores is defensible if the purpose of the library is music playback, based on user demand. My hope was that libvgm could be a common foundation for players like Modizer or Rymcast. Currently such players do their own hacks to obtain a voice visualization. But yes, it depends on the goals of the library. :)
I think some enhancement to the cores is defensible if the purpose of the library is music playback
The changes you're describing aren't related to playback, though; you want music visualization. I don't think that adding bloat deep into the cores will enhance playback on any level.
To further clarify what I think: the individual voice output code would need to be added to each core, adapted to the mixing/channel update routines (which are written in many different ways), and, as mentioned, it would increase memory and CPU usage. It will be a burden to maintain: all cores would have to be adapted, and new cores/ports from other emulators would be as well.
Also consider the likelihood that it will be misused (i.e. not for visualization).
I think if this function is absolutely necessary in your application, that it would be best to keep it in a separate branch or fork.
While I'm not completely against adding functions for visualization or separate channel output, I won't put any effort into it anytime soon.
I'm also not convinced about modifying the update/render function to provide additional parameters.
Right now I think that functions providing the volume / frequency of the channels/voices would be more useful and feasible than additional per-voice output.
Thanks for the discussion here. I think we're in agreement that the best option would be to maintain a fork with voice output support.
My question is related to this, so I thought I would ask here instead of starting a new issue.
I would like to access each voice separately too, but for a different purpose. Rather than visualizing, I'd like to render them to separate files. At this point, even a manual process would do. For example, is there any way to configure some voices to be muted? Particularly with the vgm2wav tool.
I found this and got it to work: https://github.com/weinerjm/vgm2wav However, it lumps all the SN76489 voices together, and it appears to be a quick thing someone threw together, whereas this project looks much more mature. I played around with VGMPlay, which appears to be the predecessor to this project, and that appears to have options in the .ini file to mute channels, but I can't get it to play anything on Linux, and even if I did, I'm trying to save the output to a file rather than play it. The included vgm2wav tool works on Linux, but it doesn't read the .ini file and has hardly any command-line options.
If you want to have a more "programmable" solution, you can look how player.cpp does it here: https://github.com/ValleyBell/libvgm/blob/57713471eef1db49e84f39d4ee3ac83662f01316/player.cpp#L703
If you just want something that works out-of-the-box, then compile vgmplay-libvgm. (It uses libvgm internally and includes it as a submodule, so you can be sure the versions are compatible.)
I was not able to get vgmplay-libvgm to build, but I was able to modify vgm2wav to take a --voice parameter and mute all but the indicated voice using the SetDeviceMuting example you linked to. It's enough to script out what I need. Thanks!
I have one big feature request for libvgm, and that is improved support for voices/channels in the API (I use the term "voice" to disambiguate from left/right stereo channels).
Device names/voice names
It would be nice for the API to expose (1) the active device/chip names, like YM2608, SegaPCM, etc., and (2) voice names, like FM 1, FM 2, PSG 1, etc. It helps to have friendly names for muting. Game Music Emu has basic support for this, but it is not implemented for all the FM chips.
Voice buffers (for oscilloscopes, etc.)
In addition to the stereo buffer, I would love to be able to fill discrete voice audio buffers for external mixing or visualization. The host would be responsible for any post-processing like autocorrelation, etc.