Closed xandark closed 1 year ago
Try this branch: https://github.com/audetto/AppleWin/tree/audio2 and see if the buffer size in the audio settings helps.
I think there is something wrong in the whole emulation, but the adaptive nature of AppleWin audio generation makes it extremely hard to reason about.
Going in and out of enhanced speed has audio effects too.
If the slider does not work, also try changing the value in the source code: https://github.com/audetto/AppleWin/blob/d9fc4faac456b8d775179695165555ab0be108a3/source/frontends/sdl/sdirectsound.cpp#L48
Also check which of these 2 lines works better, or whether they should both stay.
I rebased it. See if these controls help at all: trigger and target. Each time the SDL buffer is below trigger ms, it will ensure it is target ms.
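As a rough illustration of the trigger/target rule described above, here is a minimal sketch. The function name and signature are hypothetical and are not taken from the audio2 branch; it only shows the top-up arithmetic.

```cpp
#include <cstdint>

// Hypothetical helper, not the code from the audio2 branch.
// queuedMs:  audio currently sitting in the SDL queue, in milliseconds
// triggerMs: the "trigger" slider value
// targetMs:  the "target" slider value
// Returns how many milliseconds of audio should be pushed right now.
inline std::uint32_t millisecondsToQueue(std::uint32_t queuedMs,
                                         std::uint32_t triggerMs,
                                         std::uint32_t targetMs)
{
    if (queuedMs >= triggerMs)
        return 0;                // still above the trigger, nothing to do
    if (queuedMs >= targetMs)
        return 0;                // already at or above the target
    return targetMs - queuedMs;  // top the queue up to the target
}
```

With trigger = 50 and target = 200, a queue holding 30 ms would be topped up by 170 ms, while a queue holding 100 ms would be left alone.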
Okay, I've managed to build the audio2 branch and test sa2, which is pushing my git skills.
I have played with the sliders for trigger and target size, and I don't notice much of a difference, unless I set trigger very low or to zero. Then the audio begins to repeat, which makes sense.
When I set trigger to 400ms and target to 400ms, for example, I don't think I'm hearing much latency difference. I've played with different values for about 10 minutes, and it's hard for me to say, yes, I found the winning combo.
I'm using Aztec as my test. In the original Windows-based AppleWin, when I press W to walk the character and S to stop, the sound of the footsteps starts and stops right when you'd expect it. However, with this Linux port, the walking sound starts or stops about 0.5 to 1.0 seconds after the walking animation starts or stops.
In the Dear ImGui interface, which is really great btw, I see the new controls. Below them, there are columns which say Direct, SDL, and Total. I see that there is a normalized value for each of those columns for row Channel 1. What do the values mean? I'm trying to understand what I should be trying to optimize toward. Should I be looking for a low, stable Total value?
Direct, SDL and Total are in ms.
This is the amount of audio in 1) the AppleWin buffer, 2) the SDL queue, and 3) the total of the two. The total should be the latency.
Problem is that they are not stable and do not behave in a predictable way. AppleWin tries to autodetect how much is good and when to produce more or less data and I think this goes horribly wrong.
Take a look at the audio-related comments in this maintained repository (not the same as apple2 on Github): git clone http://shamusworld.gotdns.org/git/apple2
In this codebase, the audio runs in a separate thread, and the author includes a number of comments related to synchronization that suggest there may be something thread- or timing-related going on in SDL. I'm not 100% sure it's fully working in that codebase, but if there's a problem, it's not perceptible like it is in AppleWin.
If you're compiling on Pi, you may need to change the audio setting to desired.format = AUDIO_S16SYS; in audio.cpp ~line 72 for audio to work. Hope this provides some insight... the codebase is relatively small & readable.
I can try to use the callback and see if it helps.
But I've compiled it and run on my Ubuntu Desktop: as soon as it starts, any existing audio is negatively affected. This does not happen with my code.
"existing audio" ... meaning audio playing from other software? That may be a choice of some of the initial audio configuration parameters. Lmk if you've made the above desired.format change and what specific problems you're having and I'll see if there's a simple fix.
I don't think it's an example of perfect audio fidelity, rather better synchronization between the audio channel and the emulator (I've seen audio problems of one kind or another in most A2 emulators). The AW codebase is pretty large and not something I'm in regularly enough to be able to point to direct relevancy, but between the code and what the author documented I hoped it might at least provide a seed of inspiration.
AUDIO_S16SYS works better, probably not a Pi-specific fix.
@audetto I recently switched the macOS port to use CoreAudio, and I get a noticeable (but I didn't quite measure) latency improvement. The main difference is that CoreAudio pulls audio data on a real-time thread when it wants (at which point I feed it whatever is in the DirectSoundGenerator) as opposed to the SDL_QueueAudio call. As far as I can tell, your SDL-based sdirectsound is not substantially slower at calling SDL_QueueAudio than the CoreAudio callback version, so the delay might be due to what happens inside SDL_QueueAudio.
I used the Mockingboard DEMO-Dual Sound Generators.dsk simple piano demo to test audio latency.
I am starting to think that this is the problem. When I push into SDL I have no real idea how much to push (as opposed to Qt). This interacts badly with AppleWin, which will try to compensate by going faster or slower if I get it wrong.
Did you need to lock? Can you share the link to the audio callback?
Did you need to lock?
Uhhh… 😳 yes, I would need to lock, probably among Lock, Unlock, and Read in dsound. Is that something you want to do upstream or should I just do it for macOS? (Nice catch, thanks!)
The code is in https://github.com/sh95014/AppleWin/blob/master/source/frontends/sdl/sdirectsound.cpp, specifically DirectSoundRenderProc.
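To show why a lock is needed when a real-time audio thread pulls data while the emulator thread writes it, here is a minimal sketch. The class is a hypothetical stand-in and is not the dsound code or the contents of lock.diff. Holding a mutex inside a real-time callback has trade-offs of its own, so treat this as an illustration of the race being closed rather than a recommended design.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

// Hypothetical stand-in for a sound buffer shared between the emulator
// thread (writer) and an audio callback running on another thread (reader).
class SharedSampleBuffer
{
public:
    // Called by the emulator thread when new samples are generated.
    void write(const std::int16_t* samples, std::size_t count)
    {
        std::lock_guard<std::mutex> guard(myMutex);
        myQueue.insert(myQueue.end(), samples, samples + count);
    }

    // Called by the audio callback. Copies up to `count` samples into `out`,
    // pads the remainder with silence, and returns the number of real
    // samples that were delivered.
    std::size_t read(std::int16_t* out, std::size_t count)
    {
        std::lock_guard<std::mutex> guard(myMutex);
        const std::size_t available = std::min(count, myQueue.size());
        std::copy_n(myQueue.begin(), available, out);
        myQueue.erase(myQueue.begin(),
                      myQueue.begin() + static_cast<std::ptrdiff_t>(available));
        std::fill(out + available, out + count, std::int16_t{0});
        return available;
    }

private:
    std::mutex myMutex;
    std::deque<std::int16_t> myQueue;
};
```

Without the guard, a read that overlaps a write could observe the container in a half-updated state.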
This should do it: lock.diff.zip
Let me know if you want to incorporate in your tree or whether I should apply it only to mine. Thanks!
Thank you. I will probably incorporate, but this will have to wait.
I will only be back at the end of August...
Sorry, this has nothing to do with the Linux port. I've been working on your libretro core. I found the audio sync issue is easier to handle there because once we manage to generate a fixed number of samples regularly, say 735 per frame, the audio is synced internally by the RetroArch emulation code. The point is that the AW speaker code generates samples in a throttling manner, which must be disabled. After that the code works very well. I also added floppy drive sounds and got the mockingboard working on it.
The point is that the AW speaker code generates samples in a throttling manner, which must be disabled. After that the code works very well.
Yes, that makes everything more complicated.
I also added floppy drive sounds and got the mockingboard working on it.
If you upstream to AppleWin, it will be available everywhere.
Resolving the audio latency issues for AppleWin on Linux would be a major improvement. Just checking where this is currently at... have improvements been upstreamed to AppleWin? It sounds like @redenvelope2000 has some major enhancements, hope they get incorporated.
What I added is a regulator between myBuffer->Read() and mixBuffer() such that no matter how many samples the AW speaker code generated in the last frame, they are resized to 735 samples, so the same number of samples (735) is always sent to mixBuffer(). I wanted to change the AW speaker code to have the same effect, but so far the working code is still in the frontends folder. During debugging I saw the speaker code generate "a little more than expected" samples, which convinced me that without a regulator the R/W pointers of the audio buffer would eventually overwrite each other after a period of time. It is a mystery to me how the Windows build handles that.
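The comment above does not say how the samples are resized. As one possible reading, the sketch below stretches or shrinks a frame to a fixed length with linear interpolation. The function name and the choice of linear interpolation are assumptions, not the actual regulator code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical regulator: resamples whatever the speaker code produced in
// one frame to exactly `outputCount` samples using linear interpolation.
inline std::vector<std::int16_t> resizeToFrame(const std::vector<std::int16_t>& input,
                                               std::size_t outputCount = 735)
{
    std::vector<std::int16_t> output(outputCount, 0);
    if (input.empty() || outputCount == 0)
        return output;  // nothing to resample: silence (or an empty frame)

    if (input.size() == 1 || outputCount == 1)
    {
        std::fill(output.begin(), output.end(), input.front());
        return output;
    }

    // Map output index i onto a fractional position in the input.
    const double step = static_cast<double>(input.size() - 1) /
                        static_cast<double>(outputCount - 1);
    for (std::size_t i = 0; i < outputCount; ++i)
    {
        const double pos = static_cast<double>(i) * step;
        const std::size_t left = std::min(static_cast<std::size_t>(pos), input.size() - 1);
        const std::size_t right = std::min(left + 1, input.size() - 1);
        const double frac = pos - static_cast<double>(left);
        const double value = input[left] * (1.0 - frac) + input[right] * frac;
        output[i] = static_cast<std::int16_t>(std::lround(value));
    }
    return output;
}
```

Feeding it a 740-sample frame would yield 735 samples covering the same time span, at the cost of a very small pitch shift.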
Answer to myself.
The audio R/W pointers in the Windows build do not overwrite each other. The W pointer of the audio buffer is initialized to the 3/8 position in the audio buffer, which gives a 0.1392s initial latency. The Spkr_SubmitWaveBuffer() code tries hard to keep the distance between the R/W pointers between 1/4 and 1/2 of the buffer, that is, between 0.09s and 0.18s, by adjusting the frequency of the simulated 6502 CPU using g_nCpuCyclesFeedback. This global variable can be set to +-20 * g_fClksPerSpkrSample to make the CPU generate 20 more or fewer samples than usual to correct the error. I modified the Windows build to run in 1/60s frames to observe the behavior. For normal frames, 739 or 740 samples were produced. Remember that we need 735 per frame, so that is 4 or 5 more than needed. The correction occurred around 13 times per second, so 20 * 13 = 260 samples were removed from the usual audio generation each second. Equivalently, 260/60 = 4.333 samples were removed from each frame. The conclusion is that over-generation of samples does happen and the Windows build can correct it.
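The feedback rule described above can be sketched as a simple controller. This is a simplified illustration with hypothetical names, not the Spkr_SubmitWaveBuffer() code: the sign convention (negative means "produce fewer samples") and the all-or-nothing correction are assumptions.

```cpp
#include <cstddef>

// Hypothetical, simplified version of the feedback described in the thread.
// distance:      samples between the read and write pointers
// bufferSize:    total size of the audio buffer, in samples
// clksPerSample: CPU clocks per audio sample (23 in the thread)
// maxSamples:    size of one correction step, in samples (20 in the thread)
// Returns a CPU-cycle adjustment: negative asks the emulator to generate
// fewer samples, positive asks for more, zero means no correction.
inline int cyclesFeedback(std::size_t distance, std::size_t bufferSize,
                          double clksPerSample, int maxSamples = 20)
{
    const int step = static_cast<int>(maxSamples * clksPerSample);
    if (distance > bufferSize / 2)
        return -step;  // writer is too far ahead: slow sample production
    if (distance < bufferSize / 4)
        return step;   // writer is too close to the reader: speed it up
    return 0;          // inside the 1/4 to 1/2 window: leave it alone
}
```

Starting from the 3/8 position the distance sits in the middle of the window, so no correction is applied until over-generation pushes it past 1/2.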
The error actually comes from the setting of g_fClksPerSpkrSample in SetClksPerSpkrSample(). It is 23, an integral number but it really should be 23.14. It is said in the comment of SetClksPerSpkrSample() that this CPU clocks per audio sample value was rounded for better sounds. However, it also introduces the error: 0.14/23.14*44100/60 = 4.446, so about 4.446 extra audio samples are produced per video frame in every AppleWin build. Because this is well within the correction capability, nobody has complained about it so far. Though our SDL build does not use g_nCpuCyclesFeedback, I don't think simply adding it back can fix it, because the correction mechanism of Spkr_SubmitWaveBuffer() depends on lpDSBvoice->GetCurrentPosition() returning the precise number of consumed samples, which SDL_QueueAudio() does not provide.
Modifying SetClksPerSpkrSample() to accurately initialize g_fClksPerSpkrSample without rounding should give some improvement. It could be even better to implement a customized ring buffer by using SDLAudioCallback() as specified in https://davidgow.net/handmadepenguin/ch8.html. This fits the working model of the Windows direct sound code much better and makes the g_nCpuCyclesFeedback meaningful.
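A customized ring buffer of the kind suggested above might look like the following sketch. It is a hypothetical, single-threaded illustration that does not call SDL; in a real frontend the pull() side would run inside the SDL audio callback and would need locking or atomics. The point is that the read/write distance becomes directly observable, which is what the feedback mechanism needs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size ring buffer with explicit read and write positions.
class SampleRing
{
public:
    explicit SampleRing(std::size_t capacity) : myData(capacity, 0) {}

    // Samples written but not yet consumed (the R/W pointer distance).
    std::size_t distance() const { return myFill; }

    // Number of pull() calls that ran out of data.
    std::size_t underruns() const { return myUnderruns; }

    // Producer side: stores as many samples as fit and returns that number.
    std::size_t write(const std::int16_t* samples, std::size_t count)
    {
        std::size_t written = 0;
        while (written < count && myFill < myData.size())
        {
            myData[myWrite] = samples[written++];
            myWrite = (myWrite + 1) % myData.size();
            ++myFill;
        }
        return written;
    }

    // Consumer side, shaped like an audio callback: always fills `count`
    // samples, using silence once the buffer is empty.
    void pull(std::int16_t* out, std::size_t count)
    {
        bool starved = false;
        for (std::size_t i = 0; i < count; ++i)
        {
            if (myFill > 0)
            {
                out[i] = myData[myRead];
                myRead = (myRead + 1) % myData.size();
                --myFill;
            }
            else
            {
                out[i] = 0;
                starved = true;
            }
        }
        if (starved)
            ++myUnderruns;
    }

private:
    std::vector<std::int16_t> myData;
    std::size_t myRead = 0;
    std::size_t myWrite = 0;
    std::size_t myFill = 0;
    std::size_t myUnderruns = 0;
};
```

The value returned by distance() plays the role that GetCurrentPosition() plays for the Windows DirectSound code: it tells the producer exactly how far ahead of the consumer it is.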
@redenvelope2000 this looks like a highly detailed and actionable analysis. I definitely understand why @audetto would want this considered by the core project as the audio latency issue is pretty severe in SDL. Maybe a first effort would be submitting this analysis as inline comments within the core AppleWin repository for review by @tomcw @sicklittlemonkey and others. At the very least it seems like the underlying audio subsystem could use some demystifying / documentation... and maybe a project advocate for resolving this to facilitate better cross-platform support.
I have done some work on the audio generation and results look promising.
https://github.com/audetto/AppleWin/tree/audio_callback
Major changes are:
--sdl-audio-buffer to set the SDL buffer size (in ms, default = 46)
g_nCpuCyclesFeedback when running in --fixed-speed
In the settings tab there is the current value of the AW audio buffer (more reliable than before) and the number of underruns.
@sh95014 I really would like to have your opinion on these changes.
And on the best way to apply g_nCpuCyclesFeedback, which is currently applied once per frame, but I wonder if it should go here instead.
Moreover, the maximum adjustment allowed by default is 200 samples. AppleWin runs in chunks of 44 samples (1 ms), so 200 is a lot of freedom. Here I run 735 samples (60FPS), so the freedom is less, but still almost 30% of the speed. I am not sure how to test all of this.
(@audetto, sorry for the slow response, was on vacation.)
Unfortunately I don't have anything all that intelligent to add, but your audio_callback branch looks about what I'd expect. I don't fully understand g_nCpuCyclesFeedback, but it would seem like you'd want it in common2 rather than in sdl specifically? (Doesn't really matter to macOS in the short term because despite its name I'm still using sdlframe.cpp anyway.)
The error actually comes from the setting of g_fClksPerSpkrSample in SetClksPerSpkrSample(). It is 23, an integral number but it really should be 23.14. It is said in the comment of SetClksPerSpkrSample() that this CPU clocks per audio sample value was
You are right. This is a symptom of the problem and I think the solution is to do like the Windows version. Moreover, AppleWin (Windows) does not run at NTSC or PAL speed, but effectively at 23 * 44100 Hz, because as you say, the feedback is constantly slowing down the emulator. It took me ages to unravel this.
In Linux I was trying to hit the exact speed, but this is bad for audio. I will soon push to https://github.com/audetto/AppleWin/tree/audio_callback a final fix to handle the feedback.
I think that if one targets 23 * 44100 directly, the need for a correction is vastly removed, although it is still useful to compensate smaller / random issues.
Look at https://github.com/audetto/AppleWin/pull/87
You can try to further reduce the SDL audio buffer: --audio-buffer 46 is the default.
Okay, very good. I've recompiled the latest code and I find that the SDL-based sa2 has much lower audio latency when playing Aztec. However, qapple has a noticeable audio latency when I stop the game character from walking. I also notice that qapple spams a lot of debug output to the console:
apple.audio: Restarting the AudioGenerator
apple.audio: AudioOutput: size = 11025 f, period = 882 f
apple.audio: Written some silence: frames = 8820 , duration = 200 ms
apple.audio: Restarting the AudioGenerator
apple.audio: AudioOutput: size = 11025 f, period = 882 f
apple.audio: Written some silence: frames = 8820 , duration = 200 ms
apple.audio: Stopping with silence: frames = 5692 , duration = 129 ms
apple.audio: Stopping with silence: frames = 10093 , duration = 228 ms
Could this be disabled?
I notice that qapple and sa2 have 200ms default audio latency values, which makes playing games not very satisfying. I've been watching this project for a while, and it was only recently that I tried changing the audio latency to 0 in qapple and finally found that to work for me.
On the other hand, it's not obvious from the command line args to sa2 how to change the latency to 0ms.
Could both of these be set to 0 by default, at least on Linux platforms?