FNA-XNA / FAudio

FAudio - Accuracy-focused XAudio reimplementation for open platforms
https://fna-xna.github.io/
Other
546 stars 73 forks source link

[Regression] 20.09 deadlocks Wolfenstein: The Old Blood #215

Closed armageddon51 closed 4 years ago

armageddon51 commented 4 years ago

This new version freeze my wine game (wolfenstein). Observe on 2 pcs (nvidia - amd). Return to version 20.08 solve the issue.

Manjaro XFCE 64 bits.

flibitijibibo commented 4 years ago

Pretty sure these are the only two commits that touched locks:

https://github.com/FNA-XNA/FAudio/commit/0b8f62128c7868c47b60d1a2062d6837b186539a https://github.com/FNA-XNA/FAudio/commit/cea3cddd0fa3bcb0717a89ae936d1e1d61202fb7

CC @aeikum, not sure about the newest game but The New Order is basically just the id5 sound engine, which should be roughly like DOOM 3 BFG's: https://github.com/id-Software/DOOM-3-BFG/tree/master/neo/sound/XAudio2

armageddon51 commented 4 years ago

I forgot to be more precise, it's Wolfenstein the Old Blood (2015). I don't have any other of the series installed.

flibitijibibo commented 4 years ago

@armageddon51 Thanks, should narrow it down a bit. Before Andrew gets a look at this, you will want to run with WINEDEBUG=+xaudio2 so there are logs to work with.

armageddon51 commented 4 years ago

I am really not an expert on wine, so where does the "WINEDEBUG=+xaudio2" goes ? Can it be in the command line ?

"env WINEPREFIX="/home/maurice/.wine" /usr/bin/wine C:\\windows\\command\\start.exe /Unix /run/media/maurice/Tforce-data/Jeux/Wolfenstein\ The\ Old\ Blood/WolfOldBlood_x64.exe"

flibitijibibo commented 4 years ago

Same idea as WINEPREFIX, so set the variable in the same way and it should produce a pretty big log file. You may want to pipe stdout/stderr to a file.

armageddon51 commented 4 years ago

Ok not sure where the log file will be but with that variable set "WINEDEBUG=+xaudio2", that stop the freezing. I have been playing for 10 minutes. I tried without been set and it only took a few minute to freeze. So your are on to something.

I have to brush off my Linux piping !

flibitijibibo commented 4 years ago

That seems to magnify the race condition then. If you have the output file, post it here.

armageddon51 commented 4 years ago

So you mean that I have to give you a log file even if there is no freezing with that variable ON ?

P.S We'll have to delay that testing, it is getting pretty late here. Continuing tomorrow. Thanks for your help !

aeikum commented 4 years ago

I was worried that would happen. FAudio has a ton of different locks so there's probably a lock ordering problem somewhere. I tried to be consistent with the lock/unlock order that I saw elsewhere, but maybe I misunderstood.

armageddon51 commented 4 years ago

Bonjour. can you please enlight me about what you are referring as "locks" ? (my programming skill vanish years ago) With this new update what were they trying to fix ? I can always use the previous version. I tried another game "Rainbow Six Vegas" and it doesn't seems to be affect.

flibitijibibo commented 4 years ago

XAudio2 runs on a thread, and for safety we have to block certain things internally so that two threads don't touch the same thing simultaneously. The problem is that there are dozens of internal pieces that need their own lock, and Old Blood causes a permanent contention between the engine and the audio thread.

We'll probably just need to run it with FAUDIO_LOG_LOCKS enabled to see which 2 locks are being fought over.

flibitijibibo commented 4 years ago

Quick guide for lock logging since it's likely someone else will look at this before me:

cmake .. -DCMAKE_BUILD_TYPE=Debug
make
# Copy your lib to wherever it needs to be
FAUDIO_LOG_LOCKS=1 wine Wolf.exe
# As a possible bonus...
FAUDIO_LOG_LOCKS=1 FAUDIO_LOG_API_CALLS=1 FAUDIO_LOG_FUNC_CALLS=1 wine Wolf.exe
grapeharvest commented 4 years ago

Works if:

  1. Remove lock for FAudio_INTERNAL_ResizeResampleCache (cea3cdd) without touching FAudio_INTERNAL_GenerateOutput (0b8f621)
  2. Return lock for FAudio_INTERNAL_GenerateOutput from (0b8f621) without touching FAudio_INTERNAL_ResizeResampleCache (cea3cdd)
flibitijibibo commented 4 years ago

That seems to make sense - it's likely that a callback is in place while a voice is being created, and the callback depends on the created voice. We may need to replace these commits with a lock on just the decode/resample caches?

grapeharvest commented 4 years ago

You mean revert them (not sure if returning this is necessary) and set a lock just on the ResizeResampleCache ? If so, it avoids stucking, yes.

flibitijibibo commented 4 years ago

@grapeharvest If you're able to reproduce the issue, grab a log with WINEDEBUG=+xaudio2 FAUDIO_LOG_LOCKS=1 wine Wolf.exe so we know which API calls and callbacks are at fault.

aeikum commented 4 years ago

Adding logging consistently "fixes" the issue for me :(

We might have a lock ordering problem between sendLock and sourceLock. The game makes a ton of calls to SetOutputVoices and SetSourceSampleRate, both of which can eventually call ResizeResampleCache. In the case of SetOutputVoices, it first takes the sendLock, and ResizeResampleCache now takes the sourceLock after that. However, during mixing, the sourceLock is held and then the sendLock gets held during FAudio_INTERNAL_MixSource. This could maybe be the source of the deadlock.

I've uploaded a patch that moves the sourceLock before the sendLock in SetOutputVoices: wob_fix.txt It seems to fix Wolfenstein for me. I'd like to try it a few times with Banished, but it might be ready to merge.

Any thoughts?

flibitijibibo commented 4 years ago

Taking the lock out of ResizeResampleCache seems like a really good idea to me - the only part that gets a little scary is that moving the lock out of that function causes the lock time to get pretty big, particularly for SetSourceSampleRate which mostly exists to be a faster way to get a new voice (vs CreateSourceVoice). And if SetOutputVoices is called a lot, that could get scary as well... sourceLock is annoyingly close to a full mixer lock.

What might fix this is moving cache allocation to the mixer - sounds even worse, but: Instead of trying to manage the mixer's memory from outside the mixer, if we could only assign the max decode/resample lengths inside the source/submix structs (which we pretty much already do with voice->src.decodeSamples and voice->src.resampleSamples/voice->mix.outputSamples), then realloc at the top of MixSource and MixSubmix if the buffer isn't large enough, it means the caches are never fought over, removing the need to have locks of any kind for those pointers, in exchange for a couple extra ifs in the mixer. We might need to be careful about changing those variables mid-mix, but that's way easier to keep track of than sourceLock for example.

armageddon51 commented 4 years ago

If I may ask a simple question, since the last version (20.08) was working perfectly , what was this new one trying to fix ?

flibitijibibo commented 4 years ago

Should be fixed by #218.