hbeni / fgcom-mumble

A (flightsim) radio communication simulation based on mumble
GNU General Public License v3.0
17 stars 5 forks source link

Bug: radio reception is sometimes choppy #72

Closed hbeni closed 3 years ago

hbeni commented 3 years ago

Reported by mill-j:

Identified mumble branch state:

commit 9a4b659f4242e892f5686185e78fd18fcc93f84a
Author: Robert Adam <dev@robert-adam.de>
Date:   Fri Dec 25 18:50:22 2020 +0100

    Fix copy&paste error

FGCom-mumble commit is 4b3ed22


I could not reproduce this here on my box (on mumble commit deb74a66348 "FIX: C&P error in API implementation" with 4b3ed22). maybe this is a side effect of client version mismatch?

(This may have been introduced by mumbles commit c7ebe95030 "Synchronized API calls") may be related to #58 as there are restructuring tips

hbeni commented 3 years ago

mill-j reported that he mostly got it working now. It seems to be a combination of network, mumble-settings and also if you compile mumble and/or the plugin in debug mode.

(debug mode for the plugin is always on with <1.0 versions, so must be manually turned off in the makefile)

hbeni commented 3 years ago

One reason for choppy audio may be that the audio processing is too slow.

@mill-j, you might look at the timestamps in the logfile when receiving audio. You can deduce if the time it takes to process takes longer than allowed (PCM frame spacing / 20ms), probably we need to optimize the code there.

mill-j commented 3 years ago

I'll see what I can find. Just noticed if I start two RadioGUI instances and have only one radio in each then audio is great. If i add three it's worse than two.

hbeni commented 3 years ago

That would also suggest something "too slow" in the audio loop (mumble_onAudioSourceFetched). For each sample, the remote state is inspected to see if we can receive the radio signal. That loop gets longer with each client and also with each radio.

However the contrary: I was able to receive radio from about 100 fake clients (lua bots, but from the perspective of your own plugin, that's not known and they behave like normal clients) without choppy audio...

hbeni commented 3 years ago

Another cause could be: mumble_onAudioSourceFetched aquires a lock to radiofgcom_remotecfg_mtx. That is written to by the intern plugin IO, so when receiving much changing information, this will cause the lock to be not available for the audio callback. Writing the remote state is a fast operation, however it may not be fast enough...

hbeni commented 3 years ago

@mill-j to quickly try out if its a locking issue, just remove the lock statements in the audio callback.

If that helps, we probably need a exclusive locking at least in the writer (and gc thread) where the structure or the data changes (like adding/removing clients or radios). Probably also a good optimization would be to just lock the affected client data, not the entire store.

mill-j commented 3 years ago

Removing the lock fgcom_remotecfg_mtx.lock(); and fgcom_remotecfg_mtx.unlock(); from mumble_onAudioSourceFetched does not seem to make any difference in audio.

hbeni commented 3 years ago

Hm then it’s probably something else. Can you attach a full log (please don’t truncate) of the process when you received the choppiness?

hbeni commented 3 years ago

@mill-j Aside from the log: can you please try to do the following:

If we deduce by this which function introduces it, reenable the other functions to see if it can be isolated this way.

Thanks!

mill-j commented 3 years ago

StutterLog.txt Hope this is what you want. This is a log of me testing two RadioGUI/Mumble instances with two radios each and testing all four PTT buttons and they all had choppy audio.

I had already tested without fgcom_audio_makeMono several days ago. I still haven't got around to testing the rest.

mill-j commented 3 years ago

After disabling the functions in audio.cpp it is no better. I do know the stuttering is the result from low resources: With two RadioGui Instances with 3 radios each. The two mumble clients are each maxing out a cpu core at 100%. Disconnecting only RadioGUI and not doing anything wth mumble will get the cpu uasge down to normal. So somehow we need to optimize for lower end computers.

hbeni commented 3 years ago

This is a very good finding. The question is where recently all this cpu is burned. If you disable all the audio processing, it is not there.

hbeni commented 3 years ago

Hm, we already found out sometime so that the audio receiving callback is too slow. https://github.com/hbeni/fgcom-mumble/blob/88e74b8c34648e4b1a2537dbd767f98276b7bc5e/client/mumble-plugin/fgcom-mumble.cpp#L812

So a probable solution may be to decouple that from the data processing and just let it work on results gained in another thread:

that way we should save alot of processing while retaining nearly instant change updates: the update of the signal cache would only occur when needed (radio state change, location change, frequency, ptt etc), not like now where we calculate that for every voice packet! It will also pay into #58

mill-j commented 3 years ago

Sounds like a great idea. We should probably create an AudioOptimization branch. Testing on low end hardware will be no problem for me ;) My lenovo is fast compared to any of my other machines.

hbeni commented 3 years ago

Hm, i currently try to rewrite that. However the quasi-asynchronous type of the checks open a whole can of worms...

Let's see if i can deal with that.

hbeni commented 3 years ago

Maybe as an alternative - i think you already experimented with the update rate of the incoming packets, didn't you? If you make RadioGUI transmit with a lower Hz frequency, does that ease the stuttering?

https://github.com/hbeni/fgcom-mumble/blob/88e74b8c34648e4b1a2537dbd767f98276b7bc5e/client/radioGUI/src/main/java/hbeni/fgcom_mumble/radioGUI.java#L99 Maybe this helps:

Thread.sleep(1000); // try 1 times per second

This will ease the processing of the UDP server, because he tries to consume packets as fast as possible.

That would be a whole lot easier than the rewrite.

hbeni commented 3 years ago

Aside from the idea with throttling the RadioGUI UDP thread, i found the following based on your log.

FGCom [2021-02-13 19:24:31.022]: [DBG] mumble_onAudioSourceFetched():       volume='1.000000'
[...] slow in between, UPD thread also says stuff [...]
FGCom [2021-02-13 19:24:31.042]: [DBG] mumble_onAudioSourceFetched():       local_radio=0  frequency 121.0000 matches!

The slow is between the following lines, maybe also the radio model calculations are slow too. @mill-j We could see if you drop some more messages into these lines, to see if that really is the case? https://github.com/hbeni/fgcom-mumble/blob/88e74b8c34648e4b1a2537dbd767f98276b7bc5e/client/mumble-plugin/fgcom-mumble.cpp#L884-L916

mill-j commented 3 years ago

I think you are correct, it seems the delay is still right after volume='1.000000'

<D>2021-02-14 19:28:43.202 Is speech
FGCom [2021-02-14 19:28:43.202]: [DBG] mumble_onAudioSourceFetched(): plugin active=1; isSpeech=1
FGCom [2021-02-14 19:28:43.202]: [DBG] mumble_onAudioSourceFetched():   plugin active+speech detected from id=76
FGCom [2021-02-14 19:28:43.202]: [DBG] mumble_onAudioSourceFetched():   sender callsign=Test
FGCom [2021-02-14 19:28:43.202]: [DBG] mumble_onAudioSourceFetched():   sender registered rmt-radios: 3
FGCom [2021-02-14 19:28:43.202]: [DBG] mumble_onAudioSourceFetched():   check remote radio #0
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():    frequency='121.0000'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():    ptt='0'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():    txpwr='10.000000'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():     remote PTT OFF
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():   check remote radio #1
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():    frequency='122.54167'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():    ptt='1'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():    txpwr='10.000000'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():     PTT detected
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():     check local radios for frequency match (local iid=0)
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():     checking local radio #0
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():       frequency='121.0000'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():       operable='1'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():       RDF='0'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():       ptt='0'
FGCom [2021-02-14 19:28:43.203]: [DBG] mumble_onAudioSourceFetched():       volume='1.000000'
FGCom [2021-02-14 19:28:43.219]: [DBG] mumble_onAudioSourceFetched():     nomatch
FGCom [2021-02-14 19:28:43.219]: [DBG] mumble_onAudioSourceFetched():     checking local radio #1
FGCom [2021-02-14 19:28:43.219]: [DBG] mumble_onAudioSourceFetched():       frequency='123.0000'
FGCom [2021-02-14 19:28:43.219]: [DBG] mumble_onAudioSourceFetched():       operable='1'
FGCom [2021-02-14 19:28:43.219]: [DBG] mumble_onAudioSourceFetched():       RDF='0'
FGCom [2021-02-14 19:28:43.219]: [DBG] mumble_onAudioSourceFetched():       ptt='0'
FGCom [2021-02-14 19:28:43.219]: [DBG] mumble_onAudioSourceFetched():       volume='1.000000'
FGCom [2021-02-14 19:28:43.236]: [DBG] mumble_onAudioSourceFetched():     nomatch
FGCom [2021-02-14 19:28:43.236]: [DBG] mumble_onAudioSourceFetched():     checking local radio #2
FGCom [2021-02-14 19:28:43.236]: [DBG] mumble_onAudioSourceFetched():       frequency=''
FGCom [2021-02-14 19:28:43.236]: [DBG] mumble_onAudioSourceFetched():       operable='1'
FGCom [2021-02-14 19:28:43.237]: [DBG] mumble_onAudioSourceFetched():       RDF='0'
FGCom [2021-02-14 19:28:43.237]: [DBG] mumble_onAudioSourceFetched():       ptt='0'
FGCom [2021-02-14 19:28:43.237]: [DBG] mumble_onAudioSourceFetched():       volume='1.000000'
FGCom [2021-02-14 19:28:43.246]: [DBG] mumble_onAudioSourceFetched():       radio models not compatible: lcl_type=STRING; rmt_type=VHF
FGCom [2021-02-14 19:28:43.246]: [DBG] mumble_onAudioSourceFetched():   check remote radio #2
FGCom [2021-02-14 19:28:43.246]: [DBG] mumble_onAudioSourceFetched():    frequency='123.0000'
FGCom [2021-02-14 19:28:43.246]: [DBG] mumble_onAudioSourceFetched():    ptt='0'
FGCom [2021-02-14 19:28:43.246]: [DBG] mumble_onAudioSourceFetched():    txpwr='10.000000'
FGCom [2021-02-14 19:28:43.246]: [DBG] mumble_onAudioSourceFetched():     remote PTT OFF
FGCom [2021-02-14 19:28:43.246]: [DBG] mumble_onAudioSourceFetched():   no connection, bestSignalStrength=-1.000000

Slowing the send rate in RadioGUI had no affect on the audio. It does make the ptt less responsive, however it appears to be much easier on the cpu. We might need to tweek to for optimal balance.

Edit: I added several more debug messages:

float signalMatchFilter; 
fgcom_radiowave_freqConvRes rmt_frq_p = FGCom_radiowaveModel::splitFreqString(rmt.radios[ri].frequency); 
pluginDbg("mumble_onAudioSourceFetched():       Ping 1");
std::unique_ptr<FGCom_radiowaveModel> radio_model_lcl(FGCom_radiowaveModel::selectModel(lcl.radios[lri].frequency)); 
pluginDbg("mumble_onAudioSourceFetched():       Ping 2");
std::unique_ptr<FGCom_radiowaveModel> radio_model_rmt(FGCom_radiowaveModel::selectModel(rmt.radios[ri].frequency)); 
pluginDbg("mumble_onAudioSourceFetched():       Ping 3");

The results show a little closer where the time is lost:

FGCom [2021-02-14 22:39:09.543]: [DBG] mumble_onAudioSourceFetched():       volume='1.000000'
FGCom [2021-02-14 22:39:09.546]: [DBG] mumble_onAudioSourceFetched():       Ping 1
FGCom [2021-02-14 22:39:09.549]: [DBG] mumble_onAudioSourceFetched():       Ping 2
FGCom [2021-02-14 22:39:09.552]: [DBG] mumble_onAudioSourceFetched():       Ping 3
FGCom [2021-02-14 22:39:09.558]: [DBG] mumble_onAudioSourceFetched():       local_radio=0  frequency 121.0000 matches!
hbeni commented 3 years ago

We might need to tweek to for optimal balance.

Just added commit to master, which lets user tweak easily (fgfs protocol can be tweaked at the protocol invocation at fgfs launcher): ae64242114 (in the Options dialog)

EDIT: as it turns out this is quite cool. Tuning the radioGUI down to 1Hz makes the debug log way more readable.

hbeni commented 3 years ago

Hey, @mill-j I finally have a small optimization ready, that is not as invasive as my other two attempts: See branch Issue-72_audioOptimization.

The basic idea was, based on your findings, to optimize around the frequency matching. Because of the internal data structures (and the need to support arbitary strings as "frequency") we need to do alot of string mangling. Because most of the frequencies should be real numbers, the frequency matching is now changed:


With old code it takes 4ms here on my machine to calculate the match:

FGCom [2021-02-15 10:33:27.830]: [DBG] mumble_onAudioSourceFetched():       volume='1.000000'
FGCom [2021-02-15 10:33:27.834]: [DBG] mumble_onAudioSourceFetched():       local_radio=0  frequency 121.0000 matches!

With the new code, it takes 2ms, so this might be a small step forward.

FGCom [2021-02-15 10:35:49.722]: [DBG] mumble_onAudioSourceFetched():       volume='1.000000'
FGCom [2021-02-15 10:35:49.724]: [DBG] mumble_onAudioSourceFetched():       local_radio=0  frequency 121.0000 matches!

Can you try this branch, if it helps? What are the times now at your box? If it's still not enough, you may also give profiling a shot using the gprof tool. Could it also maybe be the case that this is libc specific, that the string implementation in the older libc is slower than my newer libc?

hbeni commented 3 years ago

Oh and while playing with gprof, i found out that because we currently don't compile with compiler optimizations, the code is at least 10 times slower that it has to be.

As a result, i changed the makefile so we have -O3 as default now, and also we don't build debug builds anymore automatically.

I now really wonder:

mill-j commented 3 years ago

Built master with make plugin. Results are perfect. Even with 4 radios in each RadioGUI and send rate 10mhz. The same is true about the Issue-72_audioOptimization branch built with make all-debug. Although the debug version seemed to take more cpu, it did not stutter or chop audio. I also tested with flightgear. It now works with my favorite plane, the new Piper Warrior II! Audio sounds amazing.

hbeni commented 3 years ago

Wow, now THAT are good news!

It turned out it was not a problem with the code per se, but a missing compile optimizing flag. What a difference.

Before we close this, what do you think?

I would vote for „abandon the branch“ because i could not measure a significant gain, but the code is more complicated and harder to maintain.

mill-j commented 3 years ago

I'm fine with master the way it is. I see no performance difference between master and branch.