Placeholder-Software / Dissonance

Unity Voice Chat Asset
70 stars 5 forks source link

The playback might fall behind, should an unintended application suspension occur (HoloLens) #90

Closed bockelie93 closed 6 years ago

bockelie93 commented 6 years ago

Context

The HoloLens sometimes loses tracking of the room (i.e. look into a plain wall while moving), which pauses the application until the HoloLens understands the room again. When this occurs, the playback is not played, but the application keeps receiving VoicePackets. The result of this, is that VoicePacket heap in EncodedAudioBuffer builds up and you do get the "Encoded audio heap is getting very large" error print.

When the application resumes, it starts playing from the playback, which can be several seconds old. Also, if it keeps receiving new VoicePackets, then it will never catch up to live.

I suppose it's the same issue as in #87 when they paused the editor. (I have changed it to unreliable).

Expected Behavior

I would expect the playback to detect that it is lagging behind, drop the old VoicePackets (i.e. 1-2 second old) and catch up.

Actual Behavior

When the playback falls behind, it keeps playing from the back of the heap, which can be several seconds old.

Your Environment

martindevans commented 6 years ago

I didn't know that the hololens completely pauses the application when tracking is lost - that's rather unfortunate (with respect to voice)!

Dissonance actually has two systems for catching up when it's lagging behind. If you're a small amount behind (or ahead) it will speed up (and slow down) audio playback by a few percent to bring them back into sync. Once it detects a very large desync (more than 1 second) it will throw away as much audio as it thinks it needs to catch up immediately.

You can observe this in action by finding the playback object for the speaker (a child of the DissonanceComms gameobject) and inspecting the SamplePlaybackComponent. It will show:

Buffered Packets: (number of packets that have arrived but not yet been played)
Playback Position: (how many seconds of audio have been played)
Ideal Position: (where playback should be, i.e. wall clock time that has elapsed since playback began)
Desync: (different between actual and ideal playback)
Compensated Playback Speed: (speed of playback, should be above one when audio is lagging behind)

I would guess you're probably falling between the two systems. If you're desynchronised by almost a second it will take about 20 seconds to bring you back into sync (by increasing pitch). If you're desynchronised by a second it should bring you back into sync immediately (by discarding some audio).

Is this what you observe? If not could you post some summaries of the SamplePlaybackComponent inspector when it's desynced but not compensating as expected?

bockelie93 commented 6 years ago

I replicated it in the editor. It's exactly like what I experience on the HoloLens when it loses tracking.

  1. Create a session between two units (one being the editor), test that they can hear each other.
  2. Pause the editor.
  3. Talk into the non-editor Microphone for a while (20-30 seconds).
  4. Unpause the editor.
  5. Listen to editor and hear everything you said during the paused time.
  6. Try talking. Notice how it isn't played right away, but instead after a delay (equal to the paused time).

Here's the sample playback component inspector: https://imgur.com/a/B7HAY First pictures are from right after I click unpause. The later ones are from after the paused audio has finished playing, but has yet to catch up to live.

martindevans commented 6 years ago

Wow, 226 buffered packets - that's definitely not right! Unfortunately I can't reproduce this myself (testing with the PUN integration demo scene).

I do have a suspicion what the problem might be. The sync system fundamentally works by recording two things: (1) wall clock time elapsed since playback began, (2) number of samples played since playback began. As these two values drift the sync system speeds up and slows down playback to ensure they stay close together.

If you pause while playback is playing this should work - the wall clock time continues to tick up but samples played count does not. Once the game is unpaused there will be a large desync detected and it will probably print a warning like this:

Playback desync (NNNNms) beyond recoverable threshold; resetting stream to current time

Have you seen warnings like that?

However, if you pause while no playback is happening and then start receiving a new speech stream while still paused this won't work because the wall clock timer wasn't started yet when you paused and so the large desync is never detected.

Does this sound like it matches up with your experiences? i.e. some sessions self correct after a tracking pause, others will remain desynced?

bockelie93 commented 6 years ago

It acts the same if the playback is playing or not when pausing.

Playback position (which I assume is samples played count?) and ideal position kept the same value (both were counting up) if the playback was playing when I paused. The 'Playback desync (NNNNms) beyond recoverable threshold; resetting stream to current time' warning did not show.

One thing I noticed about buffered packets is that they seem to come in batches?

  1. Count to 20
  2. Pause the application between 5 and 15
  3. Resume the application again
  4. Once you reach 20, wait for 1-2 seconds
  5. Continue counting to 30.

Until you reach 5, the buffered packets will stay around 1-2. When you pause at 5 all the way to 15, the buffered packets will stay at 0. When you unpause at 15, the buffered packets will rise to 80-120 (depending on how fast you count). When the playback plays the number 20, the buffered packets will eventually be 0 (it has counted down from 80-120). After 1-2 seconds, when you continue counting to 30, the buffered packets rise to 30-50.

I suppose the problem could be that they are buffered up somewhere else than in the playback? Atleast considering that I suddenly get 30-50 packets when I continue counting in step 5?

bockelie93 commented 6 years ago

I should add that I use UNET HLAPI integration of Dissonance, and that I am recording using a HoloLens.

martindevans commented 6 years ago

Ok I can reproduce this with HLAPI and pausing the editor, I guess PUN may have been silently handling it (e.g. discarding unreliable packets while paused).

I did observe the Playback position (count of samples played) and ideal position (time elapsed since playback began) ticking up even while paused. This must mean that the audio system isn't completely frozen even during the pause - there's no other way the samples played count could keep ticking up.

I think the problem may be in the HLAPI itself. While the editor is paused it seems that the HLAPI receives the packets and just buffers them up, once it's unpaused it dumps all the packets on the playback system all at once. This of course causes a huge backlog of audio which Dissonance then plays. There are also secondary effects because while the editor is paused Dissonance detects 100% packet loss, it increases the size of it's jitter buffer to tolerate what it thinks are dreadful network conditions and this further increases the delay.

I've been experimenting with a potential workaround for part of the problem - just before playback begins it compares the intended size of the jitter buffer with the actual amount of audio in the buffer and if there's far too much audio it discards some. I'll continue testing this.

martindevans commented 6 years ago

Dissonance 6.2.0 has just been submitted to the asset store, this includes the jitter buffer self check fix. This should be available in a few days once the asset store team reviews it :)

martindevans commented 6 years ago

Dissonance 6.2.0 is now available on the asset store so I'll close this issue now. Don't hesitate to re-open it if there's still a problem :)

martindevans commented 6 years ago

I was just discussing an unrelated Hololens issue with another developer. Apparently you can prevent the application from being briefly suspended when tracking loss occurs by checking "Run in background" which should be checked anyway for Dissonance.