Open jackjansen opened 8 months ago
If it turns out that #144 has been fixed we should also check these points again. Crossing fingers that they'll also be fixed.
Confirmed fixed.
Re-opened this. The second and third bullet points were indeed fixed, but the first one is not fixed yet.
We tested two things this week and last week:

- `fiddlehead` and his laptop. Two scenarios were tested, Direct TCP and Socket I/O, with `simple_avatar` and without HMD. A simple headphone was connected to `fiddlehead`, and the built-in microphone/speaker was used on the laptop. For both `Socket_IO` and `Direct_TCP`, the difference between actual and reported latency was approx. `200 ms`: the reported latency for `Socket_IO` and `Direct_TCP` was approx. `150ms` and the actual latency was approx. `370ms`. Please note that the latency for `Direct_TCP` was `15 ms` less than for `Socket_IO`.
- Jack and Ashu tested on Mac to verify whether this additional latency could be a Windows issue. The Direct TCP scenario was tested with `simple_avatar`, without HMDs, with a simple headphone, and with the built-in microphone/speaker on the laptop. The reported latency for `Direct_TCP` was approx. `150ms` and the actual latency was approx. `360ms`.

For reference: @jvdrhoof is also sometimes seeing unexpected latencies with a completely different setup: point clouds over socketio or webrtc. Sometimes (but not always) he is getting latencies in the `500ms` range, with all machines (both clients and the orchestrator) on local networks.
This may be a red herring, but it may mean that we are looking at a problem that has nothing to do with audio but is actually in a completely different place.
(And, completely beside the point: @ashutosh3308 please reserve backticks `print("this is code")` for code and verbatim things like numbers `3.0` or `2 ms` and host names `flauwte.local` and such. Use _italics_ with underscores for emphasis, or if you want something to stand out. And use **Strong emphasis** with double stars only very sparingly.)
On the branch, repeated the `flauwte` to `vrtiny` measurement, TCP transport.
Measured latency (clap plus echo) was `370ms`. Except once, where it was about `600ms`.
Reported latency on `vrtiny` was `120-180ms`.
Vrtiny time was `-17ms`, uncertainty `15ms`.
Flauwte time was `-41ms`, uncertainty `47ms`.
Repeated the exact same experiment, with very similar outcomes.
Did another one today. Same setup. Delay was variable, and noticed that as measured latency went up reported latency also went up, and vice versa.
It almost seems as if the measured latency is approximately twice the reported latency. But this seems ridiculous....
Another observation: on the receiver side, if we compare `latency_ms` to voice `receiver_queue_ms` the difference is almost always `60ms`.
On the sender side, in `AsyncVoiceReader`, we have `record_latency_ms=28ms` and `output_queue_ms=30ms`.
Assuming a millisecond or so for transmission, all the reported numbers seem to match: `record_latency+output_queue+transmission+receiver_queue == total_latency`
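To make that sum concrete, here is a minimal Python sketch of the latency budget. It is not code from the repository: the function names are made up for illustration, the 1 ms transmission time is the "millisecond or so" assumed above, and the 90 ms receiver queue is a hypothetical value chosen only to have something to add up.

```python
# Sender-side figures as quoted above for AsyncVoiceReader.
RECORD_LATENCY_MS = 28
OUTPUT_QUEUE_MS = 30
TRANSMISSION_MS = 1  # assumption: "a millisecond or so"


def sender_side_ms():
    """Everything that is added before the packet reaches the receiver queue."""
    return RECORD_LATENCY_MS + OUTPUT_QUEUE_MS + TRANSMISSION_MS


def total_latency_ms(receiver_queue_ms):
    """Reported end-to-end latency as the sum of all stages."""
    return sender_side_ms() + receiver_queue_ms


# Hypothetical receiver queue depth, for illustration only.
receiver_queue_ms = 90
total = total_latency_ms(receiver_queue_ms)
print(total - receiver_queue_ms)  # 59, close to the 60ms difference seen on the receiver
```

The point is only that the fixed part of the budget (59 ms) lines up with the roughly constant 60 ms gap between `latency_ms` and `receiver_queue_ms`.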
Disabled the spatializer. Makes no difference: still about 200ms extra latency perceived versus measured.
But now it seems that the "DSP Buffer Size" we set in Project Settings -> Audio sticks. And it was set to "Best Performance" previously.
I have set it to "Best Latency". Difference between perceived and measured latency goes down to `140ms`.
`dspBufferSize` is now `256`, and it was `960` previously. So about 700 samples difference, so at 48000 sample frequency that is about `15ms`.
So this does not explain the 60ms decrease in perceived latency. And it doesn't explain where the other 140ms go.
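The buffer-size arithmetic above, written out as a small Python sketch. The helper name is made up, and it assumes a single DSP buffer at 48000 Hz; if the audio system keeps several such buffers in flight the saving would scale accordingly.

```python
SAMPLE_RATE_HZ = 48000  # sample frequency quoted above


def buffer_ms(samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of one buffer of `samples` frames, in milliseconds."""
    return 1000.0 * samples / sample_rate_hz


old_ms = buffer_ms(960)   # previous dspBufferSize
new_ms = buffer_ms(256)   # dspBufferSize after selecting "Best Latency"
saved_ms = old_ms - new_ms

print(f"old {old_ms:.2f} ms, new {new_ms:.2f} ms, saved {saved_ms:.2f} ms")
```

So one buffer accounts for a bit under 15 ms, well short of the 60 ms that actually disappeared.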
Need to test Mac-to-Mac, to ensure we're not also looking at some Windows problem.
Experimented with `audioFps=100`, in other words: a single packet is now `10ms` instead of `20ms`.
Perceived latency goes down to `180ms`. Measured latency goes down to `60ms`. So measured is a bit closer to perceived, but not all that much (still `120ms` difference).
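For reference, the relation between `audioFps` and packet size as a Python sketch. The function names are hypothetical, the previous setting of 50 is inferred from the `20ms` packet size, and the 48000 Hz sample rate is an assumption carried over from the DSP buffer calculation.

```python
def packet_ms(audio_fps):
    """Duration of one audio packet in milliseconds."""
    return 1000.0 / audio_fps


def samples_per_packet(audio_fps, sample_rate_hz=48000):
    """Samples in one packet; the 48000 Hz default is an assumption."""
    return sample_rate_hz // audio_fps


for fps in (100, 50, 10):
    print(f"audioFps={fps}: {packet_ms(fps):.0f} ms, {samples_per_packet(fps)} samples")
```

Every stage that buffers a whole number of packets scales with this duration, which is why changing `audioFps` should move the reported latency too.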
Go the other way, `audioFps=10`. This should greatly increase the latency, but that latency difference should show up in the reported latency.
`Record latency=60`, `output queue=100-160`, `receiver queue=200`, `latency=300-350`. So those numbers seem to match, somewhat.
Perceived latency is around `500`. So we are still looking at a discrepancy of around 150ms.
Repeated at home, Mac->Windows (`sap` to `beelzebub`).
Measured latency `800ms`, reported latency around `500ms`. So a difference of `300ms`, much higher than yesterday.
I did notice that both machines had `dspBufferSize=1024`.
Set both machines to Project Settings -> Audio -> DSP Buffer Size -> Best Latency. Makes very little difference.
Set both machines to mono, not spatialised, no virtual effects, no audio suspension. Makes very little difference.
Next experiment: Mac to Mac (using `beignet` as the receiver).
Reported latency almost rock-solid at `510ms`, almost exactly the sum of the components.
Measured latency `660ms`.
Unfortunately all these experiments show is that various theories we had are all wrong:
Tried with a built player (Mac->Mac). Again doesn't make a lot of difference: reported `350ms`, measured `430ms`.
So there is a lower discrepancy between reported and measured, `80ms`, but it's not a difference to write home about.
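To keep the runs side by side, a short Python sketch that tabulates the measured-minus-reported gap. The figures are the approximate ones quoted in this thread; the labels and the helper name are only for illustration.

```python
# (setup, reported_ms, measured_ms), approximate figures quoted in this thread
RUNS = [
    ("fiddlehead and laptop", 150, 370),
    ("sap to beelzebub (Mac->Windows)", 500, 800),
    ("Mac to Mac, beignet receiving", 510, 660),
    ("built player, Mac->Mac", 350, 430),
]


def discrepancies(runs):
    """Return (setup, measured - reported) for every run."""
    return [(name, measured - reported) for name, reported, measured in runs]


for name, extra_ms in discrepancies(RUNS):
    print(f"{name}: {extra_ms} ms unaccounted for")
```

The gap moves around between 80 and 300 ms, but it never disappears.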
Tried with a built player on Windows too (`beelzebub`).
Reported latency `135ms`, measured latency `400ms`. Discrepancy `165ms`.
Tried a different way of measuring: don't do an audio-only capture but record a video. On the voice receiver machine select the `VoicePipelineOther` so we can see its VU-meter in the inspector.
Load the video into Logic, so we can examine the waveform and the video with the VU meter at the same time.
The VU meter is pretty much frame accurate...
So something in Unity knows when it is playing out the audio......
Added a "DIY VU meter", by computing the RMS in the buffer. Recorded it with a slo-mo camera. Sometimes it gives similar readings to the Unity VU meter, sometimes not?
The DIY VU meter improved (sort-of) by using `GetSpectrumData`. Now it gives the same readings as the Unity VU meter.
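For anyone wanting to reproduce the first variant, this is roughly what "computing the RMS in the buffer" amounts to, as a Python sketch. It is not the Unity code from the branch; the function names and the -80 dB floor are assumptions, and samples are taken to be floats in the range -1.0 to 1.0.

```python
import math


def rms(samples):
    """Root-mean-square level of one audio buffer (floats in -1.0..1.0)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_dbfs(samples, floor_db=-80.0):
    """RMS level in dB relative to full scale, clamped at `floor_db`."""
    level = rms(samples)
    if level <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(level))


# One full period of a full-scale sine wave as a test buffer.
sine = [math.sin(2 * math.pi * i / 48) for i in range(48)]
print(f"{rms(sine):.4f} {rms_dbfs(sine):.2f} dBFS")
```

A meter like this only reflects the buffer it is handed, so its timing depends on when that buffer is processed rather than on when the sound leaves the speaker.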
But accidentally found something very interesting, after a session was running for a long time: even with both input queue sizes set to 2 packets the reported (and perceived) latency go up and up and up......
Seems like packets are being held up somewhere in the network or in `AsyncTCPReceiver`?
That was a red herring. At least, it wasn't related to what we're looking for. The queue lengths were long, and the first queue was non-dropping. That may be something we want to address, but it's not related to the extra 100-200ms latency we perceive.
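To illustrate why a non-dropping queue lets latency creep up, here is a toy Python simulation. It is not how `AsyncTCPReceiver` or the queues in the pipeline are implemented; the producer/consumer ratio is made up, and `collections.deque` with `maxlen` merely stands in for a queue that drops its oldest packet when full.

```python
from collections import deque


def simulate(queue, produced=10, consume_every=2):
    """Push `produced` numbered packets, popping one every `consume_every` pushes.

    Returns the sequence numbers still waiting in the queue afterwards.
    """
    for seq in range(produced):
        queue.append(seq)  # a full deque(maxlen=N) silently drops its oldest entry
        if (seq + 1) % consume_every == 0 and queue:
            queue.popleft()
    return list(queue)


non_dropping = simulate(deque())       # unbounded: the backlog keeps growing
dropping = simulate(deque(maxlen=2))   # bounded: never more than 2 packets waiting

print(len(non_dropping), non_dropping)
print(len(dropping), dropping)
```

With a slow consumer the unbounded queue ends up holding five packets, the oldest of which is five packets stale, while the dropping queue only ever holds the most recent ones.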
Decided to go back to Unity 2021, just to see whether maybe this problem has been present for a long time but we just didn't notice it.
First try was branch `ancient/deplyment/pilot0`, but this doesn't work anymore: it requires the scenario storage at the orchestrator, which isn't there anymore.
Tried with newly created branch `ancient/deployment/unity2021`, which is the situation of mid-February, just before the switch to the new orchestrator and Unity 2022.
This branch has exactly the same problem as the current situation: the perceived audio latency is about `200ms` larger than the reported audio latency.
I'm now starting to think that we never actually measured audio latency, and always just looked at the reported number.
@ireneviola also isn't sure we ever really measured this.
Ok. We know where the problem is (on the Unity playout side), and we know how to work around it (by adjusting the audio playout clock to be slightly ahead of where we actually want it).
But I'm not going to fix it now. All the groundwork is in place, and the structure of the voice pipeline and `P_Player` and `P_Player_self` are ready for it, but I'm going to merge now and fix later.
Fast-forwarded branch to current state of master (v1.0.2).
Merging this branch into master, so @ashutosh3308 can use it for his VQEG experiments. Not closing the issue yet because it isn't resolved fully.
We're seeing all sorts of different issues with voice audio:
latency_ms
in the stats: output...We need to investigate what is going on.
And let's use this issue to document whatever we find.