home-assistant-libs / voip-utils

Apache License 2.0

Support for non-Opus codecs #3

Open mechanarchy opened 1 year ago

mechanarchy commented 1 year ago

Hey synesthesiam & Hass team,

I am keen to try the voip integration in 2023.5 but my desk phone does not support opus, only older codecs:

Is there any chance you could add support for one of these in the future?

Perhaps G.711, as it seems like the lowest common denominator for codecs. It is also part of the required codec set for WebRTC compatibility, so I would hope there are existing libraries that can be leveraged.
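For a sense of how small G.711 is, here is a minimal pure-Python sketch of decoding both variants (PCMU is mu-law, PCMA is A-law) to 16-bit linear PCM. The function names are made up for illustration and are not part of voip-utils. A real implementation would more likely use a lookup table or an existing library.

```python
import struct


def ulaw_to_linear(byte: int) -> int:
    """Decode one G.711 mu-law byte (PCMU) to a signed 16-bit sample."""
    byte = ~byte & 0xFF                 # mu-law bytes are stored inverted
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def alaw_to_linear(byte: int) -> int:
    """Decode one G.711 A-law byte (PCMA) to a signed 16-bit sample."""
    byte ^= 0x55                        # A-law toggles every other bit
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    if exponent == 0:
        sample = (mantissa << 4) + 8
    else:
        sample = ((mantissa << 4) + 0x108) << (exponent - 1)
    return sample if sign else -sample  # in A-law a set sign bit means positive


def decode_pcmu(payload: bytes) -> bytes:
    """Turn an RTP PCMU payload into little-endian 16-bit PCM bytes."""
    samples = [ulaw_to_linear(b) for b in payload]
    return struct.pack(f"<{len(samples)}h", *samples)
```

Each payload byte becomes one 16-bit sample, so a 20 ms PCMU packet (160 bytes at 8 kHz) decodes to 320 bytes of PCM.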

synesthesiam commented 1 year ago

One worry with other codecs is that they use an 8 kHz sample rate. Both the HA cloud and local speech-to-text options were trained on 16 kHz audio, so upsampling may cause significant loss in recognition performance.

Still, it would be worth testing both PCMA and PCMU to see if they are usable.
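A quick way to run that test would be to upsample decoded 8 kHz audio to 16 kHz before handing it to speech to text. The sketch below uses plain linear interpolation. The function name is hypothetical, and a proper resampler with a low-pass filter would probably do better. Either way, upsampling only matches the expected sample rate. It cannot restore content above 4 kHz that the 8 kHz codec never captured.

```python
def upsample_2x(samples: list[int]) -> list[int]:
    """Double the sample rate by inserting the midpoint between neighbours.

    The last input sample has no successor, so it is simply repeated.
    """
    out: list[int] = []
    for i, current in enumerate(samples):
        following = samples[i + 1] if i + 1 < len(samples) else current
        out.append(current)
        out.append((current + following) // 2)
    return out
```

For example, `upsample_2x([0, 100, 200])` yields `[0, 50, 100, 150, 200, 200]`: six samples out for three in.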

paravoid commented 1 year ago

I've been struggling to make the voice functionality work with Asterisk (with both the open source and the Digium binary-only codec, both a pain for different reasons), with various warnings about lost frames on the Asterisk side from the Opus codec module. While it may be something entirely in my configuration (I'm still very early in debugging it), it would help with troubleshooting to be able to try another codec, ideally one that's more battle-tested with Asterisk.

Isn't G.722 wideband/16kHz by design? And given it's uncompressed, wouldn't that be better than Opus, in terms of recognition? I think G.729.1 (note the .1) is also supposed to be wideband, although I have no experience with that.

synesthesiam commented 1 year ago

The Grandstream HT80x doesn't seem to support wideband for G.722:

a=rtpmap:123 opus/48000/2
a=fmtp:123 maxplaybackrate=16000
a=ptime:20
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:2 G726-32/8000
a=rtpmap:97 iLBC/8000
a=fmtp:97 mode=20
a=rtpmap:9 G722/8000

I would have much preferred the uncompressed audio over dealing with OPUS!
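To compare what different devices offer, the `a=rtpmap` lines from an SDP body like the one above can be pulled out with a few lines of Python. This is only a sketch: `RtpMap` and `parse_rtpmaps` are illustrative names, not voip-utils API, and `a=fmtp` / `a=ptime` lines are ignored.

```python
import re
from typing import NamedTuple


class RtpMap(NamedTuple):
    payload_type: int
    encoding: str
    clock_rate: int
    channels: int


# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
_RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\d+))?$")


def parse_rtpmaps(sdp: str) -> list[RtpMap]:
    """Return every rtpmap entry found in an SDP body, in order."""
    entries = []
    for line in sdp.splitlines():
        match = _RTPMAP.match(line.strip())
        if match:
            pt, encoding, rate, channels = match.groups()
            # Channel count is optional and defaults to 1 when absent
            entries.append(RtpMap(int(pt), encoding, int(rate), int(channels or 1)))
    return entries


sdp = """a=rtpmap:123 opus/48000/2
a=fmtp:123 maxplaybackrate=16000
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:9 G722/8000"""

offered = {entry.encoding: entry for entry in parse_rtpmaps(sdp)}
```

With the sample above, `offered` has keys `opus`, `PCMA`, `PCMU` and `G722`, and `offered["PCMA"].payload_type` is `8`.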

paravoid commented 1 year ago

The Grandstream HT80x doesn't seem to support wideband for G.722: [...]

a=rtpmap:9 G722/8000

I don't think G.722 8kHz is a thing. I believe what you're seeing is this, from the Wikipedia article on G.722:

G.722 VoIP is typically carried in RTP payload type 9.[6] Note that IANA records the clock rate for type 9 G.722 as 8 kHz (instead of 16 kHz), RFC 3551[7] clarifies that this is due to a historical error and is retained in order to maintain backward compatibility. Consequently, correct implementations represent the value 8,000 where required but encode and decode audio at 16 kHz.
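In code, that quirk means the rate advertised in `rtpmap` drives the RTP timestamps, while the decoder works at the real audio rate. A minimal sketch, with hypothetical function names and assuming a 20 ms packet time:

```python
def audio_sample_rate(encoding: str, rtp_clock_rate: int) -> int:
    """Return the rate the audio is actually sampled at.

    G.722 is advertised with an 8000 Hz RTP clock for historical reasons
    (RFC 3551), but the audio itself is sampled at 16000 Hz.
    """
    if encoding.upper() == "G722" and rtp_clock_rate == 8000:
        return 16000
    return rtp_clock_rate


def frame_sizes(encoding: str, rtp_clock_rate: int, ptime_ms: int = 20) -> tuple[int, int]:
    """Return (RTP timestamp step, decoded samples) for one packet."""
    timestamp_step = rtp_clock_rate * ptime_ms // 1000
    decoded_samples = audio_sample_rate(encoding, rtp_clock_rate) * ptime_ms // 1000
    return timestamp_step, decoded_samples
```

So `frame_sizes("G722", 8000)` gives `(160, 320)`: the timestamp advances by 160 per packet, but each packet decodes to 320 samples of 16 kHz audio. For `PCMA` at 8000 both numbers are 160.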

paravoid commented 9 months ago

I guess you've probably moved on from VoIP & Grandstreams in favor of M5Stack/ESPHome/etc., but I was curious if you think this is something that is on the horizon?

The changes don't look super complicated, but given I don't have a baseline with a setup that works, I fear it's hard for me to submit a PR and test this. Is there any other way I can help? Thanks again!

bdsoha commented 9 months ago

I have been juggling a few workaround ideas to the codec issue for some time.

Are there any plans to implement a robust solution without the above?

bdsoha commented 1 month ago

@mechanarchy @paravoid CC: @synesthesiam Any updates on this issue?