iLBC 30 to AMR(-WB) sounds robotic

When transcoding from iLBC 30 to AMR, the voice sounds robotic on the AMR side. After a few seconds, it degrades so much that it is difficult to understand anything. The opposite direction, AMR to iLBC 30, is fine. iLBC 30 is the default mode in Asterisk, for example when you specify allow=ilbc in sip.conf. In fact, only iLBC 30 works, because Asterisk does not support iLBC 20 yet.

This happens because Asterisk calls our transcoding module with 240 samples (slin8; or 480 samples with slin16/AMR-WB) instead of the expected 160 for AMR (or 320 for AMR-WB). As a result, the current while-loop creates one correct frame with 160 samples and one invalid frame with 320 samples (in the case of AMR). In the case of AMR-WB, it creates one correct frame with 320 samples and one invalid frame with 640 samples. The solution would be to return three frames, each with the correct frame size.
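The splitting described above can be sketched in plain C. This is only an illustration, not the module's actual code: `struct sample_buffer`, `buffer_append`, `buffer_drain`, and the buffer capacity are hypothetical names and values, and no Asterisk API is used. The idea is to buffer whatever arrives and emit one frame per 160 samples (AMR) or 320 samples (AMR-WB), keeping the remainder for the next call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AMR_NB_SAMPLES 160  /* 20 ms at 8 kHz */
#define AMR_WB_SAMPLES 320  /* 20 ms at 16 kHz */
#define BUFFER_SAMPLES 8000 /* hypothetical capacity, chosen for this sketch */

/* Hypothetical stand-in for the translator's private state. */
struct sample_buffer {
    int16_t samples[BUFFER_SAMPLES];
    size_t count; /* number of samples currently buffered */
};

/* Called once per complete frame; may be NULL if only the count matters. */
typedef void (*emit_fn)(const int16_t *frame, size_t frame_size, void *user);

/* Append incoming signed-linear samples. Returns 0, or -1 if they do not fit. */
static int buffer_append(struct sample_buffer *buf, const int16_t *in, size_t n)
{
    assert(buf != NULL && in != NULL);
    if (n > BUFFER_SAMPLES - buf->count) {
        return -1;
    }
    memcpy(buf->samples + buf->count, in, n * sizeof(*in));
    buf->count += n;
    return 0;
}

/* Emit every complete frame of frame_size samples and keep the remainder
 * at the start of the buffer. Returns the number of frames emitted. */
static size_t buffer_drain(struct sample_buffer *buf, size_t frame_size,
                           emit_fn emit, void *user)
{
    size_t offset = 0;
    size_t frames = 0;

    assert(buf != NULL && frame_size > 0);
    while (buf->count - offset >= frame_size) {
        if (emit != NULL) {
            emit(buf->samples + offset, frame_size, user);
        }
        offset += frame_size;
        frames++;
    }
    /* Move the incomplete tail to the front for the next call. */
    memmove(buf->samples, buf->samples + offset,
            (buf->count - offset) * sizeof(buf->samples[0]));
    buf->count -= offset;
    return frames;
}
```

With two consecutive 240-sample inputs and `AMR_NB_SAMPLES` as the frame size, the first drain yields one frame and leaves 80 samples buffered, and the second drain yields two frames, for three correctly sized frames in total.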

Asterisk delivers 240 samples because iLBC 30 has a default packetization time (ptime) of 30 ms. The issue became audible with AMR because RFC 4867 requires all headers at the start of the payload, even when several frames are sent within one payload. With other codecs like Speex, header and frame pairs are simply appended one after another. Therefore, with AMR, at least ⅓ of the samples are lost, which results in the robotic voice. When the receiving AMR implementation checks the data lengths and discards wrong payloads as a whole, even ⅔ of the samples are lost.
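To illustrate the "all headers at the start" layout, here is a minimal sketch that packs several frames into one payload. It assumes the octet-aligned mode of RFC 4867 without interleaving or CRC, and that all frames share one frame type; the bandwidth-efficient mode packs the same fields without byte padding. The function name `amr_pack_octet_aligned` is hypothetical and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CMR_NO_REQUEST 15 /* CMR value for "no mode request" */

/* Pack n speech frames of the same frame type into one octet-aligned
 * payload: one CMR octet, then n table-of-contents (ToC) octets, then the
 * n frames. Returns the payload length, or 0 if out is too small. */
static size_t amr_pack_octet_aligned(uint8_t *out, size_t out_size,
                                     const uint8_t *frames, size_t n,
                                     size_t frame_bytes, unsigned frame_type)
{
    size_t i;
    size_t pos = 0;

    assert(out != NULL && frames != NULL && n > 0 && frame_type <= 15);
    if (out_size < 1 + n + n * frame_bytes) {
        return 0;
    }

    /* Payload header: 4-bit CMR followed by 4 reserved zero bits. */
    out[pos++] = (uint8_t)(CMR_NO_REQUEST << 4);

    /* ToC entries come first, all of them, before any speech data.
     * Layout per entry: F (1 bit) | FT (4 bits) | Q (1 bit) | 2 padding bits.
     * F = 1 means another ToC entry follows; Q = 1 marks a good frame. */
    for (i = 0; i < n; i++) {
        uint8_t follow = (i + 1 < n) ? 0x80 : 0x00;
        out[pos++] = (uint8_t)(follow | (frame_type << 3) | 0x04);
    }

    /* Speech frames follow in the same order as their ToC entries. */
    memcpy(out + pos, frames, n * frame_bytes);
    return pos + n * frame_bytes;
}
```

For three AMR 12.2 kbit/s frames (frame type 7, 31 octets each in octet-aligned mode), the payload starts with the octets F0 BC BC 3C, followed by 93 octets of speech data. A receiver that finds only one ToC entry in front of oversized data cannot recover the extra samples, which matches the loss described above.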

Therefore, this issue is not limited to iLBC 30 but applies to all sources whose ptime differs from 20 ms.
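The arithmetic behind this can be checked with a small helper. This is a sketch with hypothetical function names; it assumes sample rates that are multiples of 1000 Hz, which holds for slin8 and slin16.

```c
#include <assert.h>

/* Number of samples in one packet for a sample rate (Hz) and ptime (ms). */
static unsigned samples_per_packet(unsigned sample_rate_hz, unsigned ptime_ms)
{
    return sample_rate_hz / 1000 * ptime_ms;
}

/* Samples left over after cutting one packet into 20 ms AMR frames. */
static unsigned leftover_samples(unsigned sample_rate_hz, unsigned ptime_ms)
{
    unsigned frame = samples_per_packet(sample_rate_hz, 20);

    assert(frame > 0);
    return samples_per_packet(sample_rate_hz, ptime_ms) % frame;
}
```

For a ptime of 30 ms, this gives 240 samples with 80 left over at 8 kHz, and 480 samples with 160 left over at 16 kHz. A ptime of 40 ms leaves no remainder but still holds two frames per call, so the module must return more than one frame in that case as well.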

Thanks to ~~ASTERISK-25353~~, Asterisk supports returning several frames from frameout. This was the first step toward fixing this issue, and it was completed with the release of version 13.6.0. To fix the issue completely, the while-loop in our AMR module must be enhanced in the same way the built-in GSM module was patched.

traud / asterisk-amr
