TTS: Excessive silence at the end of audio generated using gu-IN-DhwaniNeural voice

Describe the bug

Audio generated for the gu-IN locale with the gu-IN-DhwaniNeural voice contains about 3 seconds of silence at the end of the audio file. The same generation with the gu-IN-NiranjanNeural voice produces a normal file without the long silence (see the attached samples and screenshot).

Here is the length difference between the gu-IN-NiranjanNeural voice (shorter) and the gu-IN-DhwaniNeural voice (longer) for the same text:

Audio files generated: gu-audios.zip

To Reproduce

Use the following SSML for audio generation:

<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts"
 version="1.0" xml:lang="gu-IN">
  <voice name="gu-IN-DhwaniNeural">
    <mstts:silence type="Leading-exact" value="0ms"/>ઉનાળો મારી પ્રિય મોસમ છે.<mstts:silence type="Tailing-exact" value="0ms"/>
  </voice>
</speak>

Expected behavior

The gu-IN-DhwaniNeural voice should generate audio without a long (~3 sec) silence at the end when the SSML contains <mstts:silence type="Tailing-exact" value="0ms"/>.
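
To put a number on the trailing silence instead of reading it off a waveform, a minimal sketch like the one below may help. It is not part of the Speech SDK: the class name TrailingSilence, the method trailingSilenceMillis, and the amplitude threshold are hypothetical and chosen only for illustration. It assumes the audio is already available as raw 16-bit little-endian mono PCM samples (any container header stripped by the caller) and that the sample rate is known.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not an SDK class: measures how much silence
// sits at the end of a raw 16-bit little-endian mono PCM buffer.
final class TrailingSilence {

    /**
     * Returns the duration, in milliseconds, of the trailing run of samples
     * whose absolute amplitude does not exceed the given threshold.
     */
    static long trailingSilenceMillis(byte[] pcm, int sampleRateHz, int threshold) {
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        int totalSamples = pcm.length / 2; // two bytes per 16-bit sample
        int lastLoud = -1;                 // index of the last non-silent sample
        for (int i = totalSamples - 1; i >= 0; i--) {
            if (Math.abs(buf.getShort(i * 2)) > threshold) {
                lastLoud = i;
                break;
            }
        }
        long silentSamples = totalSamples - 1L - lastLoud;
        return silentSamples * 1000L / sampleRateHz;
    }

    public static void main(String[] args) {
        // Synthetic input: 1 s of non-silent samples followed by 3 s of zeros at 16 kHz.
        int rate = 16000;
        byte[] pcm = new byte[rate * 2 * 4];
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < rate; i++) {
            buf.putShort(i * 2, (short) (i % 2 == 0 ? 8000 : -8000));
        }
        System.out.println(trailingSilenceMillis(pcm, rate, 500) + " ms of trailing silence");
    }
}
```

Running the same measurement on the PCM data of the gu-IN-NiranjanNeural and gu-IN-DhwaniNeural outputs should make the difference between the two voices directly comparable. The threshold of 500 is an arbitrary assumption and may need tuning for the actual noise floor.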

Version of the Cognitive Services Speech SDK

Java SDK 1.36.0

Platform, Operating System, and Programming Language

OS: amazonlinux:2
Programming language: Java

Azure-Samples / cognitive-services-speech-sdk, issue #2510