Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.
MIT License

Setting Up Text To Speech Service With Private Endpoint Results With 404 #28549

Closed lironres closed 2 years ago

lironres commented 2 years ago

Hi all, I am trying to deploy a Java application that uses the Speech SDK for TTS conversion. Working with a common Azure region (US West) works great, but switching to a private endpoint results in an HTTP 404 error being returned from the remote Speech Service I've created in my Azure account.

I'm setting up the SpeechConfig instance using the following, fairly straightforward method:

private static final String ENDPOINT_TEMPLATE = "wss://%s.cognitiveservices.azure.com";

private SpeechSynthesizer setupSpeechSynthesizer(SpeechServicesProperties.Engine engine, String key, String customDomainName,
                                                 SpeechSynthesisOutputFormat outputFormat) {
    log.info("Creating speech synthesizer for language {} with speaker {}...", engine.getLanguage(), engine.getVoiceName());
    SpeechConfig speechConfig = SpeechConfig.fromHost(URI.create(String.format(ENDPOINT_TEMPLATE, customDomainName)), key);
    speechConfig.setSpeechSynthesisLanguage(engine.getLanguage());
    speechConfig.setSpeechSynthesisVoiceName(engine.getVoiceName());
    speechConfig.setSpeechSynthesisOutputFormat(outputFormat);
    return new SpeechSynthesizer(speechConfig, null);
}

But the conversion fails and the following error log is printed:

[600859]: 8380ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:169 SSML sent to TTS cognitive service: שיבוטה
[600859]: 8380ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:447 CSpxUspTtsEngineAdapter::UspInitialize: this=0x0000000001491BA0
[600859]: 8380ms SPX_DBG_TRACE_VERBOSE: named_properties.h:364 ISpxNamedProperties::GetStringValue: this=0x00000000014A7C08; name='SPEECH-SubscriptionKey'; value='****'
[600859]: 8380ms SPX_DBG_TRACE_VERBOSE: named_properties.h:364 ISpxNamedProperties::GetStringValue: this=0x0000000001491C08; name='AZAC-SDK-PROGRAMMING-LANGUAGE'; value='Java'
[600859]: 8380ms SPX_DBG_TRACE_VERBOSE: resource_manager.cpp:95 Created 'CSpxUspCallbackWrapper' as '978711522'
[600859]: 8380ms SPX_DBG_TRACE_VERBOSE: named_properties.h:364 ISpxNamedProperties::GetStringValue: this=0x00000000014A7C08; name='SPEECH-Host'; value='wss://XXX.cognitiveservices.azure.com'
[600859]: 8380ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:572 CSpxUspTtsEngineAdapter::SetUspEndpoint: Using custom host: wss://XXX.cognitiveservices.azure.com
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: named_properties.h:364 ISpxNamedProperties::GetStringValue: this=0x00000000014A7C08; name='SPEECH-ProxyHostBypass'; value=''
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: resource_manager.cpp:95 Created 'CSpxUspConnection' as '792041254'
[600859]: 8381ms SPX_TRACE_INFO: usp_connection.cpp:455 Microsoft::CognitiveServices::Speech::USP::CSpxUspConnection::Connect: entering...
[600859]: 8381ms SPX_TRACE_INFO: usp_connection.cpp:472 Adding subscription key headers
[600859]: 8381ms SPX_TRACE_INFO: usp_connection.cpp:507 Set a user defined HTTP header 'User-agent':'SpeechSDK-Java/1.20.0 Windows Client 10'
[600859]: 8381ms SPX_TRACE_INFO: usp_connection.cpp:513 Set an underlying io option 'tcp_nodelay'
[600859]: 8381ms SPX_TRACE_INFO: usp_connection.cpp:522 connectionUrl=wss://XXX.cognitiveservices.azure.com/cognitiveservices/websocket/v1
[600859]: 8381ms SPX_DBG_TRACE_SCOPE_ENTER: web_socket.cpp:165 CSpxWebSocket::CSpxWebSocket
[600859]: 8381ms SPX_DBG_TRACE_SCOPE_EXIT: web_socket.cpp:165 CSpxWebSocket::CSpxWebSocket
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: resource_manager.cpp:95 Created 'CSpxWebSocket' as '482598724'
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: named_properties.h:364 ISpxPropertyBagImpl::SetStringValue: this=0x00000000014A7C08; name='SPEECH-ConnectionUrl'; value='wss://XXX.cognitiveservices.azure.com/cognitiveservices/websocket/v1'
[933744]: 8381ms SPX_TRACE_INFO: web_socket.cpp:765 CSpxWebSocket::DoWork: open transport.
[933744]: 8381ms SPX_TRACE_INFO: web_socket.cpp:508 Start to open websocket.
WebSocket: 0x1474f90, wsio handle: 0x143fbe0
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:340 speech.config {"context":{"system":{"version":"1.20.0","name":"SpeechSDK","build":"Windows-x64"},"os":{"platform":"Windows","name":"Client","version":"10"}}}
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:387 speech.config='{"context":{"system":{"version":"1.20.0","name":"SpeechSDK","build":"Windows-x64"},"os":{"platform":"Windows","name":"Client","version":"10"}}}'
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:387 synthesis.context='{"synthesis":{"audio":{"outputFormat":"raw-8khz-8bit-mono-alaw","metadataOptions":{"visemeEnabled":false,"bookmarkEnabled":false,"wordBoundaryEnabled":false,"sentenceBoundaryEnabled":false}},"language":{"autoDetection":false}}}'
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:372 ssml שיבוטה
[600859]: 8381ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:387 ssml='שיבוטה'
[933744]: 8457ms SPX_TRACE_INFO: usp_connection.cpp:756 Create requestId for messageType 0
[933744]: 8458ms SPX_DBG_TRACE_SCOPE_ENTER: web_socket.cpp:170 CSpxWebSocket::~CSpxWebSocket
[933744]: 8458ms SPX_DBG_TRACE_SCOPE_EXIT: web_socket.cpp:170 CSpxWebSocket::~CSpxWebSocket
[933744]: 8721ms SPX_TRACE_ERROR: AZ_LOG_ERROR: uws_client.c:1239 Bad status (404) received in WebSocket Upgrade response
[933744]: 8721ms SPX_TRACE_ERROR: trace_message.cpp:207 Error: File:D:\a_work\1\s\external\azure-c-shared-utility\src\uws_client.c Func:on_underlying_io_bytes_received Line:1239
[933744]: 8721ms SPX_TRACE_ERROR: web_socket.cpp:868 WS open operation failed with result=14(WS_OPEN_ERROR_BAD_RESPONSE_STATUS), code=404[0x00000194]
[933744]: 8721ms SPX_TRACE_INFO: usp_connection.cpp:902 TS:340, TransportError: connection:0x145ed30, code=7, string=WebSocket upgrade failed: Internal service error (404). Please check request details.
[933744]: 8721ms SPX_DBG_TRACE_VERBOSE: usp_tts_engine_adapter.cpp:767 Response: On Error: Code:7, Message: WebSocket upgrade failed: Internal service error (404). Please check request details..
[933744]: 8721ms SPX_DBG_TRACE_VERBOSE: create_object_helpers.h:78 SpxTerm: ptr=0x00000000014EDE48
[933744]: 8721ms SPX_DBG_TRACE_SCOPE_ENTER: usp_connection.cpp:139 Microsoft::CognitiveServices::Speech::USP::CSpxUspConnection::~CSpxUspConnection
[933744]: 8721ms SPX_DBG_TRACE_SCOPE_EXIT: usp_connection.cpp:139 Microsoft::CognitiveServices::Speech::USP::CSpxUspConnection::~CSpxUspConnection
[600859]: 8721ms SPX_DBG_TRACE_FUNCTION: synthesis_result.cpp:25 CSpxSynthesisResult::CSpxSynthesisResult
[600859]: 8721ms SPX_DBG_TRACE_VERBOSE: resource_manager.cpp:95 Created 'CSpxSynthesisResult' as '3874248'
[600859]: 8721ms SPX_DBG_TRACE_VERBOSE: named_properties.h:364 ISpxPropertyBagImpl::SetStringValue: this=0x00000000014FC010; name='CancellationDetails_ReasonDetailedText'; value='WebSocket upgrade failed: Internal service error (404). Please check request details. USP state: 2. Received audio size: 0 bytes.'
[600859]: 8721ms SPX_DBG_TRACE_VERBOSE: named_properties.h:364 ISpxPropertyBagImpl::SetStringValue: this=0x00000000014FC010; name='CancellationDetails_ReasonDetailedText'; value='WebSocket upgrade failed: Internal service error (404). Please check request details. USP state: 2. Received audio size: 0 bytes.'
[600859]: 8721ms SPX_DBG_TRACE_FUNCTION: synthesis_result.cpp:30 CSpxSynthesisResult::~CSpxSynthesisResult
[600859]: 8721ms SPX_TRACE_ERROR: usp_tts_engine_adapter.cpp:116 Synthesis cancelled without data received, retrying.

The same thing happens when I explicitly set the endpoint to wss://XXX.cognitiveservices.azure.com/tts/websocket/v1, which I took from this guide. The original US West endpoint, which I used as a reference, was wss://westus.tts.speech.microsoft.com/cognitiveservices/websocket/v1

Could you kindly advise how I should troubleshoot this? Are there any settings I may need to check in my Azure Speech Service to make this work?
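
For comparison, here is a minimal sketch that builds the two endpoint shapes side by side using only java.net.URI. The class and method names (TtsEndpoints, regional, privateLink) are hypothetical. The /tts/cognitiveservices/websocket/v1 path for the custom domain is an assumption taken from a reading of Microsoft's private endpoint documentation for the Speech service; it is not confirmed anywhere in this thread and should be verified against the current docs.

```java
import java.net.URI;

// Hypothetical helper; names are illustrative and not part of the Speech SDK.
public class TtsEndpoints {

    // Regional endpoint shape, as quoted in the post for US West.
    public static URI regional(String region) {
        return URI.create(String.format(
                "wss://%s.tts.speech.microsoft.com/cognitiveservices/websocket/v1", region));
    }

    // ASSUMPTION: with a custom domain and private endpoint, the TTS path is
    // prefixed with "/tts". Verify this against the official documentation.
    public static URI privateLink(String customDomainName) {
        return URI.create(String.format(
                "wss://%s.cognitiveservices.azure.com/tts/cognitiveservices/websocket/v1",
                customDomainName));
    }

    public static void main(String[] args) {
        System.out.println(regional("westus"));
        System.out.println(privateLink("XXX"));
        // With the Speech SDK on the classpath, a full URI like these would be
        // passed to SpeechConfig.fromEndpoint(uri, key). SpeechConfig.fromHost
        // only takes the host part, and the log above shows the SDK appending
        // "/cognitiveservices/websocket/v1" to it.
    }
}
```

The point of the sketch is that fromHost leaves the path up to the SDK, whereas a full endpoint URI makes the path explicit, so each candidate path can be tried and compared against the connectionUrl line in the log.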

Edward

Setup (please complete the following information if applicable):

OS: Docker Image, based on the official openjdk:11.0.14-slim-buster (Debian 11) deployed in Azure
IDE: IntelliJ
Library/Libraries: com.microsoft.cognitiveservices.speech:client-sdk:1.20.0

ghost commented 2 years ago

Thank you for your feedback. This has been routed to the support team for assistance.

joshfree commented 2 years ago

/cc @samvaity