[speech-to-text] websockets - Getting 500 Server Error while trying to recognize using custom model and websockets

shikida commented 7 years ago

I've just created a custom model, added a corpus, and trained it (400 words in the model). The model is available. I'm trying to recognize 2 minutes of audio from a FLAC file. The job worked with the Watson default (not custom) model, and also with the regular request (not using websockets).

Maybe the Watson custom model is slower than the default one, and the added latency is affecting the websocket timeouts?

So, this works:

        SpeechToText service = new SpeechToText();
        service.setUsernameAndPassword("...", "...");

        RecognizeOptions options = new RecognizeOptions.Builder().continuous(true).timestamps(true)
                .interimResults(true).contentType(HttpMediaType.AUDIO_FLAC).customizationId("19b28680-7452-11e7-a2b2-17481cb0c9b8")
                .build();

        SpeechResults rs = service.recognize(new File("..."), options).execute();
        System.out.println(rs);

and this does not work:

    SpeechToText service = new SpeechToText();
    service.setUsernameAndPassword("...", "...");

    InputStream audio = new FileInputStream(
            new File("..."));

    RecognizeOptions options = new RecognizeOptions.Builder().continuous(true).timestamps(true)
            .interimResults(true).contentType(HttpMediaType.AUDIO_FLAC).customizationId("19b28680-7452-11e7-a2b2-17481cb0c9b8")
            .build();

    BaseRecognizeCallback call = new BaseRecognizeCallback() {

        @Override
        public void onTranscription(SpeechResults speechResults) {
            System.out.println(speechResults);
            try {
                System.out.print(speechResults.getResults().get(0).getAlternatives().get(0).getTimestamps().get(0)
                        .getStartTime());
                System.out.print(" : ");
                System.out.println(speechResults.getResults().get(0).getAlternatives().get(0).getTranscript());
            } catch (NullPointerException npe) {
                // ignore
            }
        }

    };

    service.recognizeUsingWebSocket(audio, options, call);

    // wait 20 seconds for the asynchronous response
    Thread.sleep(20000);
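
A fixed sleep can return before the callback has reported anything, so an alternative is to block until the session ends or fails. The following is only a minimal sketch of that idea: `Listener` and `recognizeAsync` are hypothetical stand-ins written for this example, not the SDK's real callback interface or websocket call, and the assumption that a disconnect event follows an error may not match the library's behavior.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

public class LatchWait {

    /** Hypothetical stand-in for a websocket recognize callback (not the SDK interface). */
    interface Listener {
        void onTranscription(String text);
        void onError(Exception e);
        void onDisconnected();
    }

    /** Hypothetical asynchronous call: delivers its events on a separate thread. */
    static void recognizeAsync(Listener listener, boolean fail) {
        new Thread(() -> {
            if (fail) {
                listener.onError(new RuntimeException("500 Server Error"));
            } else {
                listener.onTranscription("hello world");
            }
            listener.onDisconnected();
        }).start();
    }

    /**
     * Starts one recognition and blocks until it ends or the timeout elapses.
     * Returns the reported error, a TimeoutException on timeout, or null on success.
     */
    static Exception runAndWait(boolean fail, long timeoutSeconds) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<Exception> error = new AtomicReference<>();

        recognizeAsync(new Listener() {
            @Override
            public void onTranscription(String text) {
                System.out.println(text);
            }

            @Override
            public void onError(Exception e) {
                error.set(e);       // keep the failure so the caller can inspect it
                done.countDown();   // release the waiting thread
            }

            @Override
            public void onDisconnected() {
                done.countDown();   // extra countDown calls on a released latch are harmless
            }
        }, fail);

        try {
            if (!done.await(timeoutSeconds, TimeUnit.SECONDS)) {
                return new TimeoutException("no result within " + timeoutSeconds + "s");
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            return ie;
        }
        return error.get();
    }

    public static void main(String[] args) {
        System.out.println("success run error: " + runAndWait(false, 5));
        System.out.println("failing run error: " + runAndWait(true, 5));
    }
}
```

With this shape, the waiting thread resumes as soon as the session finishes, and a server-side failure is returned to the caller instead of being lost when the sleep expires.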

DigitalZebra commented 7 years ago

Hi @shikida, the increase in latency with 400 words should be minimal, so I don't think that's the issue. I would first verify that your corpus loaded and processed successfully. Next, I would make sure the model itself is in a good state. Take a look at this API: https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/#list_model

Does the response say that the model is in an available state?

shikida commented 7 years ago

The model seems OK:

{
  "customization_id": "19b28680-7452-11e7-a2b2-17481cb0c9b8",
  "created": "2017-07-29T08:36:02.536",
  "language": "en-US",
  "owner": "410e0660-26a8-4ee0-ba09-e101df7961e0",
  "name": "sprint",
  "description": "sprint",
  "base_model_name": "en-US_BroadbandModel",
  "status": "available",
  "progress": 100
}
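
The same check can be automated before starting a recognition. The sketch below only parses a response body like the one above and tests the `status` field; the class and method names are made up for this example, fetching the JSON from the service is left out, and a real client would likely use a JSON library rather than a regular expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ModelStatusCheck {

    // Matches "status": "<value>" in the JSON describing a custom model.
    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the value of the "status" field, or null if the field is absent. */
    static String extractStatus(String json) {
        Matcher m = STATUS.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    /** True only when the reported status is exactly "available". */
    static boolean isAvailable(String json) {
        return "available".equals(extractStatus(json));
    }

    public static void main(String[] args) {
        // Trimmed copy of the response shown in this thread.
        String json = "{"
                + "\"customization_id\": \"19b28680-7452-11e7-a2b2-17481cb0c9b8\","
                + "\"base_model_name\": \"en-US_BroadbandModel\","
                + "\"status\": \"available\","
                + "\"progress\": 100"
                + "}";
        System.out.println("status = " + extractStatus(json));
        System.out.println("available = " + isAvailable(json));
    }
}
```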

Please note that without websockets the transcription using the custom model works, while with websockets it fails.

I am not sure whether this is related to this bug:

https://developer.ibm.com/answers/questions/175860/speech-to-text-status-500-error.html

What I can do on my side is implement the same websocket call in a different programming language and check whether the problem is related to the library being used.

DigitalZebra commented 7 years ago

Do other custom models on the same service work correctly? I'd try creating a new custom model, adding a word or two, training it, and then testing that one. It could also be an issue with your specific Watson service instance.

germanattanasio commented 7 years ago

@shikida there was an issue with customization that was fixed in the SNAPSHOT release. Which version are you using?

shikida commented 7 years ago

The STT service was provisioned and the model was created a couple of days ago. For the Watson SDK, I am using the latest stable release, 3.8.0. Can I get the SDK snapshot using Maven, or do I have to download it? Sounds like something worth trying.

germanattanasio commented 7 years ago

It's on Sonatype. The README describes how to configure your Maven/Gradle file to download the snapshot release.

watson-developer-cloud / java-sdk #754