watson-developer-cloud / java-sdk

Java SDK to use the IBM Watson services.
http://watson-developer-cloud.github.io/java-sdk/
Apache License 2.0

Is it possible to provide an example on how to deal with long audio files? #205

Closed shikida closed 8 years ago

shikida commented 8 years ago

I am trying to transcribe a long audio file (20 minutes, 7 MB), but the transcription ends within the first minute, whether I use HTTP or WebSockets (I am trying the examples).

TIA

Leo

ps.

    private static CountDownLatch lock = new CountDownLatch(1);

    public static void main(String[] args) throws IOException, InterruptedException {
        WatsonAPI api = new WatsonAPI();
        FileInputStream audio = new FileInputStream("/home/leoks/git/qi/1945-01-07-CBS-World-News-Today_resample16k.ogg");

        RecognizeOptions options = new RecognizeOptions();
        options.continuous(true).interimResults(true).contentType(HttpMediaType.AUDIO_OGG);

        api.stt.recognizeUsingWebSockets(audio, options, new BaseRecognizeDelegate() {
            @Override
            public void onMessage(SpeechResults speechResults) {
                System.out.println(speechResults);
                if (speechResults.isFinal()) // <<< line # 126
                    lock.countDown();
            }
        });

        lock.await(2, TimeUnit.SECONDS);
    }

getting

    java.lang.IndexOutOfBoundsException: Index: 2, Size: 1
        at java.util.ArrayList.rangeCheck(ArrayList.java:635)
        at java.util.ArrayList.get(ArrayList.java:411)
        at com.ibm.watson.developer_cloud.speech_to_text.v1.model.SpeechResults.isFinal(SpeechResults.java:76)
        at qi.watson.WatsonAPI$1.onMessage(WatsonAPI.java:126)
        at com.ibm.watson.developer_cloud.speech_to_text.v1.websocket.WebSocketSpeechToTextClient$WebSocketListener.onTextMessage(WebSocketSpeechToTextClient.java:72)
        at com.neovisionaries.ws.client.ListenerManager.callOnTextMessage(ListenerManager.java:352)
        at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:233)
        at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:211)
        at com.neovisionaries.ws.client.ReadingThread.handleTextFrame(ReadingThread.java:910)
        at com.neovisionaries.ws.client.ReadingThread.handleFrame(ReadingThread.java:693)
        at com.neovisionaries.ws.client.ReadingThread.main(ReadingThread.java:102)
        at com.neovisionaries.ws.client.ReadingThread.run(ReadingThread.java:61)

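
Separately from the exception, the snippet above waits on the latch for only 2 seconds, which is short next to a 20-minute recording. `CountDownLatch.await(timeout, unit)` returns `false` when the timeout elapses before `countDown()` is called, so the caller can tell "finished" from "gave up waiting". The following is a minimal sketch using only the JDK; `LatchTimeoutDemo` and `awaitFinal` are illustrative names and not part of the SDK, and whether a short wait actually truncates a transcription depends on SDK threading details not shown in this thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch only: LatchTimeoutDemo and awaitFinal are hypothetical names, not SDK code.
class LatchTimeoutDemo {

  // Waits for the latch and reports whether it was released before the timeout.
  static boolean awaitFinal(CountDownLatch latch, long timeout, TimeUnit unit) {
    try {
      // await returns true if the count reached zero, false if the timeout elapsed first.
      return latch.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status for callers
      return false;
    }
  }

  public static void main(String[] args) {
    CountDownLatch neverReleased = new CountDownLatch(1);
    // Nothing calls countDown(), so this gives up after 100 ms.
    System.out.println(awaitFinal(neverReleased, 100, TimeUnit.MILLISECONDS)); // prints "false"

    CountDownLatch released = new CountDownLatch(1);
    released.countDown(); // stands in for a callback signalling the final result
    System.out.println(awaitFinal(released, 100, TimeUnit.MILLISECONDS)); // prints "true"
  }
}
```

Checking the return value, and choosing a timeout in proportion to the audio length, makes a premature exit visible instead of silent.
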
daniel-bolanos commented 8 years ago

@germanattanasio, we need to fix this one. This looks like an error parsing the hypotheses; can you please take a look at it?

thank you

Dani
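
The stack trace shows `ArrayList.get` being called inside `SpeechResults.isFinal` with an index (2) larger than the list allows (size 1). The SDK source is not shown in this thread, so the following is only a sketch of a bounds-safe check, under the assumption that each message carries a short list of results with a "final" flag. `SafeFinalCheck` and its list-of-booleans model are hypothetical and do not mirror the real `SpeechResults` class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch only: SafeFinalCheck is a hypothetical stand-in, not the SDK's SpeechResults.
class SafeFinalCheck {

  // Each element models the "final" flag of one result carried by a single message.
  // Returns true only when the list is non-empty and its last entry is final,
  // so no externally supplied index is ever used to read the list.
  static boolean isFinal(List<Boolean> finalFlags) {
    if (finalFlags == null || finalFlags.isEmpty()) {
      return false;
    }
    return Boolean.TRUE.equals(finalFlags.get(finalFlags.size() - 1));
  }

  public static void main(String[] args) {
    System.out.println(isFinal(Collections.<Boolean>emptyList())); // prints "false"
    System.out.println(isFinal(Arrays.asList(false)));             // prints "false"
    System.out.println(isFinal(Arrays.asList(false, true)));       // prints "true"
  }
}
```

Deriving the position from the list's own size, rather than from a separately tracked index, is one way to avoid this class of `IndexOutOfBoundsException`.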

germanattanasio commented 8 years ago

@shikida make sure that continuous = true; otherwise the recognition will stop as soon as a half-second of silence is detected. See the documentation:

Continuous transmission

The continuous parameter accepts a Boolean value that indicates how the service is to handle silence. By default, the service stops transcription at the first pause, which is denoted by a half-second of non-speech (typically silence), or when the stream terminates. This is referred to as an end of speech (EOS) incident. By setting continuous to true, you instruct the service to transcribe the entire audio stream until the stream terminates. In this case, the results can include multiple transcript elements to indicate phrases separated by pauses. You can concatenate the transcript elements to assemble the complete transcription of the audio stream.
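
The last sentence of the documentation excerpt (concatenating the transcript elements) can be sketched as follows. This is a minimal illustration using plain strings; `TranscriptAssembler` and `join` are hypothetical names, and it assumes the caller has already collected the transcript text of each final result from the callbacks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: TranscriptAssembler is a hypothetical helper, not part of the SDK.
class TranscriptAssembler {

  // Joins the transcripts of the final results, in arrival order, into one string.
  // Null or blank entries are skipped and surrounding whitespace is trimmed.
  static String join(List<String> finalTranscripts) {
    List<String> cleaned = new ArrayList<>();
    for (String transcript : finalTranscripts) {
      if (transcript != null && !transcript.trim().isEmpty()) {
        cleaned.add(transcript.trim());
      }
    }
    return String.join(" ", cleaned);
  }

  public static void main(String[] args) {
    List<String> parts = Arrays.asList("hello world ", null, " this is a test");
    System.out.println(join(parts)); // prints "hello world this is a test"
  }
}
```

With interimResults enabled, only results marked as final would normally be collected; interim hypotheses are superseded by later messages.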

karthiktsm commented 8 years ago

After updating to 2.9.0, the same issue still occurs: an IndexOutOfBoundsException is thrown.

    java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
        at java.util.ArrayList.rangeCheck(Unknown Source)
        at java.util.ArrayList.get(Unknown Source)
        at com.ibm.watson.developer_cloud.speech_to_text.v1.model.SpeechResults.isFinal(SpeechResults.java:76)
        at com.uniphore.platform.voice.asr.phonetic.transcriber.ibm.model.RecognizeUsingWebSocketsExample$1.onMessage(RecognizeUsingWebSocketsExample.java:35)
        at com.ibm.watson.developer_cloud.speech_to_text.v1.websocket.WebSocketSpeechToTextClient$WebSocketListener.onTextMessage(WebSocketSpeechToTextClient.java:72)
        at com.neovisionaries.ws.client.ListenerManager.callOnTextMessage(ListenerManager.java:352)
        at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:233)
        at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:211)
        at com.neovisionaries.ws.client.ReadingThread.handleTextFrame(ReadingThread.java:910)
        at com.neovisionaries.ws.client.ReadingThread.handleFrame(ReadingThread.java:693)
        at com.neovisionaries.ws.client.ReadingThread.main(ReadingThread.java:102)
        at com.neovisionaries.ws.client.ReadingThread.run(ReadingThread.java:61)

    java.lang.NullPointerException
        at java.io.StringReader.<init>(Unknown Source)
        at com.google.gson.JsonParser.parse(JsonParser.java:45)
        at com.ibm.watson.developer_cloud.speech_to_text.v1.websocket.WebSocketSpeechToTextClient$WebSocketListener.onTextMessage(WebSocketSpeechToTextClient.java:66)
        at com.neovisionaries.ws.client.ListenerManager.callOnTextMessage(ListenerManager.java:352)
        at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:233)
        at com.neovisionaries.ws.client.ReadingThread.callOnTextMessage(ReadingThread.java:211)
        at com.neovisionaries.ws.client.ReadingThread.handleTextFrame(ReadingThread.java:910)
        at com.neovisionaries.ws.client.ReadingThread.handleFrame(ReadingThread.java:693)
        at com.neovisionaries.ws.client.ReadingThread.main(ReadingThread.java:102)
        at com.neovisionaries.ws.client.ReadingThread.run(ReadingThread.java:61)

karthiktsm commented 8 years ago

An error is thrown while parsing JSON over the WebSocket:

    com.google.gson.JsonSyntaxException: com.google.gson.stream.MalformedJsonException: Unterminated array at line 39 column 38 path $.results[0].alternatives[0].timestamps[6][2]
    java.lang.RuntimeException: Error parsing the incoming message: {

germanattanasio commented 8 years ago

mmmm I think I'm going to change the websocket library...

TakahikoKawasaki commented 8 years ago

The current code:

    public void onTextMessage(WebSocket websocket, String message) {
      try {
        JsonObject json = new JsonParser().parse(message).getAsJsonObject();

does not check whether message is null before passing it to the parse(String) method.

How about adding code like the following?

    if (message == null)
    {
        message = "";
    }

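
A possible alternative to substituting an empty string is to skip null or blank messages before any parsing happens. This matters because, as far as Gson's behaviour is understood here, parsing an empty string yields a JSON null element, on which `getAsJsonObject()` would throw, so the substitution may only move the failure. The sketch below is self-contained and deliberately avoids Gson: `NullSafeListener` is a hypothetical class, and adding the message to a list stands in for the real JSON handling.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: NullSafeListener is hypothetical and does not extend any WebSocket library class.
class NullSafeListener {

  private final List<String> handled = new ArrayList<>();

  // Returns true if the message was handed on for processing, false if it was skipped.
  boolean onTextMessage(String message) {
    if (message == null || message.trim().isEmpty()) {
      // Nothing to parse: return early instead of passing the value to a parser.
      return false;
    }
    handled.add(message); // stand-in for parsing the JSON and notifying the delegate
    return true;
  }

  // Read-only view of the messages that passed the guard.
  List<String> handled() {
    return Collections.unmodifiableList(handled);
  }

  public static void main(String[] args) {
    NullSafeListener listener = new NullSafeListener();
    System.out.println(listener.onTextMessage(null));               // prints "false"
    System.out.println(listener.onTextMessage("{\"results\":[]}")); // prints "true"
    System.out.println(listener.handled().size());                  // prints "1"
  }
}
```
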
TechnoKanika commented 8 years ago

Is this change working for you?