alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Can I get timings of the utterance? #373

Open UncleBansh opened 3 years ago

UncleBansh commented 3 years ago

Hi. Can I get the timecode of the utterance within the file?

Right now I get something like this:

{
  "result" : [{
      "conf" : 1.000000,
      "end" : 13.230000,
      "start" : 12.750000,
      "word" : "я"
    }, {
      "conf" : 0.990503,
      "end" : 13.380000,
      "start" : 13.260000,
      "word" : "в"
    }, {
      "conf" : 0.963313,
      "end" : 13.859370,
      "start" : 13.380000,
      "word" : "пятницу"
    }, {
      "conf" : 0.551499,
      "end" : 14.399004,
      "start" : 13.860000,
      "word" : "сбрасывать"
    }, {
      "conf" : 0.671136,
      "end" : 15.089973,
      "start" : 14.399004,
      "word" : "заявление"
    }, {
      "conf" : 0.271235,
      "end" : 15.838691,
      "start" : 15.240000,
      "word" : "майру"
    }, {
      "conf" : 0.656716,
      "end" : 16.590000,
      "start" : 16.050000,
      "word" : "продление"
    }, {
      "conf" : 1.000000,
      "end" : 17.191490,
      "start" : 16.590000,
      "word" : "контракта"
    }],
  "text" : "я в пятницу сбрасывать заявление майру продление контракта"
}

What I want to see:

{
  "result" : [{
      "conf" : 1.000000,
      "end" : 13.230000,
      "start" : 12.750000,
      "word" : "я"
    }, {
      "conf" : 0.990503,
      "end" : 13.380000,
      "start" : 13.260000,
      "word" : "в"
    }, {
      "conf" : 0.963313,
      "end" : 13.859370,
      "start" : 13.380000,
      "word" : "пятницу"
    }, {
      "conf" : 0.551499,
      "end" : 14.399004,
      "start" : 13.860000,
      "word" : "сбрасывать"
    }, {
      "conf" : 0.671136,
      "end" : 15.089973,
      "start" : 14.399004,
      "word" : "заявление"
    }, {
      "conf" : 0.271235,
      "end" : 15.838691,
      "start" : 15.240000,
      "word" : "майру"
    }, {
      "conf" : 0.656716,
      "end" : 16.590000,
      "start" : 16.050000,
      "word" : "продление"
    }, {
      "conf" : 1.000000,
      "end" : 17.191490,
      "start" : 16.590000,
      "word" : "контракта"
    }],
  "text" : "я в пятницу сбрасывать заявление майру продление контракта",
  "timecode" : "1m 25s"
}

nshmyrev commented 3 years ago

You can compute it yourself from the word timings; see here:

https://github.com/alphacep/vosk-api/blob/fe91b5a717e6ec8e04621ea7d7b988394296705e/python/example/test_srt.py#L50
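As a rough illustration of that suggestion, here is a minimal sketch that derives utterance-level timing from the per-word `start` and `end` fields of a result like the one posted above. The helper names `format_timecode` and `add_timecode` and the added `start`, `end` and `timecode` keys are hypothetical and not part of Vosk; the `"1m 25s"` style simply mirrors the format requested in the question. It assumes word timings are present in the result, which in the Python API requires calling `SetWords(True)` on the recognizer.

```python
import json


def format_timecode(seconds: float) -> str:
    """Format a time offset in seconds as 'Xm Ys' (hypothetical helper)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs}s"


def add_timecode(result_json: str) -> dict:
    """Parse one recognizer result and attach utterance-level timing.

    The 'start', 'end' and 'timecode' keys are added by this sketch;
    Vosk itself does not emit them.
    """
    res = json.loads(result_json)
    words = res.get("result", [])
    if words:  # results without recognized words carry no "result" list
        res["start"] = words[0]["start"]   # start of the first word
        res["end"] = words[-1]["end"]      # end of the last word
        res["timecode"] = format_timecode(words[0]["start"])
    return res


# Shortened sample: first and last word of the result quoted in the question.
sample = json.dumps({
    "result": [
        {"conf": 1.0, "end": 13.23, "start": 12.75, "word": "я"},
        {"conf": 1.0, "end": 17.19149, "start": 16.59, "word": "контракта"},
    ],
    "text": "я контракта",
})

print(add_timecode(sample)["timecode"])  # prints "0m 12s"
```

The offsets are measured from the beginning of the audio stream fed to the recognizer, so the first word's `start` is the utterance's position in the file as long as the same recognizer instance processes the file from its beginning.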