Decoding predictions to strings

sooftware / conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Apache License 2.0

943 stars, 174 forks

Decoding predictions to strings #25

Closed Andrew-Brown1 closed 3 years ago

Andrew-Brown1 commented 3 years ago

Hi, thanks for the great repo.

The README usage example returns its outputs as a torch tensor of ints. How would you suggest decoding these to strings (the actual speech)?

Thanks!

sooftware commented 3 years ago

There should be a vocab file that maps each integer ID to a token, like the following:

examples = {
    0: "<pad>",
    1: "<sos>",
    2: "<eos>",
    3: "a",
    4: "b",
    # ... remaining tokens of the vocabulary
}

You can train the Conformer model with this repo.
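
As a rough illustration of how such a vocab mapping could be used, here is a minimal sketch that turns rows of integer IDs into strings. The helper name `decode_ids`, the five-entry vocabulary, and the sample predictions are all made up for illustration and are not part of this repo. The sketch assumes the predictions are already one ID per output token; if the model was trained with CTC, repeated IDs and blanks would also need to be collapsed first.

```python
# Hypothetical vocabulary; a real one comes from the vocab file used in training.
id_to_token = {0: "<pad>", 1: "<sos>", 2: "<eos>", 3: "a", 4: "b"}

SKIP_TOKENS = {"<pad>", "<sos>"}  # special tokens that carry no text
EOS_TOKEN = "<eos>"               # decoding stops at the first end-of-sentence token


def decode_ids(ids, id_to_token):
    """Map a sequence of integer IDs to a string, stopping at <eos>."""
    tokens = []
    for i in ids:
        token = id_to_token[int(i)]
        if token == EOS_TOKEN:
            break
        if token in SKIP_TOKENS:
            continue
        tokens.append(token)
    return "".join(tokens)


# Made-up predictions of shape (batch, length).
# For a torch tensor, call .tolist() on it first to get nested Python lists.
predictions = [[1, 3, 4, 4, 3, 2, 0, 0], [1, 4, 3, 2, 0, 0, 0, 0]]
decoded = [decode_ids(row, id_to_token) for row in predictions]
print(decoded)  # ['abba', 'ba']
```

Joining with an empty string suits a character-level vocabulary like the one above; a word- or subword-level vocabulary would need a different joining rule.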

sooftware commented 3 years ago

Sooner or later, a library will be unveiled to train multiple models, including the conformer transducer. I intend to release it as early as this week. I'll leave a comment here after I reveal it.

sooftware commented 3 years ago