corticph / prefix-beam-search

Code for prefix beam search tutorial by @labodk
https://medium.com/corti-ai/ctc-networks-and-language-models-prefix-beam-search-explained-c11d1ee23306

What is the definition of alphabet? #2

Closed: gobigrassland closed this issue 5 years ago

gobigrassland commented 6 years ago

In the function greedy_decoder, alphabet = list(ascii_lowercase) + [' ', '>'], but in the function prefix_beam_search, alphabet = list(ascii_lowercase) + [' ', '>', '%']. I am confused about why the two definitions differ.

csukuangfj commented 5 years ago

@gobigrassland

From https://medium.com/corti-ai/ctc-networks-and-language-models-prefix-beam-search-explained-c11d1ee23306

"our alphabet contains at least the letters A-Z, a space (_), and a blank token (-), where the latter is required by CTC networks. We will also be using an end-character (>) to be predicted after the last word."

In reality, it is up to you to define your alphabet depending on the actual application.
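One possible reading of the two definitions, offered as a sketch rather than a statement about the repository's actual code: the extra '%' in prefix_beam_search may simply name the CTC blank token, while greedy_decoder leaves the blank out of its list and treats the index just past the end of the list as blank. Under that assumption both alphabets describe the same 29-column CTC output. The names greedy_decode_sketch and one_hot_frames below are hypothetical helpers written for this illustration, and the position of the blank in the last column is an assumption.

```python
from string import ascii_lowercase

import numpy as np

# Assumption: the CTC output has 29 columns and the blank sits in the last one.
GREEDY_ALPHABET = list(ascii_lowercase) + [' ', '>']       # 28 symbols, blank implicit
BEAM_ALPHABET = list(ascii_lowercase) + [' ', '>', '%']    # 29 symbols, '%' assumed blank


def greedy_decode_sketch(ctc):
    """Hypothetical greedy CTC decoder: argmax per frame, collapse repeats, drop blanks."""
    blank_index = len(GREEDY_ALPHABET)  # 28, the column that BEAM_ALPHABET labels '%'
    best = np.argmax(ctc, axis=1)
    chars = []
    previous = None
    for index in best:
        # Emit a symbol only when it differs from the previous frame and is not blank.
        if index != previous and index != blank_index:
            chars.append(GREEDY_ALPHABET[index])
        previous = index
    return ''.join(chars)


def one_hot_frames(symbols):
    """Build a toy (time x 29) matrix with all probability mass on the given symbols."""
    ctc = np.zeros((len(symbols), len(BEAM_ALPHABET)))
    for t, symbol in enumerate(symbols):
        ctc[t, BEAM_ALPHABET.index(symbol)] = 1.0
    return ctc


# Toy frames: "h h <blank> i <blank> >" should collapse to "hi>".
toy = one_hot_frames(['h', 'h', '%', 'i', '%', '>'])
decoded = greedy_decode_sketch(toy)
print(decoded)  # hi>
```

If this reading holds, the only real requirement is that the alphabet used for decoding lines up column by column with the network's output layer, which is why the choice of symbols can vary by application.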

gobigrassland commented 5 years ago

@csukuangfj Thank you very much for your reply.