k2-fsa / icefall

https://k2-fsa.github.io/icefall/
Apache License 2.0

Integrating a Phone-Based Lexicon (lang) into the Zipformer Model #1606

Open kerolos opened 1 week ago

kerolos commented 1 week ago

I'm seeking guidance on how to incorporate a phone-based lexicon (prepared in icefall/egs/librispeech/ASR/prepare.sh, Stage 6) into the latest Zipformer model, a state-of-the-art architecture for speech recognition.

I'm unsure which parameters of the Zipformer model architecture need adjustment to optimize performance for phone-level recognition, rather than the sub-word (sentence-piece) level that is typical of Byte Pair Encoding (BPE) models.

Description: I understand the benefit of open-vocabulary systems like BPE, which eliminate the need for prior knowledge of word pronunciations. However, I'm unsure how BPE handles variations in word pronunciation found in the training material, or words in the training text that have not been normalized to all lower- or upper-case characters. Additionally, during decoding there is a possibility of encountering words with multiple pronunciation variants or specialized terminology (such as legal or medical terms, or foreign words) whose pieces may not be covered well by the BPE model's token list (tokens.txt).

- How does BPE handle variations in word pronunciation during training and decoding?
- What strategies can I use to address the limitations of BPE models when encountering specialized terminology or words with multiple variants during decoding?

These might be the drawbacks of a BPE-based lexicon system.
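To make the open-vocabulary point concrete, here is a toy illustration (plain Python, not the actual BPE merge algorithm used by sentencepiece) of why a subword model never hits a hard out-of-vocabulary error: an unseen word such as a medical term is simply split into smaller known pieces, down to single characters if necessary. The piece inventory below is invented for the example.

```python
# Toy greedy longest-match subword tokenizer. This is NOT how sentencepiece
# really segments text (it uses learned merges / unigram scores), but it
# shows the open-vocabulary property: any string is covered by ever-smaller
# pieces, so "unknown" words still get tokenized.
PIECES = {"myo", "card", "itis", "in", "farc", "tion",
          "m", "y", "o", "c", "a", "r", "d", "i", "t", "s", "f", "n"}

def tokenize(word, pieces=PIECES):
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest matching piece first, then shorter ones.
        for j in range(len(word), i, -1):
            if word[i:j] in pieces:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # Character not in the vocabulary at all; a real system would
            # emit <unk> or use byte fallback here.
            tokens.append("<unk>")
            i += 1
    return tokens

print(tokenize("myocarditis"))  # -> ['myo', 'card', 'itis']
```

The flip side, as noted above, is that the resulting pieces carry no pronunciation information, so pronunciation variants of the same spelling cannot be distinguished the way a phone lexicon distinguishes them.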

I have a few questions:

1. How can I effectively use a phone-based lexicon with the Zipformer model, and which Zipformer model or recipe should be used?
2. Which parameters of the Zipformer model architecture (whose layers run at different frame rates) should be adjusted or tuned to work well at the phone level, rather than the sub-word / sentence-piece (BPE) level this model was designed for?

It would also be useful to compare the old technology, a TDNN model in original Kaldi, against the Zipformer model in Next-gen Kaldi (icefall) with a phone-based lexicon on the same dataset, and across different languages.

Any advice on these questions would be greatly appreciated. Thanks in advance.

wangtiance commented 1 week ago

You may refer to egs/librispeech/ASR/tiny_transducer_ctc for how to incorporate the phone lexicon. Basically, you use a UniqLexicon object to convert texts to phone tokens. Note that it doesn't handle multiple pronunciations.
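For orientation, the mapping that a lexicon-based setup performs can be sketched as follows. This is a simplified, self-contained stand-in, not icefall's actual API: the real UniqLexicon reads the files under a lang directory and returns k2 ragged tensors of token IDs. The tiny lexicon and token table below are invented for the example.

```python
# Simplified sketch of phone-lexicon lookup: transcript -> phone token IDs.
# The real icefall UniqLexicon keeps exactly one pronunciation per word
# (hence "Uniq"); supporting multiple pronunciations would require a
# word -> list-of-pronunciations structure plus a selection rule.
LEXICON = {                       # word -> single pronunciation
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}
TOKEN2ID = {"<blk>": 0, "HH": 1, "AH": 2, "L": 3, "OW": 4,
            "W": 5, "ER": 6, "D": 7}   # analogue of tokens.txt

def text_to_token_ids(text):
    """Map a transcript to a flat list of phone token IDs via the lexicon."""
    ids = []
    for word in text.lower().split():
        phones = LEXICON[word]  # KeyError here == OOV word: unlike BPE,
                                # a phone system needs every word listed
        ids.extend(TOKEN2ID[p] for p in phones)
    return ids

print(text_to_token_ids("hello world"))  # -> [1, 2, 3, 4, 5, 6, 3, 7]
```

This also makes the trade-off explicit: the lexicon gives you exact control over pronunciations (including adding domain terms by hand), but any word missing from it is a hard failure, whereas BPE degrades gracefully.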

Based on my experience, BPE models have better WER than phone models. I'm looking forward to your results.