Open hccho2 opened 2 years ago
#id\char 0 _ 1 2 ㄱ ... 52 ㅄ 53 <s> 54 </s>
The ids run from 0 to 54, which makes 55 entries in total, so why is vocab_size set to 54 in the yaml file?
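Thank you for your interest in my code.
Setting it to 54 looks like a mistake on my part.
Some time ago I ran an experiment that removed the space token and measured CER. I should have updated the value back then, but I did not check it carefully.
If you only train RNN-T, training with vocab_size 54 will probably not raise an error, because the sos and eos tokens never appear.
Thank you.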
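The count discussed above can be checked with a short script. This is only a minimal sketch: it assumes each label line has the form `<id> <char>` under a `#id\char` header, and the names `vocab_size_from_labels` and `out_of_range_ids` are hypothetical helpers, not part of the repository. The placeholder character `x` stands in for the real label entries.

```python
def vocab_size_from_labels(lines):
    """Return (largest id + 1) for label lines of the assumed form '<id> <char>'."""
    ids = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the '#id\char' header
        ids.append(int(line.split()[0]))  # the first field is the id
    return max(ids) + 1


def out_of_range_ids(ids, vocab_size):
    """Return the ids that a table with vocab_size entries cannot index."""
    return [i for i in ids if i >= vocab_size]


# Hypothetical label file with ids 0..54; 'x' is a placeholder character.
labels = ["#id\\char"] + [f"{i} x" for i in range(55)]

size = vocab_size_from_labels(labels)
missing = out_of_range_ids(range(size), vocab_size=54)
print(size, missing)
```

With ids 0 to 54 the helper reports 55 entries, and a vocab_size of 54 leaves only id 54 without a slot, so a mismatch would only surface if that id actually occurred in the training targets.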