Open hpjang opened 2 days ago
See #314, #717, #789, #792, ...
Transcript words that contain characters missing from the alignment model's dictionary, e.g. "2014." or "£13.60", cannot be aligned and are therefore not given a timing.
The solution is to pass --suppress_numerals or suppress_numerals=True.
As an alternative, if you want to keep the numerals rather than suppress them, and not have a numeral such as 7 converted into seven, you can run some post-processing logic that looks at the timestamps of the words before and after the numeral and uses them to fill in the numeral's missing timestamp values. This is the strategy we use and it works very well.
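For anyone who wants to try the neighbor-based fill, here is a minimal sketch rather than the exact code anyone in this thread uses. It assumes each word is a dict with a "word" key, and that aligned words also carry "start" and "end" in seconds while unaligned numerals simply lack those keys. The function name `fill_missing_timestamps` and the optional `segment_start` / `segment_end` parameters are hypothetical, added only for illustration.

```python
from typing import Any, Dict, List, Optional

Word = Dict[str, Any]


def _is_timed(word: Word) -> bool:
    """A word counts as aligned only if it has both timestamps."""
    return "start" in word and "end" in word


def fill_missing_timestamps(
    words: List[Word],
    segment_start: Optional[float] = None,  # hypothetical: known segment boundary
    segment_end: Optional[float] = None,    # hypothetical: known segment boundary
) -> List[Word]:
    """Return a copy of `words` where untimed words borrow the gap
    between the previous word's end and the next word's start."""
    filled = [dict(w) for w in words]  # shallow copies; input is not mutated
    n = len(filled)
    i = 0
    while i < n:
        if _is_timed(filled[i]):
            i += 1
            continue
        # Find the run of consecutive untimed words [i, j).
        j = i
        while j < n and not _is_timed(filled[j]):
            j += 1
        left = filled[i - 1]["end"] if i > 0 else segment_start
        right = filled[j]["start"] if j < n else segment_end
        # With only one known side, collapse the run onto that boundary.
        if left is None:
            left = right
        if right is None:
            right = left
        if left is not None:
            # Split the available gap evenly across the run.
            step = (right - left) / (j - i)
            for k in range(i, j):
                filled[k]["start"] = left + (k - i) * step
                filled[k]["end"] = left + (k - i + 1) * step
        i = j
    return filled


example = [
    {"word": "in", "start": 1.0, "end": 1.5},
    {"word": "2014."},
    {"word": "we", "start": 2.0, "end": 2.5},
]
print(fill_missing_timestamps(example)[1])
# {'word': '2014.', 'start': 1.5, 'end': 2.0}
```

Note that when a numeral sits at the very start or end and no segment boundary is passed in, this sketch gives it a zero-length span at the neighboring word's edge, so it is not a real estimate of where the numeral is spoken.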
@randyburden That was actually my naive approach as well. However, what you describe may become problematic if the numeral is at the beginning or end of a sentence and, say, you want to partition the audio there. Then you enter magic-number territory, having to determine offsets and so on. I guess it ultimately depends on your use case.
You can see that 1462 doesn't have start and end timestamps.