shashikg / WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engine
MIT License
318 stars 32 forks

Non-Latin transcripts cannot be written to files #52

Closed EricBizet closed 4 months ago

EricBizet commented 8 months ago

Proposing to add an explicit utf-8 encoding to the file writes used when exporting transcript results. Without it, the writes fall back to the platform's default encoding (ASCII or another non-UTF-8 codec on some systems), so Japanese characters, for instance, were not written correctly. Fixes https://github.com/shashikg/WhisperS2T/issues/53
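
For illustration, here is a minimal sketch of the kind of change being proposed. It is not the actual WhisperS2T code: the helper name `write_transcript` and the `segments` structure (a list of dicts with a `text` key) are assumptions made for this example. The only point is that passing `encoding="utf-8"` to Python's built-in `open()` makes the write independent of the platform default.

```python
import os
import tempfile


def write_transcript(path, segments):
    """Write one transcript segment per line (hypothetical helper, not the library's API)."""
    # An explicit encoding avoids depending on the platform's default codec.
    with open(path, "w", encoding="utf-8") as f:
        for seg in segments:
            f.write(seg["text"].strip() + "\n")


# Assumed segment structure, for illustration only.
segments = [{"text": "こんにちは、世界"}, {"text": "Hello, world"}]

out_path = os.path.join(tempfile.mkdtemp(), "transcript.txt")
write_transcript(out_path, segments)

# Read the file back with the same encoding to confirm the round trip.
with open(out_path, encoding="utf-8") as f:
    lines = f.read().splitlines()

# An ASCII codec cannot represent these characters, which is the failure
# a non-UTF-8 default encoding can trigger on write.
try:
    "こんにちは".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

Any reader of the exported files should open them with `encoding="utf-8"` as well, since the default on the reading side can differ in the same way.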

BBC-Esq commented 6 months ago

I like it!

shashikg commented 4 months ago

Thanks for adding the fix @EricBizet !