Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Can't initiate captioning in other languages using the python captioning quickstart #1714

Closed k10876 closed 2 years ago

k10876 commented 2 years ago

Hi,

I'm using the captioning scenario quickstart at cognitive-services-speech-sdk/scenarios/python/console/captioning/captioning.py, and I cannot change the transcription language using the --language option. Could the language selection feature be made available soon?

Thanks, k10876

chschrae commented 2 years ago

Hi there,

Thanks for reaching out to us. The language feature should be supported today. Can you specify the command you are using that is not working?

I noticed that in the sample the recognition language is not set by that command line option. Documentation for doing that is here: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-recognize-speech?pivots=programming-language-python#change-the-source-language

So if that is what you are referring to, I totally see what is wrong. You should be able to fix it by adding this line at line 252 of your copy of captioning.py: speech_config.speech_recognition_language = self._user_config["language"]

I will look into updating the repo with that change as well unless I discover some reason it shouldn't be that way.
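As a rough illustration of that one-line change, here is a minimal sketch of how a --language value might flow from the command line into the recognition language. The helper names parse_user_config and apply_language are hypothetical, and a SimpleNamespace stands in for the SDK's SpeechConfig so the snippet runs without credentials. The only name taken from the SDK is the speech_recognition_language property; the "en-US" default is an assumption.

```python
import argparse
from types import SimpleNamespace


def parse_user_config(argv):
    """Hypothetical helper: read --language into a dict like the sample's user config."""
    parser = argparse.ArgumentParser()
    # Assumed default; the real sample may use a different one.
    parser.add_argument("--language", default="en-US")
    args = parser.parse_args(argv)
    return {"language": args.language}


def apply_language(speech_config, user_config):
    """Hypothetical helper: copy the requested locale onto the speech config."""
    # speech_recognition_language is the property the comment above sets.
    speech_config.speech_recognition_language = user_config["language"]
    return speech_config


# Stand-in for a real SpeechConfig object (assumption, so this runs offline).
speech_config = SimpleNamespace(speech_recognition_language="en-US")
user_config = parse_user_config(["--language", "zh-CN"])
apply_language(speech_config, user_config)
print(speech_config.speech_recognition_language)
```

With the real SDK, the same assignment would be made on the SpeechConfig instance before the recognizer is created.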

k10876 commented 2 years ago

Hi,

Thanks for the follow-up. This is exactly what I'm looking for. I was adding the --language flag (the command I was running was something like python captioning.py --input source.wav --format any --output caption.output.txt --language zh-cn), but the transcription was still in English.

However, I noticed that the description of the --language flag in the USAGE text says that this flag is only used to break captions into lines. I don't know what this actually means (breaking an English caption using the configuration for another language seems odd), but I think the description might need to be changed, as it doesn't match this feature. If this concern is unfounded, I apologize; please ignore it.

Anyway, thanks again for your comment, and I'm looking forward to a resolution to my issue.

k10876 commented 2 years ago

Hi,

I found an error when running the modified code.

The error raised is: Exception in recognized_handler: 'charmap' codec can't encode characters in position 34-43: character maps to <undefined>

The command I'm using is: python captioning.py --input source.wav --language zh-cn --format any --output caption.output.txt --srt --realTime --threshold 5 --delay 0 --profanity raw

source.wav is a WAV file containing Chinese speech.

I have done some research and I suspect that UTF-8 (or another encoding that supports Chinese characters) must be used when writing the subtitles to the output file, but I cannot find the right place to add this option.

Are there any suggestions for this? I would be very grateful if anyone could give me some instructions.

chschrae commented 2 years ago

Yup, I am working on updating the samples to honor the recognition language.

For the new error, I found this Stack Overflow question: https://stackoverflow.com/questions/44391671/python3-unicodeencodeerror-charmap-codec-cant-encode-characters-in-position

Can you try adding the parameter encoding='utf-8' to the open() call on line 81 of helper.py and see if that solves your issue?
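To show why that parameter matters, here is a small self-contained sketch. It assumes the failing machine defaults to a legacy code page such as cp1252, which is a guess based on the 'charmap' wording in the error. The caption text, the write_caption helper, and the temporary file path are all made up for illustration and are not taken from helper.py.

```python
import os
import tempfile

# Made-up SRT-style caption containing Chinese text.
caption = "1\n00:00:00,000 --> 00:00:02,000\n你好，世界\n"

# A legacy single-byte code page (assumed cp1252 here) cannot represent
# these characters, which is what produces the 'charmap' codec error.
try:
    caption.encode("cp1252")
    legacy_ok = True
except UnicodeEncodeError:
    legacy_ok = False


def write_caption(path, text):
    """Hypothetical helper: write caption text with an explicit UTF-8 encoding."""
    # Without encoding=..., open() falls back to the platform default encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


out_path = os.path.join(tempfile.mkdtemp(), "caption.output.txt")
write_caption(out_path, caption)

# Read the file back with the same encoding to confirm a clean round trip.
with open(out_path, encoding="utf-8") as f:
    round_trip = f.read()
```

Passing the encoding explicitly makes the output independent of the operating system's default, so the same command should behave the same on Windows, macOS, and Linux.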

k10876 commented 2 years ago

Thanks for the instructions. I can confirm that this is now working on my side.

I'll leave this ticket open in case anything else needs to be addressed, but feel free to close it whenever you like.

chschrae commented 2 years ago

Thanks again for finding this! I'll close this ticket when I merge the changes to the repo.