coqui-ai / STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
https://coqui.ai
Mozilla Public License 2.0
2.23k stars, 270 forks

Bug: Scorer.fill_dictionary() Python function throws SWIG exception #2305

Open comodoro opened 1 year ago

comodoro commented 1 year ago

The Python SWIG binding for the function Scorer::fill_dictionary does not work (it did in the old Mozilla code).

I am trying to make a fork of Mozilla DSAlign work with custom language models, and I got stuck on this line: https://github.com/comodoro/STT-align/blob/b5ef6f81fdb4b608a6cf2ba09f7d35a10f39ce25/align/generate_package.py#L74.

To Reproduce: import Scorer from coqui_stt_ctcdecoder and call scorer.fill_dictionary(). In my case the error was:

Traceback (most recent call last):
  File "/mnt/d/shared/speech/dsalign/STT-align/align/align.py", line 693, in <module>
    main()
  File "/mnt/d/shared/speech/dsalign/STT-align/align/align.py", line 451, in main
    create_bundle(alphabet_path, scorer_path + '.' + 'lm.binary', scorer_path + '.' + 'vocab-500000.txt', scorer_path, False, 0.931289039105002, 1.1834137581510284)
  File "/mnt/d/shared/speech/dsalign/STT-align/align/generate_package.py", line 75, in create_bundle
    scorer.fill_dictionary(words)
  File "/mnt/d/shared/speech/dsalign/STT-align/venv/lib/python3.10/site-packages/coqui_stt_ctcdecoder/swigwrapper.py", line 1269, in fill_dictionary
    return _swigwrapper.Scorer_fill_dictionary(self, vocabulary)
TypeError: in method 'Scorer_fill_dictionary', argument 2 of type 'std::unordered_set< std::string > const &'
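The reproduction steps above can be sketched roughly as follows. This is a minimal sketch, not the code from the linked repository: the helper names read_vocabulary and try_fill_dictionary and the sample text are hypothetical, and constructing Scorer() with no arguments is an assumption based on how the old Mozilla packaging script is understood to have used it. Only the import and the fill_dictionary call come from the report itself.

```python
import io


def read_vocabulary(lines):
    """Collect the unique whitespace-separated words from an iterable of text lines."""
    words = set()
    for line in lines:
        words.update(line.split())
    return words


def try_fill_dictionary(words):
    """Attempt the failing call; return the exception raised, or None on success.

    Hypothetical helper. An ImportError is returned if the package is not installed.
    """
    try:
        from coqui_stt_ctcdecoder import Scorer
    except ImportError as exc:
        return exc
    scorer = Scorer()  # assumption: a no-argument constructor is available
    try:
        # The report expects a list (or a set) of words to be accepted here.
        scorer.fill_dictionary(list(words))
    except TypeError as exc:
        return exc
    return None


# Made-up sample standing in for a vocabulary file.
sample = io.StringIO("hello world\nhello coqui\n")
vocabulary = read_vocabulary(sample)

if __name__ == "__main__":
    print(sorted(vocabulary))
    print(repr(try_fill_dictionary(vocabulary)))
```

With the affected binding installed, the second print would be expected to show the TypeError from the traceback above. Without the package, it shows an ImportError instead.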

Expected behavior: the function works the same way as in the C++ code when given a list, or possibly a set.

Environment (please complete the following information):

Additional context: I also posted this as a Stack Overflow question: https://stackoverflow.com/questions/73900661/using-swig-python-wrapper-argument-2-of-type-stdunordered-set-stdstring. As I understand it, a template needs to be added to swigwrapper.i.