espnet / espnet

End-to-End Speech Processing Toolkit
https://espnet.github.io/espnet/
Apache License 2.0

"cannot pickle 'SwigPyObject' object" error when using multhi-thread #3414

Closed shinotk15 closed 3 years ago

shinotk15 commented 3 years ago

I encounter "TypeError: cannot pickle 'SwigPyObject' object" when using ESPnet from my program. Is it possible to avoid the problem?

What I did in my code was to insert the following single line:

nbests = speech2text(utterance)

If I comment out this line, or if I do not enable multi-threading, the error does not occur.

As the preparation to use speech2text, the following lines are in my code.

from espnet2.bin.asr_inference import Speech2Text
from espnet_model_zoo.downloader import ModelDownloader

tag = 'Shinji Watanabe/spgispeech_asr_train_asr_conformer6_n_fft512_hop_length256_raw_en_unnorm_bpe5000_valid.acc.ave'
d = ModelDownloader()
speech2text = Speech2Text(
    **d.download_and_unpack(tag),
    device="cuda",
    minlenratio=0.0,
    maxlenratio=0.0,
    ctc_weight=0.3,
    beam_size=10,
    batch_size=0,
    nbest=1
)
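
For reference, I then call it roughly as in the ESPnet demo notebook (the wav path below is just a placeholder):

import soundfile

speech, rate = soundfile.read("utterance.wav")  # placeholder path; 16 kHz mono audio
nbests = speech2text(speech)
text, *_ = nbests[0]
print(text)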

In espnet2/text/sentencepiece_tokenizer.py, I found the following note, but I'm not sure what I can do about it. Where do you use swig in ESPnet?

    # Don't build SentencePieceProcessor in __init__()
    # because it's not picklable and it may cause following error,
    # "TypeError: can't pickle SwigPyObject objects",

$ conda env export | egrep 'pytorch|espnet'

sw005320 commented 3 years ago

Thanks for your report.

https://colab.research.google.com/github/espnet/notebook/blob/master/espnet2_asr_realtime_demo.ipynb works well, so I suspect the problem is specific to your environment.

Could you share your Linux environment? We mainly test our tools with https://github.com/espnet/espnet/tree/master/.github/workflows, where you can also find the libraries that are required.

shinotk15 commented 3 years ago

Thank you for the comment. We are running our code on CentOS 8 and Google Colab. The problem occurs when we use ESPnet together with the RL library stable-baselines. This Colab notebook reproduces the error: https://github.com/tttslab/languageacquisition-examples

We use speech2text() in the DialogWorld class. The error does not occur if we replace it with a dummy function, dummy_speech2text(), that does not use ESPnet. The parallel processing is invoked by SubprocVecEnv, which is used at the bottom of the notebook. In espnet/tools/Makefile, there are rm commands that remove some swig-related files. Where do you use swig in ESPnet?

sw005320 commented 3 years ago

Thanks for your detailed explanation.

In espnet/tools/Makefile, there are rm commands to remove some swig-related files. Where do you use swig in ESPnet?

At least we don't use swig explicitly, but it may be used by other tools.

shinotk15 commented 3 years ago

Ok. Thank you for the quick response and the clarification. Will try to find the dependency under espnet/tools.

kamo-naoyuki commented 3 years ago

Sentencepiece depends on swig for its C++ binding. (Ideally, we should implement a pure Python sentencepiece.)

When a swig object is used with multiprocessing, the error happens because swig objects are not picklable.

# e.g. (f and swig_object are placeholders)
p = multiprocessing.Process(target=f, args=[swig_object])
p.start()  # pickling swig_object for the child process fails here (e.g. with the spawn start method)
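
Concretely, the failure is just the pickling of the processor object; a rough illustration (the model path is a placeholder, and the exact exception text depends on the sentencepiece/swig build):

import pickle
import sentencepiece as spm

sp = spm.SentencePieceProcessor()  # wraps a swig-generated C++ object
sp.Load("bpe.model")               # placeholder model path
pickle.dumps(sp)                   # raises TypeError: cannot pickle 'SwigPyObject' object with a swig-based build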

Do you use multithreading? I'm not sure how you used it. Could you provide your code?

However, our decoding process is not thread-safe, so you can't use it with multithreading anyway. #3205

shinotk15 commented 3 years ago

Thank you for the information.

Could you provide your code?

The Colab notebook languageacquisition_example.ipynb in this GitHub repository is sample code that reproduces the error: https://github.com/tttslab/languageacquisition-examples

We are building a spoken dialogue simulator class (DialogWorld) and want to run multiple instances of it in parallel to perform parallelized reinforcement learning. We want to use speech2text from ESPnet in the class to implement speech recognition. The parallelization is done by SubprocVecEnv, which is provided by the stable_baselines3 RL library. SubprocVecEnv internally uses multi-threading, and the swig error occurs when we use speech2text.
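
Schematically, the structure is like this (a simplified sketch with illustrative names, not the exact notebook code):

from stable_baselines3.common.vec_env import SubprocVecEnv

# speech2text is built once in the main process and DialogWorld calls it
# inside step(); SubprocVecEnv then runs one worker process per environment.
def make_env():
    return DialogWorld(speech2text)

vec_env = SubprocVecEnv([make_env for _ in range(4)])
obs = vec_env.reset()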

Sentencepiece depends on swig

In asr_inference, it seems Sentencepiece is only used through build_tokenizer. Is it (theoretically) possible to remove the swig dependency by disabling the BPE post-processing that forms the word units? All we need is to run speech recognition with a pre-trained model; any model is OK as long as we obtain a reasonable recognition rate.

However, our decoding process is not thread safe

In the mentioned issue #3205, why was CTCPrefixScoreTH not thread-safe? Is it fixed now?

We need to run speech recognition at every dialogue step since our language learning agent dynamically adjusts its pronunciation at the waveform level, and we want to speed up the process.

kamo-naoyuki commented 3 years ago

The parallelization is by SubprocVecEnv, which is provided by stable_baselines3 RL library. SubprocVecEnv internally uses multi-thread, and causes the swig error if we use speech2text.

SubprocVecEnv uses multiprocessing, not threading. See https://github.com/openai/baselines/blob/ea25b9e8b234e6ee1bca43083f8f3cf974143998/baselines/common/vec_env/subproc_vec_env.py#L1

I understand your issue, but basically the "TypeError: cannot pickle 'SwigPyObject' object" error shouldn't happen, because I avoid it with lazy initialization. That is, the following case is okay:

p = multiprocessing.Process(target=f, args=[speech2text])
p.start()  # okay: the swig object has not been built yet, so speech2text can still be pickled

However, if you have already initialized the SwigObject in the main process, the error can happen:

speech2text(speech)  # this first call builds the swig-backed tokenizer in the main process
p = multiprocessing.Process(target=f, args=[speech2text])
p.start()  # the error happens here

I guess your code falls into the second case.
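
For reference, the lazy-initialization pattern looks roughly like this (a simplified sketch, not the exact code in espnet2/text/sentencepiece_tokenizer.py):

import sentencepiece as spm

class LazyTokenizer:
    def __init__(self, model: str):
        self.model = model
        self.sp = None  # intentionally not built here, so the object stays picklable

    def _build(self):
        # The swig-backed processor is created on first use, ideally inside the worker process.
        if self.sp is None:
            self.sp = spm.SentencePieceProcessor()
            self.sp.Load(self.model)

    def text2tokens(self, line: str):
        self._build()
        return self.sp.EncodeAsPieces(line)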

This is all the help I can give. Please check your code yourself.

shinotk15 commented 3 years ago

After I removed all the calls to speech2text(speech) made before the multi-processing starts, it worked! An alternative solution is to make speech2text an attribute of the DialogWorld class so that each instance of the class has its own dedicated speech2text (sketched below).
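
A minimal sketch of that alternative, with illustrative names and spaces rather than the actual notebook code: each DialogWorld builds its own Speech2Text inside the worker process where SubprocVecEnv constructs the env, so no swig object has to cross a pickle boundary.

import gym
import numpy as np
from espnet2.bin.asr_inference import Speech2Text
from espnet_model_zoo.downloader import ModelDownloader
from stable_baselines3.common.vec_env import SubprocVecEnv

class DialogWorld(gym.Env):
    def __init__(self, tag):
        super().__init__()
        # Built once per instance, i.e. once per worker process.
        d = ModelDownloader()
        self.speech2text = Speech2Text(**d.download_and_unpack(tag), device="cpu")
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self):
        return np.zeros(1, dtype=np.float32)

    def step(self, action):
        # ... synthesize a waveform from the action, then recognize it, e.g.:
        # nbests = self.speech2text(waveform)
        return np.zeros(1, dtype=np.float32), 0.0, True, {}

if __name__ == "__main__":
    tag = "Shinji Watanabe/spgispeech_asr_train_asr_conformer6_n_fft512_hop_length256_raw_en_unnorm_bpe5000_valid.acc.ave"
    # Only the factory callables are pickled; the envs (and their Speech2Text)
    # are created in the worker processes.
    vec_env = SubprocVecEnv([lambda: DialogWorld(tag) for _ in range(2)])
    vec_env.reset()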

Thanks a lot for your kind help.