Closed sorenjmadsen closed 2 years ago
NeMo models and datasets are not picklable, so we recommend using DDP in all cases when training.
However, I don't know why this is occurring during inference. Are you trying to launch that code on multiple threads? That will not work. You need separate processes that are independent of each other.
The above snippet is all I am trying to run. The pickle error is thrown when I call transcribe().
Stack trace:
---------------------------------------------------------------------------
PicklingError Traceback (most recent call last)
Input In [4], in <module>
1 asr_model = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-En", strict=False)
3 files = ['./soundsample.wav']
----> 4 for fname, transcription in zip(files, asr_model.transcribe(paths2audio_files=files)):
5 print(f"Audio in {fname} was recognized as: {transcription}")
File ~/opt/anaconda3/envs/nemo/lib/python3.9/site-packages/torch/autograd/grad_mode.py:28, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
25 @functools.wraps(func)
26 def decorate_context(*args, **kwargs):
27 with self.__class__():
---> 28 return func(*args, **kwargs)
File ~/Documents/MetaSense.ai/NeMo/nemo/collections/asr/models/ctc_models.py:268, in EncDecCTCModel.transcribe(self, paths2audio_files, batch_size, logprobs, return_hypotheses, num_workers)
260 config = {
261 'paths2audio_files': paths2audio_files,
262 'batch_size': batch_size,
263 'temp_dir': tmpdir,
264 'num_workers': num_workers,
265 }
267 temporary_datalayer = self._setup_transcribe_dataloader(config)
--> 268 for test_batch in tqdm(temporary_datalayer, desc="Transcribing"):
269 logits, logits_len, greedy_predictions = self.forward(
270 input_signal=test_batch[0].to(device), input_signal_length=test_batch[1].to(device)
271 )
272 if logprobs:
273 # dump log probs per file
File ~/opt/anaconda3/envs/nemo/lib/python3.9/site-packages/tqdm/notebook.py:231, in tqdm_notebook.__init__(self, *args, **kwargs)
229 colour = kwargs.pop('colour', None)
230 display_here = kwargs.pop('display', True)
--> 231 super(tqdm_notebook, self).__init__(*args, **kwargs)
232 if self.disable or not kwargs['gui']:
233 self.disp = lambda *_, **__: None
File ~/opt/anaconda3/envs/nemo/lib/python3.9/site-packages/tqdm/asyncio.py:33, in tqdm_asyncio.__init__(self, iterable, *args, **kwargs)
31 self.iterable_next = iterable.__next__
32 else:
---> 33 self.iterable_iterator = iter(iterable)
34 self.iterable_next = self.iterable_iterator.__next__
File ~/opt/anaconda3/envs/nemo/lib/python3.9/site-packages/torch/utils/data/dataloader.py:359, in DataLoader.__iter__(self)
357 return self._iterator
358 else:
--> 359 return self._get_iterator()
File ~/opt/anaconda3/envs/nemo/lib/python3.9/site-packages/torch/utils/data/dataloader.py:305, in DataLoader._get_iterator(self)
303 else:
304 self.check_worker_number_rationality()
--> 305 return _MultiProcessingDataLoaderIter(self)
File ~/opt/anaconda3/envs/nemo/lib/python3.9/site-packages/torch/utils/data/dataloader.py:918, in _MultiProcessingDataLoaderIter.__init__(self, loader)
911 w.daemon = True
912 # NB: Process.start() actually take some time as it needs to
913 # start a process and pass the arguments over via a pipe.
914 # Therefore, we only add a worker to self._workers list after
915 # it started, so that we do not call .join() if program dies
916 # before it starts, and __del__ tries to join but will get:
917 # AssertionError: can only join a started process.
--> 918 w.start()
919 self._index_queues.append(index_queue)
920 self._workers.append(w)
File ~/opt/anaconda3/envs/nemo/lib/python3.9/multiprocessing/process.py:121, in BaseProcess.start(self)
118 assert not _current_process._config.get('daemon'), \
119 'daemonic processes are not allowed to have children'
120 _cleanup()
--> 121 self._popen = self._Popen(self)
122 self._sentinel = self._popen.sentinel
123 # Avoid a refcycle if the target function holds an indirect
124 # reference to the process object (see bpo-30775)
File ~/opt/anaconda3/envs/nemo/lib/python3.9/multiprocessing/context.py:224, in Process._Popen(process_obj)
222 @staticmethod
223 def _Popen(process_obj):
--> 224 return _default_context.get_context().Process._Popen(process_obj)
File ~/opt/anaconda3/envs/nemo/lib/python3.9/multiprocessing/context.py:284, in SpawnProcess._Popen(process_obj)
281 @staticmethod
282 def _Popen(process_obj):
283 from .popen_spawn_posix import Popen
--> 284 return Popen(process_obj)
File ~/opt/anaconda3/envs/nemo/lib/python3.9/multiprocessing/popen_spawn_posix.py:32, in Popen.__init__(self, process_obj)
30 def __init__(self, process_obj):
31 self._fds = []
---> 32 super().__init__(process_obj)
File ~/opt/anaconda3/envs/nemo/lib/python3.9/multiprocessing/popen_fork.py:19, in Popen.__init__(self, process_obj)
17 self.returncode = None
18 self.finalizer = None
---> 19 self._launch(process_obj)
File ~/opt/anaconda3/envs/nemo/lib/python3.9/multiprocessing/popen_spawn_posix.py:47, in Popen._launch(self, process_obj)
45 try:
46 reduction.dump(prep_data, fp)
---> 47 reduction.dump(process_obj, fp)
48 finally:
49 set_spawning_popen(None)
File ~/opt/anaconda3/envs/nemo/lib/python3.9/multiprocessing/reduction.py:60, in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
Huh. I think it was a mistake to make the data loader use more than one worker by default. Can you put your code under an if __name__ == "__main__": block and try?
Otherwise, another option is to pass num_workers=0 to transcribe().
When I put it under the conditional, it didn't run at all. However, setting the num_workers parameter worked! Thank you!
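For readers landing here, the workaround that resolved this can be sketched as below. This is a minimal illustration, not NeMo code: the helper name transcribe_in_process is hypothetical, and it only assumes that the model's transcribe() accepts the paths2audio_files and num_workers arguments visible in the traceback above.

```python
def transcribe_in_process(asr_model, files):
    """Transcribe `files` while keeping data loading in the calling process.

    `asr_model` is assumed to expose transcribe(paths2audio_files=..., num_workers=...),
    as shown in the traceback; this helper itself is hypothetical.
    """
    # With num_workers=0 the DataLoader iterates in this process, so no
    # worker process is started and nothing has to be pickled.
    transcriptions = asr_model.transcribe(paths2audio_files=files, num_workers=0)
    # Pair each input path with its transcription, as in the original snippet.
    return list(zip(files, transcriptions))
```

With the model from the original snippet, this would be used as: for fname, transcription in transcribe_in_process(asr_model, ['./soundsample.wav']): print(fname, transcription).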
I had the same issue and num_workers=0 worked for me, but when I switched to a NeMo VAD model instead of ASR, I got the following traceback:
Traceback (most recent call last):
File "D:\Projects\Python\test\test\nemo_diarization.py", line 115, in <module>
diarization("audio/1.wav")
File "D:\Projects\Python\test\test\nemo_diarization.py", line 100, in diarization
sd_model.diarize()
File "C:\Program Files\Python310\lib\site-packages\nemo\collections\asr\models\clustering_diarizer.py", line 408, in diarize
self._perform_speech_activity_detection()
File "C:\Program Files\Python310\lib\site-packages\nemo\collections\asr\models\clustering_diarizer.py", line 301, in _perform_speech_activity_detection
manifest_vad_input = prepare_manifest(config)
File "C:\Program Files\Python310\lib\site-packages\nemo\collections\asr\parts\utils\vad_utils.py", line 74, in prepare_manifest
p = multiprocessing.Pool(processes=config['num_workers'])
File "C:\Program Files\Python310\lib\multiprocessing\context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "C:\Program Files\Python310\lib\multiprocessing\pool.py", line 205, in __init__
raise ValueError("Number of processes must be at least 1")
ValueError: Number of processes must be at least 1
The error is raised from these lines:
sd_model = ClusteringDiarizer(cfg=config)
sd_model.diarize()
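The ValueError comes from the standard library rather than from NeMo: multiprocessing.Pool rejects a process count below 1, so a num_workers value of 0 cannot be passed to it directly. A minimal sketch of how calling code could guard against that is shown below. The helper map_with_workers is hypothetical and is not claimed to be how NeMo handles it.

```python
import multiprocessing


def map_with_workers(func, items, num_workers):
    """Apply func to each item, using a process pool only when num_workers >= 1."""
    if num_workers < 1:
        # multiprocessing.Pool(processes=0) raises
        # "ValueError: Number of processes must be at least 1",
        # so fall back to a plain loop in the current process.
        return [func(item) for item in items]
    # With workers, func and items must be picklable, and on spawn-based
    # platforms the caller should sit under `if __name__ == "__main__":`.
    with multiprocessing.Pool(processes=num_workers) as pool:
        return pool.map(func, items)
```

Calling map_with_workers(len, ["a", "bb"], 0) runs entirely in the current process and never constructs a Pool.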
When I switch to num_workers > 0 I receive the following error:
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "D:\Projects\Python\test\test\nemo_diarization.py", line 115, in <module>
diarization("audio/1.wav")
File "D:\Projects\Python\test\test\nemo_diarization.py", line 100, in diarization
sd_model.diarize()
File "C:\Program Files\Python310\lib\site-packages\nemo\collections\asr\models\clustering_diarizer.py", line 408, in diarize
self._perform_speech_activity_detection()
File "C:\Program Files\Python310\lib\site-packages\nemo\collections\asr\models\clustering_diarizer.py", line 308, in _perform_speech_activity_detection
self._run_vad(manifest_vad_input)
File "C:\Program Files\Python310\lib\site-packages\nemo\collections\asr\models\clustering_diarizer.py", line 213, in _run_vad
for i, test_batch in enumerate(tqdm(self._vad_model.test_dataloader())):
File "C:\Program Files\Python310\lib\site-packages\tqdm\std.py", line 1195, in __iter__
for obj in iterable:
File "C:\Program Files\Python310\lib\site-packages\torch\utils\data\dataloader.py", line 438, in __iter__
return self._get_iterator()
File "C:\Program Files\Python310\lib\site-packages\torch\utils\data\dataloader.py", line 384, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "C:\Program Files\Python310\lib\site-packages\torch\utils\data\dataloader.py", line 1048, in __init__
w.start()
File "C:\Program Files\Python310\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Program Files\Python310\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Python310\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "C:\Program Files\Python310\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
reduction.dump(process_obj, to_child)
File "C:\Program Files\Python310\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'nemo.collections.common.parts.preprocessing.collections.SpeechLabelEntity'>: attribute lookup SpeechLabelEntity on nemo.collections.common.parts.preprocessing.collections failed
Process finished with exit code 1
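Some background on this message: with the spawn start method (visible in both tracebacks via SpawnProcess and popen_spawn_win32), the DataLoader has to pickle the dataset to send it to each worker, and pickle stores classes by name, looking them up at the top level of their module. The sketch below reproduces the same kind of failure using only the standard library. It assumes, without confirmation from this thread, that the entity class is a namedtuple kept as a class attribute instead of a module-level name; Collection and Entity are hypothetical names.

```python
import collections
import pickle


class Collection:
    # The namedtuple class is reachable only as Collection.OUTPUT_TYPE;
    # there is no module-level name "Entity" for pickle to look up.
    OUTPUT_TYPE = collections.namedtuple("Entity", "audio_file label")


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


item = Collection.OUTPUT_TYPE(audio_file="a.wav", label="speech")
print(can_pickle(item))         # False: the class cannot be found by name
print(can_pickle(tuple(item)))  # True: a plain tuple of strings pickles fine
```

If that assumption holds, it would explain why the error disappears when no worker processes are started: nothing needs to be pickled in that case.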
The following is the diarizer configuration (YAML) file:
name: &name "ClusterDiarizer"
num_workers: 0
sample_rate: 16000
batch_size: 64
diarizer:
  manifest_filepath: ???
  out_dir: ???
  oracle_vad: False # If True, uses RTTM files provided in manifest file to get speech activity (VAD) timestamps
  collar: 0.25 # Collar value for scoring
  ignore_overlap: True # Consider or ignore overlap segments while scoring
  vad:
    model_path: null # .nemo local model path or pretrained model name or none
    external_vad_manifest: null # This option is provided to use external vad and provide its speech activity labels for speaker embeddings extraction. Only one of model_path or external_vad_manifest should be set
    parameters: # Tuned parameters for CH109 (using the 11 multi-speaker sessions as dev set)
      window_length_in_sec: 0.15 # Window length in sec for VAD context input
      shift_length_in_sec: 0.01 # Shift length in sec for generate frame level VAD prediction
      smoothing: "median" # False or type of smoothing method (eg: median)
      overlap: 0.875 # Overlap ratio for overlapped mean/median smoothing filter
      onset: 0.4 # Onset threshold for detecting the beginning and end of a speech
      offset: 0.7 # Offset threshold for detecting the end of a speech
      pad_onset: 0.05 # Adding durations before each speech segment
      pad_offset: -0.1 # Adding durations after each speech segment
      min_duration_on: 0.2 # Threshold for small non_speech deletion
      min_duration_off: 0.2 # Threshold for short speech segment deletion
      filter_speech_first: True
  speaker_embeddings:
    model_path: ??? # .nemo local model path or pretrained model name (titanet_large, ecapa_tdnn or speakerverification_speakernet)
    parameters:
      window_length_in_sec: 1.5 # Window length(s) in sec (floating-point number). Either a number or a list. Ex) 1.5 or [1.5,1.0,0.5]
      shift_length_in_sec: 0.75 # Shift length(s) in sec (floating-point number). Either a number or a list. Ex) 0.75 or [0.75,0.5,0.25]
      multiscale_weights: null # Weight for each scale. should be null (for single scale) or a list matched with window/shift scale count. Ex) [0.33,0.33,0.33]
      save_embeddings: False # Save embeddings as pickle file for each audio input.
  clustering:
    parameters:
      oracle_num_speakers: False # If True, use num of speakers value provided in manifest file.
      max_num_speakers: 20 # Max number of speakers for each recording. If oracle num speakers is passed, this value is ignored.
      enhanced_count_thres: 80 # If the number of segments is lower than this number, enhanced speaker counting is activated.
      max_rp_threshold: 0.25 # Determines the range of p-value search: 0 < p <= max_rp_threshold.
      sparse_search_volume: 30 # The higher the number, the more values will be examined with more time.
      maj_vote_spk_count: False # If True, take a majority vote on multiple p-values to estimate the number of speakers.
# json manifest line example
# {"audio_filepath": "/path/to/audio_file", "offset": 0, "duration": null, "label": "infer", "text": "-", "num_speakers": null, "rttm_filepath": "/path/to/rttm/file", "uem_filepath": "/path/to/uem/filepath"}
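The manifest that manifest_filepath points to holds one JSON object per line, in the shape of the commented example at the end of the config. A small sketch for generating such a file follows. The function write_manifest is hypothetical, the field names are copied from that example, and whether null is accepted for the RTTM and UEM paths is an assumption.

```python
import json


def write_manifest(path, audio_filepaths, rttm_filepath=None, uem_filepath=None):
    """Write one JSON object per line, using the fields from the config's example."""
    with open(path, "w", encoding="utf-8") as f:
        for audio_filepath in audio_filepaths:
            entry = {
                "audio_filepath": audio_filepath,
                "offset": 0,
                "duration": None,  # serialized as null
                "label": "infer",
                "text": "-",
                "num_speakers": None,
                "rttm_filepath": rttm_filepath,  # assumption: null is allowed
                "uem_filepath": uem_filepath,    # assumption: null is allowed
            }
            f.write(json.dumps(entry) + "\n")
```

For the single file used in this report, write_manifest("manifest.json", ["audio/1.wav"]) would produce a one-line manifest.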
Can you provide a notebook to reproduce the error?
Hello! The issue was solved when I updated nemo-toolkit with the following command:
python -m pip install git+https://github.com/NVIDIA/NeMo.git@main#egg=nemo_toolkit[asr]
It allowed me to specify num_workers: 0 in the "ClusterDiarizer" config to avoid the following error, which occurs when the number of workers is not 0:
_pickle.PicklingError: Can't pickle <class 'nemo.collections.common.parts.preprocessing.collections.SpeechLabelEntity'>: attribute lookup SpeechLabelEntity on nemo.collections.common.parts.preprocessing.collections failed
After updating the toolkit, I am still getting PicklingError: Can't pickle <class 'nemo.collections.common.parts.preprocessing.collections.SpeechLabelEntity'>: attribute lookup SpeechLabelEntity on nemo.collections.common.parts.preprocessing.collections failed
Describe the bug
On attempting inference of a recording, I received the following error: PicklingError: Can't pickle <class 'nemo.collections.common.parts.preprocessing.collections.AudioTextEntity'>: attribute lookup AudioTextEntity on nemo.collections.common.parts.preprocessing.collections failed