mozilla / DSAlign

DeepSpeech based forced alignment tool
Mozilla Public License 2.0

Large catalog spawns many processes which won't die #26

Closed BoneGoat closed 4 years ago

BoneGoat commented 4 years ago

Hi,

I have a catalog of over 30k files, and align.py keeps spawning new processes until it crashes with a "Too many open files" error:

OSError: SoX failed! [Errno 24] Too many open files
ERROR:sox:OSError: SoX failed! [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/tobias/dev/git/DSAlign/align/align.py", line 689, in <module>
    main()
  File "/home/tobias/dev/git/DSAlign/align/align.py", line 499, in main
    samples = list(progress(pre_filter(), desc='VAD splitting'))
  File "/home/tobias/dev/git/DSAlign/venv/lib/python3.7/site-packages/tqdm/std.py", line 1107, in __iter__
    for obj in iterable:
  File "/home/tobias/dev/git/DSAlign/align/align.py", line 488, in pre_filter
    for i, segment in enumerate(segments):
  File "/home/tobias/dev/git/DSAlign/align/audio.py", line 225, in vad_split
    for frame_index, frame in enumerate(audio_frames):
  File "/home/tobias/dev/git/DSAlign/align/audio.py", line 200, in read_frames_from_file
    with AudioFile(audio_path, audio_format=audio_format) as wav_file:
  File "/home/tobias/dev/git/DSAlign/align/audio.py", line 173, in __enter__
    convert_audio(self.audio_path, self.tmp_file_path, file_type='wav', audio_format=self.audio_format)
  File "/home/tobias/dev/git/DSAlign/align/audio.py", line 134, in convert_audio
    transformer.build(src_audio_path, dst_audio_path)
  File "/home/tobias/dev/git/DSAlign/venv/lib/python3.7/site-packages/sox/transform.py", line 441, in build
    "Stdout: {}\nStderr: {}".format(out, err)
sox.core.SoxError: Stdout: None
Stderr: None

Every 2.0s: ps aux |grep align.py    nasapa: Sat Feb 29 19:30:43 2020

tobias   17405  1.2  0.7 2868040 127720 pts/1  Sl+  19:12   0:13 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   17432  6.2  1.7 1163416 284784 pts/1  Sl+  19:12   1:09 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   17662  6.6  1.7 1164664 287544 pts/1  Sl+  19:13   1:08 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   17801  4.1  1.7 1161720 285876 pts/1  Sl+  19:14   0:40 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   17924  8.5  1.7 1237952 291292 pts/1  Sl+  19:14   1:21 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   18072  9.9  1.7 1458364 290144 pts/1  Sl+  19:16   1:26 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   18212  1.6  1.7 1679032 288132 pts/1  Sl+  19:17   0:13 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   18331  7.1  1.7 1900204 293768 pts/1  Sl+  19:17   0:56 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   18465  1.9  1.7 2121456 289700 pts/1  Sl+  19:18   0:14 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   18583 10.0  1.8 2342652 294884 pts/1  Sl+  19:18   1:13 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   18719 14.2  1.8 2567492 301520 pts/1  Sl+  19:19   1:34 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   18872 11.7  1.8 2786148 296664 pts/1  Sl+  19:21   1:08 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   19008 13.6  1.8 2810460 297716 pts/1  Sl+  19:22   1:10 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   19145 17.2  1.8 2835508 301320 pts/1  Sl+  19:23   1:18 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   19282  8.8  1.8 2859920 299984 pts/1  Sl+  19:24   0:34 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   19402 25.4  1.8 2884740 302236 pts/1  Sl+  19:24   1:30 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   19553 21.7  1.8 2909136 304040 pts/1  Sl+  19:26   1:00 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   19677  7.7  1.8 2933740 302824 pts/1  Sl+  19:26   0:17 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   19812 43.2  1.8 2959768 308860 pts/1  Sl+  19:27   1:29 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   19837  0.2  0.0  15204  3652 pts/0    S+   19:27   0:00 watch ps aux |grep align.py
tobias   20066 77.1  1.8 2983964 307360 pts/1  Sl+  19:28   1:39 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   20395 70.2  1.8 3008568 307056 pts/1  Sl+  19:30   0:29 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   20566  113  1.8 3033172 310400 pts/1  Sl+  19:30   0:18 python /home/tobias/dev/git/DSAlign/align/align.py --output-max-cer 15 --loglevel 10 --stt-model-dir models/sv --stt-workers 1 --catalog /m
tobias   20611  0.0  0.0  15204  1156 pts/0    S+   19:30   0:00 watch ps aux |grep align.py
tobias   20612  0.0  0.0   4628   920 pts/0    S+   19:30   0:00 sh -c ps aux |grep align.py
tobias   20614  0.0  0.0  14680  1028 pts/0    S+   19:30   0:00 grep align.py
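
Errno 24 means a process has reached its per-process limit on open file descriptors. The following is a minimal, Linux-only sketch for checking how close a process is to that limit and which unlinked files it still holds open. The helper names `fd_usage` and `deleted_open_files` are hypothetical and not part of DSAlign; the sketch assumes a `/proc` filesystem is available.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit) for the current process (Linux only)."""
    # Each entry in /proc/self/fd is one open descriptor of this process.
    open_fds = len(os.listdir("/proc/self/fd"))
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_fds, soft_limit


def deleted_open_files(pid="self"):
    """List files that were unlinked but are still held open by a process."""
    fd_dir = "/proc/{}/fd".format(pid)
    suffix = " (deleted)"  # the kernel appends this to unlinked targets
    paths = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were iterating
        if target.endswith(suffix):
            paths.append(target[:-len(suffix)])
    return paths


if __name__ == "__main__":
    used, limit = fd_usage()
    print("open descriptors: {} of soft limit {}".format(used, limit))
    for path in deleted_open_files():
        print("deleted but still open:", path)
```

Passing the PID of one of the lingering align.py processes to `deleted_open_files` gives roughly the same information as filtering `lsof` output for deleted files, provided the caller is allowed to read that process's `/proc` entry.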

Best regards

BoneGoat commented 4 years ago

There seems to be something wrong with the multiprocessing pool. I changed the code at line 501 in align.py from:

pool = multiprocessing.Pool(initializer=init_stt,
                            initargs=(output_graph_path, lm_path, trie_path),
                            processes=args.stt_workers)
transcripts = list(progress(pool.imap(stt, samples), desc='Transcribing', total=len(samples)))

to:

transcripts = []
for sample in samples:
  init_stt(output_graph_path, lm_path, trie_path)
  transcripts.append(stt(sample))

And now it's no longer keeping files open. Previously, running lsof showed a large number of deleted files still held open; with this change they are released. I'm not sure where the problem is, but at least I can continue, albeit with only one thread.
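
One possible reading of these symptoms, not confirmed anywhere in this thread, is that a pool is created repeatedly and never shut down, so its worker processes and their open files accumulate. A minimal sketch of keeping the pool while closing it explicitly is shown below. `init_worker` and `transcribe` are hypothetical stand-ins for `init_stt` and `stt`, and the `"fork"` start method is an assumption made only to keep the sketch self-contained on Linux.

```python
import multiprocessing

_model = None  # per-worker state, filled in once by the initializer


def init_worker(model_name):
    """Hypothetical stand-in for init_stt: runs once in each worker process."""
    global _model
    _model = model_name


def transcribe(sample):
    """Hypothetical stand-in for stt: uses the state set up by init_worker."""
    return "{}:{}".format(_model, sample)


def transcribe_all(samples, workers=1):
    # Assumption: 'fork' start method (Linux); not taken from the issue.
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(processes=workers,
                    initializer=init_worker,
                    initargs=("sv",))
    try:
        # imap preserves input order, like the original pool.imap call
        return list(pool.imap(transcribe, samples))
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the workers to exit and release their files


if __name__ == "__main__":
    print(transcribe_all(["a.wav", "b.wav"], workers=2))
    # After transcribe_all returns, no worker processes should remain.
    print(multiprocessing.active_children())
```

Compared with the sequential loop, this keeps parallel transcription and runs the initializer once per worker rather than once per sample. Whether it addresses the actual cause in align.py is not established here.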

Best regards

BoneGoat commented 4 years ago

PR https://github.com/mozilla/DSAlign/pull/32 fixes this issue.

tilmankamp commented 4 years ago

Thanks!