BoneGoat closed this 4 years ago
There seems to be something wrong with the multiprocessing pool. I changed the code starting at line 501 in align.py from:
pool = multiprocessing.Pool(initializer=init_stt,
                            initargs=(output_graph_path, lm_path, trie_path),
                            processes=args.stt_workers)
transcripts = list(progress(pool.imap(stt, samples), desc='Transcribing', total=len(samples)))
to:
transcripts = []
for sample in samples:
    init_stt(output_graph_path, lm_path, trie_path)
    transcripts.append(stt(sample))
And now it's no longer keeping files open. Previously, running lsof showed a large number of deleted files still held open; with this change they are gone. I'm not sure where the problem is, but at least I can continue, albeit with only one thread.
Best regards
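For anyone who wants to keep the parallelism, here is a minimal sketch of one way to make sure a pool's workers are torn down once the work is done. It is not the code from align.py and may not be what the linked PR does. init_worker, transcribe and transcribe_all are hypothetical stand-ins for init_stt, stt and the surrounding code, and the "fork" start method is an assumption that restricts the sketch to Unix-like systems.

```python
import multiprocessing

_model = None  # per-worker state, set once by the initializer


def init_worker(model_name):
    # Hypothetical stand-in for init_stt: runs once in each worker process.
    global _model
    _model = model_name


def transcribe(sample):
    # Hypothetical stand-in for stt: uses the state prepared by init_worker.
    return f"{_model}:{sample}"


def transcribe_all(samples, workers=2):
    # Assumption: the "fork" start method (Unix only) keeps the sketch simple.
    ctx = multiprocessing.get_context("fork")
    # Leaving the with-block calls pool.terminate(), so the worker processes
    # (and whatever files they hold open) do not outlive this function.
    with ctx.Pool(processes=workers,
                  initializer=init_worker,
                  initargs=("demo-model",)) as pool:
        # list() consumes the lazy imap iterator before the pool is torn down.
        return list(pool.imap(transcribe, samples))


if __name__ == "__main__":
    print(transcribe_all(["a.wav", "b.wav", "c.wav"]))
```

Unlike the single-process workaround, the initializer here runs once per worker rather than once per sample, and imap still returns results in input order.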
PR https://github.com/mozilla/DSAlign/pull/32 fixes this issue
Thanks!
Hi,
I have a catalog of over 30k files, and align.py spawns repeatedly until it crashes with a "too many open files" error:
Best regards
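A "too many open files" error generally means the process has reached its per-process descriptor limit. As a rough diagnostic, the sketch below compares the number of descriptors currently open against that limit. It assumes Linux (it reads /proc/self/fd), and open_fd_count and fd_headroom are hypothetical helper names, not part of align.py.

```python
import os
import resource


def open_fd_count():
    # Linux-specific: each entry in /proc/self/fd is one open descriptor.
    # Listing the directory briefly opens a descriptor itself, so that one
    # is included in every count.
    return len(os.listdir("/proc/self/fd"))


def fd_headroom():
    # Soft limit on open descriptors for this process (RLIMIT_NOFILE).
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return None  # no finite limit configured
    return soft_limit - open_fd_count()


if __name__ == "__main__":
    print("open descriptors:", open_fd_count())
    print("remaining before the soft limit:", fd_headroom())
```

Calling open_fd_count() periodically while samples are being processed would show whether the count keeps growing, which is the same symptom lsof reports from outside the process.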