The text cleaning and STT steps went fine; however, when the alignment starts I get a StopIteration error in align.py. Complete output:
DEBUG:root:Start
DEBUG:root:Looking for model files in "deepspeech-0.5.1-models"...
DEBUG:root:Loading alphabet from "deepspeech-0.5.1-models/alphabet.txt"...
DEBUG:root:Loading acoustic model from "deepspeech-0.5.1-models/output_graph.pb", alphabet from "deepspeech-0.5.1-models/alphabet.txt" and language model from "deepspeech-0.5.1-models/lm.binary"...
DEBUG:root:Transcribing VAD segments...
DEBUG:pydub.converter:subprocess.call(['ffmpeg', '-y', '-i', 'data/test2/asyoulikeit_0_shakespeare_64kb.mp3', '-acodec', 'pcm_s16le', '-vn', '-f', 'wav', '-'])
VAD splitting: 55it [00:00, 1000.69it/s]
DEBUG:root:Process 41639: Loaded models
TensorFlow: v1.13.1-10-g3e0cc53
DeepSpeech: v0.5.1-0-g4b29b78
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2019-10-07 12:35:37.356386: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Transcribing: 0%| | 0/55 [00:00<?, ?it/s]2019-10-07 12:35:38.310379: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "CPU"') for unknown op: UnwrapDatasetVariant
2019-10-07 12:35:38.310440: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: WrapDatasetVariant
2019-10-07 12:35:38.310458: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "CPU"') for unknown op: WrapDatasetVariant
2019-10-07 12:35:38.310672: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: UnwrapDatasetVariant
DEBUG:root:Process 41639: Transcribing...
2019-10-07 12:35:43.184433: W tensorflow/core/framework/allocator.cc:124] Allocation of 134217728 exceeds 10% of system memory.
2019-10-07 12:35:43.666503: W tensorflow/core/framework/allocator.cc:124] Allocation of 134217728 exceeds 10% of system memory.
2019-10-07 12:35:44.440064: W tensorflow/core/framework/allocator.cc:124] Allocation of 134217728 exceeds 10% of system memory.
2019-10-07 12:35:44.542822: W tensorflow/core/framework/allocator.cc:124] Allocation of 134217728 exceeds 10% of system memory.
2019-10-07 12:35:44.646766: W tensorflow/core/framework/allocator.cc:124] Allocation of 134217728 exceeds 10% of system memory.
DEBUG:root:Process 41639: as you like it by william shakespeare
DEBUG:root:Process 41639: Transcribing...
Transcribing: 2%|██▎ | 1/55 [00:10<09:17, 10.32s/it]DEBUG:root:Process 41639: this is a liberator ing
DEBUG:root:Process 41639: Transcribing...
Transcribing: 4%|████▋ | 2/55 [00:11<06:44, 7.64s/it]DEBUG:root:Process 41639: all over but recording or in the public domain
DEBUG:root:Process 41639: Transcribing...
Transcribing: 5%|██████▉ | 3/55 [00:13<05:11, 5.99s/it]DEBUG:root:Process 41639: for more information or to volunteer
DEBUG:root:Process 41639: Transcribing...
Transcribing: 7%|█████████▎ | 4/55 [00:15<04:03, 4.78s/it]DEBUG:root:Process 41639: is it liberal
DEBUG:root:Process 41639: Transcribing...
Transcribing: 9%|███████████▋ | 5/55 [00:17<03:05, 3.72s/it]DEBUG:root:Process 41639:
DEBUG:root:Process 41639: Transcribing...
Transcribing: 11%|█████████████▉ | 6/55 [00:17<02:18, 2.83s/it]DEBUG:root:Process 41639: dramatis personae
DEBUG:root:Process 41639: Transcribing...
Transcribing: 13%|████████████████▎ | 7/55 [00:19<01:54, 2.39s/it]DEBUG:root:Process 41639: du seen the red by heavy
DEBUG:root:Process 41639: Transcribing...
Transcribing: 15%|██████████████████▌ | 8/55 [00:21<01:54, 2.43s/it]DEBUG:root:Process 41639: to frederick
DEBUG:root:Process 41639: Transcribing...
Transcribing: 16%|████████████████████▉ | 9/55 [00:22<01:32, 2.01s/it]DEBUG:root:Process 41639: by
DEBUG:root:Process 41639: Transcribing...
Transcribing: 18%|███████████████████████ | 10/55 [00:23<01:13, 1.64s/it]DEBUG:root:Process 41639: sugah
DEBUG:root:Process 41639: Transcribing...
Transcribing: 20%|█████████████████████████▍ | 11/55 [00:24<01:03, 1.45s/it]DEBUG:root:Process 41639: he means read by cecilia prior
DEBUG:root:Process 41639: Transcribing...
Transcribing: 22%|███████████████████████████▋ | 12/55 [00:26<01:11, 1.66s/it]DEBUG:root:Process 41639: jack was
DEBUG:root:Process 41639: Transcribing...
Transcribing: 24%|██████████████████████████████ | 13/55 [00:27<01:01, 1.47s/it]DEBUG:root:Process 41639: read by elizabeth let
DEBUG:root:Process 41639: Transcribing...
Transcribing: 25%|████████████████████████████████▎ | 14/55 [00:29<00:59, 1.44s/it]DEBUG:root:Process 41639: labo
DEBUG:root:Process 41639: Transcribing...
Transcribing: 27%|██████████████████████████████████▋ | 15/55 [00:29<00:51, 1.29s/it]DEBUG:root:Process 41639: red by simon lover
DEBUG:root:Process 41639: Transcribing...
Transcribing: 29%|████████████████████████████████████▉ | 16/55 [00:31<00:55, 1.42s/it]DEBUG:root:Process 41639: charles
DEBUG:root:Process 41639: Transcribing...
Transcribing: 31%|███████████████████████████████████████▎ | 17/55 [00:32<00:46, 1.22s/it]DEBUG:root:Process 41639: read by me on my
DEBUG:root:Process 41639: Transcribing...
Transcribing: 33%|█████████████████████████████████████████▌ | 18/55 [00:33<00:47, 1.28s/it]DEBUG:root:Process 41639: and even
DEBUG:root:Process 41639: Transcribing...
Transcribing: 35%|███████████████████████████████████████████▊ | 19/55 [00:34<00:41, 1.15s/it]DEBUG:root:Process 41639: red bateiseki
DEBUG:root:Process 41639: Transcribing...
Transcribing: 36%|██████████████████████████████████████████████▏ | 20/55 [00:36<00:44, 1.28s/it]DEBUG:root:Process 41639: jake was the air
DEBUG:root:Process 41639: Transcribing...
Transcribing: 38%|████████████████████████████████████████████████▍ | 21/55 [00:37<00:43, 1.28s/it]DEBUG:root:Process 41639: red by david lawrence
DEBUG:root:Process 41639: Transcribing...
Transcribing: 40%|██████████████████████████████████████████████████▊ | 22/55 [00:38<00:43, 1.31s/it]DEBUG:root:Process 41639: part of orlando
DEBUG:root:Process 41639: Transcribing...
Transcribing: 42%|█████████████████████████████████████████████████████ | 23/55 [00:40<00:42, 1.33s/it]DEBUG:root:Process 41639: by m b
Transcribing: 44%|███████████████████████████████████████████████████████▍ | 24/55 [00:41<00:42, 1.36s/it]DEBUG:root:Process 41639: Transcribing...
DEBUG:root:Process 41639: adam
DEBUG:root:Process 41639: Transcribing...
Transcribing: 45%|█████████████████████████████████████████████████████████▋ | 25/55 [00:42<00:34, 1.17s/it]DEBUG:root:Process 41639: the papeete
DEBUG:root:Process 41639: Transcribing...
Transcribing: 47%|████████████████████████████████████████████████████████████ | 26/55 [00:44<00:39, 1.37s/it]DEBUG:root:Process 41639: denis
DEBUG:root:Process 41639: Transcribing...
Transcribing: 49%|██████████████████████████████████████████████████████████████▎ | 27/55 [00:45<00:32, 1.17s/it]DEBUG:root:Process 41639: red by rosemont
DEBUG:root:Process 41639: Transcribing...
Transcribing: 51%|████████████████████████████████████████████████████████████████▋ | 28/55 [00:46<00:32, 1.20s/it]DEBUG:root:Process 41639: touched down played by mark smith
DEBUG:root:Process 41639: Transcribing...
Transcribing: 53%|██████████████████████████████████████████████████████████████████▉ | 29/55 [00:48<00:38, 1.48s/it]DEBUG:root:Process 41639: line for sir oliver mar text read by rondelet
DEBUG:root:Process 41639: Transcribing...
Transcribing: 55%|█████████████████████████████████████████████████████████████████████▎ | 30/55 [00:51<00:51, 2.07s/it]DEBUG:root:Process 41639: saint louis missouri
DEBUG:root:Process 41639: Transcribing...
Transcribing: 56%|███████████████████████████████████████████████████████████████████████▌ | 31/55 [00:53<00:43, 1.82s/it]DEBUG:root:Process 41639: corin read by beladen
DEBUG:root:Process 41639: Transcribing...
Transcribing: 58%|█████████████████████████████████████████████████████████████████████████▉ | 32/55 [00:55<00:43, 1.90s/it]DEBUG:root:Process 41639: out of
DEBUG:root:Process 41639: Transcribing...
Transcribing: 60%|████████████████████████████████████████████████████████████████████████████▏ | 33/55 [00:56<00:35, 1.61s/it]DEBUG:root:Process 41639: sylvie
DEBUG:root:Process 41639: Transcribing...
Transcribing: 62%|██████████████████████████████████████████████████████████████████████████████▌ | 34/55 [00:56<00:28, 1.37s/it]DEBUG:root:Process 41639: red by
DEBUG:root:Process 41639: Transcribing...
Transcribing: 64%|████████████████████████████████████████████████████████████████████████████████▊ | 35/55 [00:57<00:24, 1.22s/it]DEBUG:root:Process 41639: david's nickel
DEBUG:root:Process 41639: Transcribing...
Transcribing: 65%|███████████████████████████████████████████████████████████████████████████████████▏ | 36/55 [00:58<00:21, 1.11s/it]DEBUG:root:Process 41639: william read by even but in our
Transcribing: 67%|█████████████████████████████████████████████████████████████████████████████████████▍ | 37/55 [01:01<00:27, 1.55s/it]DEBUG:root:Process 41639: Transcribing...
DEBUG:root:Process 41639: i mind read by lorella anderson
DEBUG:root:Process 41639: Transcribing...
Transcribing: 69%|███████████████████████████████████████████████████████████████████████████████████████▋ | 38/55 [01:03<00:31, 1.84s/it]DEBUG:root:Process 41639: the part of rosalind
DEBUG:root:Process 41639: Transcribing...
Transcribing: 71%|██████████████████████████████████████████████████████████████████████████████████████████ | 39/55 [01:05<00:27, 1.73s/it]DEBUG:root:Process 41639: read by rosalind will
DEBUG:root:Process 41639: Transcribing...
Transcribing: 73%|████████████████████████████████████████████████████████████████████████████████████████████▎ | 40/55 [01:06<00:24, 1.63s/it]DEBUG:root:Process 41639: sea read by felipe
DEBUG:root:Process 41639: Transcribing...
Transcribing: 75%|██████████████████████████████████████████████████████████████████████████████████████████████▋ | 41/55 [01:08<00:22, 1.63s/it]DEBUG:root:Process 41639: see
DEBUG:root:Process 41639: Transcribing...
Transcribing: 76%|████████████████████████████████████████████████████████████████████████████████████████████████▉ | 42/55 [01:08<00:17, 1.36s/it]DEBUG:root:Process 41639: red by charlie veemeth
DEBUG:root:Process 41639: Transcribing...
Transcribing: 78%|███████████████████████████████████████████████████████████████████████████████████████████████████▎ | 43/55 [01:10<00:17, 1.46s/it]DEBUG:root:Process 41639: are
DEBUG:root:Process 41639: Transcribing...
Transcribing: 80%|█████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 44/55 [01:11<00:14, 1.29s/it]DEBUG:root:Process 41639: tad by mandy eh
DEBUG:root:Process 41639: Transcribing...
Transcribing: 82%|███████████████████████████████████████████████████████████████████████████████████████████████████████▉ | 45/55 [01:13<00:13, 1.38s/it]DEBUG:root:Process 41639: first lord
DEBUG:root:Process 41639: Transcribing...
Transcribing: 84%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▏ | 46/55 [01:14<00:11, 1.30s/it]DEBUG:root:Process 41639: red by ananus
DEBUG:root:Process 41639: Transcribing...
Transcribing: 85%|████████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 47/55 [01:15<00:11, 1.38s/it]DEBUG:root:Process 41639: second lord
DEBUG:root:Process 41639: Transcribing...
Transcribing: 87%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▊ | 48/55 [01:16<00:08, 1.27s/it]DEBUG:root:Process 41639: red by david lawrence
DEBUG:root:Process 41639: Transcribing...
Transcribing: 89%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏ | 49/55 [01:18<00:08, 1.33s/it]DEBUG:root:Process 41639: first page read by ruth golding
DEBUG:root:Process 41639: Transcribing...
Transcribing: 91%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 50/55 [01:20<00:08, 1.64s/it]DEBUG:root:Process 41639: second page
DEBUG:root:Process 41639: Transcribing...
Transcribing: 93%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊ | 51/55 [01:21<00:05, 1.48s/it]DEBUG:root:Process 41639: read by david a canal
DEBUG:root:Process 41639: Transcribing...
Transcribing: 95%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 52/55 [01:23<00:04, 1.44s/it]DEBUG:root:Process 41639: the forester played by jack in
DEBUG:root:Process 41639: Transcribing...
Transcribing: 96%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 53/55 [01:25<00:03, 1.65s/it]DEBUG:root:Process 41639: stage directions read by marian waldon
Transcribing: 98%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋ | 54/55 [01:27<00:01, 1.96s/it]DEBUG:root:Process 41639: Transcribing...
DEBUG:root:Process 41639: and a dramatis persona
Transcribing: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 55/55 [01:29<00:00, 1.63s/it]
DEBUG:root:Excluded 0 empty transcripts
DEBUG:root:Writing transcription log to file "data/test2/tlog.tlog"...
DEBUG:root:Loading script from data/test2/transcript.script...
Aligning: 0%| | 0/1 [00:00<?, ?it/s]DEBUG:root:Loading transcription log from data/test2/tlog.tlog...
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/people/lerner/DSAlign/align/text.py", line 163, in ngrams
    raise StopIteration
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/people/lerner/anaconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/people/lerner/DSAlign/align/align.py", line 146, in align
    matched_fragments = list(filter(lambda f: f is not None, matched_fragments))
  File "/people/lerner/DSAlign/align/align.py", line 136, in split_match
    for f in split_match(fragments[0:index], start=start, end=match_start):
  File "/people/lerner/DSAlign/align/align.py", line 136, in split_match
    for f in split_match(fragments[0:index], start=start, end=match_start):
  File "/people/lerner/DSAlign/align/align.py", line 136, in split_match
    for f in split_match(fragments[0:index], start=start, end=match_start):
  File "/people/lerner/DSAlign/align/align.py", line 129, in split_match
    match = search.find_best(fragment['transcript'], start=start, end=end)
  File "/people/lerner/DSAlign/align/search.py", line 88, in find_best
    for i, ngram in enumerate(ngrams(' ' + look_for + ' ', 3)):
RuntimeError: generator raised StopIteration
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/people/lerner/DSAlign/align/align.py", line 653, in <module>
    main()
  File "/people/lerner/DSAlign/align/align.py", line 639, in main
    total=len(to_align)):
  File "/people/lerner/DSAlign/venv/lib/python3.7/site-packages/tqdm/std.py", line 1081, in __iter__
    for obj in iterable:
  File "/people/lerner/anaconda3/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
RuntimeError: generator raised StopIteration
Aligning: 0%| | 0/1 [00:00<?, ?it/s]
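For what it's worth, the traceback looks consistent with the PEP 479 change that is on by default in Python 3.7: a StopIteration raised inside a generator body is converted into "RuntimeError: generator raised StopIteration". Below is a minimal sketch of that behavior, not the actual code from align/text.py. The functions ngrams_raising, ngrams_returning and outcome are hypothetical stand-ins I made up for illustration, and the "input shorter than the n-gram size" condition is an assumption about what triggers the raise.

```python
def ngrams_raising(text, size):
    """Hypothetical stand-in: ends the generator early with `raise StopIteration`."""
    if len(text) < size:
        # On Python 3.7+ this surfaces to the caller as RuntimeError (PEP 479).
        raise StopIteration
    for i in range(len(text) - size + 1):
        yield text[i:i + size]


def ngrams_returning(text, size):
    """Same logic, but ends the generator early with a plain `return`."""
    if len(text) < size:
        return
    for i in range(len(text) - size + 1):
        yield text[i:i + size]


def outcome(generator_function, text, size):
    """Consume the generator and report either its items or the RuntimeError."""
    try:
        return list(generator_function(text, size))
    except RuntimeError as error:
        return "RuntimeError: " + str(error)


if __name__ == "__main__":
    print(outcome(ngrams_raising, "ab", 3))    # input shorter than a trigram
    print(outcome(ngrams_returning, "ab", 3))  # same input, `return` variant
    print(outcome(ngrams_returning, " by ", 3))
```

If that is indeed the cause, replacing the raise with a plain return in the generator would presumably be enough on Python 3.7, but I have not verified this against the DSAlign code.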
Thanks in advance for your help, this tool looks very promising :)
The command that produced the output above:
(DSAlign) (deepspeech-gpu-venv) lerner@m148:~/DSAlign$ bin/align.sh --output-max-cer 15 --loglevel 10 --audio data/test2/asyoulikeit_0_shakespeare_64kb.mp3 --script data/test2/transcript.script --aligned data/test2/aligned.json --tlog data/test2/tlog.tlog --stt-workers 1 --stt-model-dir deepspeech-0.5.1-models --stt-no-own-lm