Closed ivankot88 closed 7 months ago
Hi, I ran your code on several audio files and ran into this error:
Traceback (most recent call last):
  File "/servant/jobs/resources/bt1llfnt03pk6lm727mq/diarize.py", line 192, in <module>
    main(args)
  File "/servant/jobs/resources/bt1llfnt03pk6lm727mq/diarize.py", line 143, in main
    transcription.generate_word_timestamps(result_aligned["word_segments"])
  File "/servant/jobs/resources/bt1llfnt03pk6lm727mq/models.py", line 30, in generate_word_timestamps
    self.word_timestamps = filter_missing_timestamps(result_aligned)
  File "/servant/jobs/resources/bt1llfnt03pk6lm727mq/helpers.py", line 382, in filter_missing_timestamps
    ws["end"] = _get_next_start_timestamp(word_timestamps, i)
  File "/servant/jobs/resources/bt1llfnt03pk6lm727mq/helpers.py", line 353, in _get_next_start_timestamp
    if word_timestamps[next_word_index].get("start") is None:
IndexError: list index out of range
When I looked at the structure of the _get_next_start_timestamp function, I found that the while condition never takes effect, so the index can run past the end of the list.
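For illustration, here is a minimal sketch of a forward scan that bounds the index it actually increments, which is one way to avoid this kind of IndexError. It assumes each entry is a dict with an optional "start" key, as the traceback suggests. The name next_known_start is hypothetical and this is not the project's code.

```python
from typing import Dict, List, Optional


def next_known_start(word_timestamps: List[Dict], current_index: int) -> Optional[float]:
    """Return the "start" of the first later word that has one, else None.

    Hypothetical helper: the loop condition checks the same index that is
    incremented, so the scan cannot run past the end of the list.
    """
    next_index = current_index + 1
    while next_index < len(word_timestamps):
        start = word_timestamps[next_index].get("start")
        if start is not None:
            return start
        next_index += 1
    # No later word has a start timestamp; the caller must pick a fallback.
    return None


words = [
    {"word": "hello", "start": 0.0, "end": 0.4},
    {"word": "there"},  # no timestamps
    {"word": "friend"},  # last word, also no timestamps
]
print(next_known_start(words, 0))
```

Returning None instead of raising leaves the choice of fallback to the caller, which matters when the trailing words have no timestamps at all.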
Hello, this fix introduces other errors when the last word also has no timestamps. I've fixed this and other issues in 9c0ab3c, can you check?
Hello @MahmoudAshraf97 ! Thank you for all your work on this project.
This didn't seem to be worth a pull request, but I can put one in if you'd like re: 9c0ab3c1882d5481628e9b0471b0d4e646f31f2e
As committed, there is a mismatch between the parameter name initial_offset (diarize.py, diarize_parallel.py) and initial_timestamp (helpers.py). I chose to fix this by renaming the parameter to initial_timestamp in diarize.py and diarize_parallel.py, to stay consistent with final_timestamp.
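To show why the keyword names have to match, here is a rough sketch of a gap-filling function that takes initial_timestamp and final_timestamp. Only those two parameter names come from this thread. The function name fill_missing_timestamps and its body are assumptions made for illustration, not the code in helpers.py, and unlike the real helper it does not merge words.

```python
from typing import Dict, List, Optional


def fill_missing_timestamps(
    word_timestamps: List[Dict],
    initial_timestamp: float = 0.0,
    final_timestamp: Optional[float] = None,
) -> List[Dict]:
    """Hypothetical sketch: fill missing "start"/"end" values from neighbours."""
    filled = []
    previous_end = initial_timestamp
    for index, word in enumerate(word_timestamps):
        entry = dict(word)  # copy so the input list is left untouched
        if entry.get("start") is None:
            entry["start"] = previous_end
        if entry.get("end") is None:
            # Use the next known start; fall back to final_timestamp,
            # then to the word's own start if nothing else is available.
            next_start = None
            for later in word_timestamps[index + 1:]:
                if later.get("start") is not None:
                    next_start = later["start"]
                    break
            if next_start is None:
                next_start = final_timestamp if final_timestamp is not None else entry["start"]
            entry["end"] = next_start
        previous_end = entry["end"]
        filled.append(entry)
    return filled


words = [{"word": "a"}, {"word": "b", "start": 1.0, "end": 1.5}, {"word": "c"}]
result = fill_missing_timestamps(words, initial_timestamp=0.0, final_timestamp=3.0)

# A caller that still uses the old keyword name fails immediately.
try:
    fill_missing_timestamps(words, initial_offset=0.0)
    mismatch_error = None
except TypeError as exc:
    mismatch_error = exc
```

With mismatched names, the call raises a TypeError for the unexpected keyword argument instead of silently using the default.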
Thanks for noticing that, @jonsampson. Fixed in 39572386eb4170fc16440b770666f23ccf9bdc80
Hello @MahmoudAshraf97, yes, you're right. Thank you for fixing this problem.
Fix function _get_next_start_timestamp: