jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper
MIT License

Regrouping Issue/Question #261

Closed by LexiconGap 11 months ago

LexiconGap commented 11 months ago

Sorry in advance, I'm probably misunderstanding the regroup feature. I'm testing the regroup feature but it doesn't seem to work how I think it should on even basic strings.

I'm assuming the following code:

import stable_whisper

model = stable_whisper.load_model('medium')
result = model.transcribe('subs.aac', regroup='sp=, /!/! ')
result.to_srt_vtt('test.srt', word_level=False)

should split up the string "Oh, is something up there?" into "Oh," and "is something up there?". It doesn't. Perhaps I misread something in the documentation but I can't seem to come up with my own regrouping methods.

jianfch commented 11 months ago

The space after , belongs to the next word, so it should be sp=,* instead of sp=, followed by a space. Plain sp=, with no trailing space should work too.
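To illustrate the point about the space, here is a minimal pure-Python sketch. It is not stable-ts code: `split_after_punctuation` is a hypothetical helper, and the word list assumes each word carries its leading space, as the comment above describes. Under that assumption, no word ever ends with a comma followed by a space, so that pattern never matches.

```python
def split_after_punctuation(words, punctuation):
    """Start a new group after every word that ends with `punctuation`.

    Hypothetical helper for illustration only; not the stable-ts implementation.
    """
    groups, current = [], []
    for word in words:
        current.append(word)
        if word.endswith(punctuation):
            groups.append("".join(current).strip())
            current = []
    if current:
        # Keep any trailing words that were not closed by a punctuation match.
        groups.append("".join(current).strip())
    return groups


# Assumed word layout: the space belongs to the start of the following word.
words = [" Oh,", " is", " something", " up", " there?"]

print(split_after_punctuation(words, ","))   # ['Oh,', 'is something up there?']
print(split_after_punctuation(words, ", "))  # ['Oh, is something up there?']
```

The second call returns a single group because the space sits at the start of " is", not at the end of " Oh,".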

LexiconGap commented 11 months ago

Change made, issue persists

import stable_whisper

model = stable_whisper.load_model('medium')
result = model.transcribe('subs.aac', regroup='sp=,* /,/!/! ')
result.to_srt_vtt('test.srt', word_level=False)

jianfch commented 11 months ago

> Change made, issue persists

Can you save the result as JSON and share it?

LexiconGap commented 11 months ago

Here you go: test.json

jianfch commented 11 months ago

It should be fixed in d51edb6ad86b06f4582f4c06fcf8a4b6dc8e0bca (2.13.7).

LexiconGap commented 11 months ago

Awesome! I'll spend tomorrow actually learning how to use GitHub and test to see if the problem is resolved. I'm assuming you probably verified it via the JSON file. Thank you!

jianfch commented 11 months ago

The fix should be available via PyPI with pip install -U stable-ts.
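After upgrading, one way to confirm that the installed release is at least 2.13.7 is to read the package metadata. This is a rough sketch using only the standard library. The helper names (`parse_version`, `has_fix`, `installed_has_fix`) are made up for illustration, and the simple numeric comparison assumes plain dotted version strings.

```python
from importlib import metadata

FIXED_IN = (2, 13, 7)  # release named in this thread as containing the fix


def parse_version(text):
    """Turn '2.13.7' into (2, 13, 7), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(version_text):
    """Return True if the given version string is 2.13.7 or newer."""
    return parse_version(version_text) >= FIXED_IN


def installed_has_fix(package="stable-ts"):
    """Check the installed package; returns None if it is not installed."""
    try:
        return has_fix(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_has_fix())
```

If this prints False, running the pip command above again should pull in the fixed release.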

LexiconGap commented 11 months ago

Amazing! Tested. Issue appears to be resolved.