Open Josscii opened 5 months ago
It would indeed be very useful to get TranscriptionSegment, with the word timestamps when available. Currently the TranscriptionProgress text contains raw strings such as "<|startoftranscript|><|pl|><|transcribe|><|0.00|> Jeżeli zastanawiajcie się", which aren't easy to parse.
We're currently not building the segments before a window completes, but it may be possible to return one as soon as we see two timestamp tokens surrounding text come through. Would you prefer a separate callback for this, or a configurable parameter on the existing callback, e.g. callbackInterval: .token or callbackInterval: .segment?
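To make the "two timestamp tokens surrounding text" idea concrete, here is a minimal sketch of how completed segments could be picked out of the raw decoded string. It is not WhisperKit's implementation: SketchSegment and completedSegments(in:) are hypothetical names, and the closing <|2.40|> timestamp in the usage lines is an invented value added only to close the segment.

```swift
import Foundation

// Hypothetical type for illustration; not WhisperKit's TranscriptionSegment.
struct SketchSegment: Equatable {
    let start: Double
    let end: Double
    let text: String
}

/// Emits a segment each time non-empty text is enclosed by two timestamp
/// tokens, e.g. "<|0.00|> some text<|2.40|>". Text that is still waiting
/// for its closing timestamp is not returned.
func completedSegments(in raw: String) -> [SketchSegment] {
    var segments: [SketchSegment] = []
    var pendingStart: Double? = nil
    var pendingText = ""

    // After splitting on "<|", every piece except the first looks like
    // "token|>trailing text". Anything before the first token is ignored.
    for piece in raw.components(separatedBy: "<|").dropFirst() {
        // An unterminated token (stream cut mid-token) is skipped.
        guard let close = piece.range(of: "|>") else { continue }
        let token = String(piece[..<close.lowerBound])
        let trailing = String(piece[close.upperBound...])

        let looksNumeric = !token.isEmpty && token.allSatisfy { $0.isNumber || $0 == "." }
        if looksNumeric, let time = Double(token) {
            // Timestamp token: close the pending segment, then open a new one.
            let text = pendingText.trimmingCharacters(in: .whitespaces)
            if let start = pendingStart, !text.isEmpty {
                segments.append(SketchSegment(start: start, end: time, text: text))
            }
            pendingStart = time
            pendingText = trailing
        } else {
            // Other special tokens (start, language, task): drop the token,
            // keep any text that follows it.
            pendingText += trailing
        }
    }
    return segments
}

// The raw string from the report has no closing timestamp yet.
let open = "<|startoftranscript|><|pl|><|transcribe|><|0.00|> Jeżeli zastanawiajcie się"
// Same string with an invented closing timestamp appended.
let closed = open + "<|2.40|>"

let openSegments = completedSegments(in: open)
let closedSegments = completedSegments(in: closed)
```

With this approach a segment only becomes available once its closing timestamp arrives, so a segment-level callback would fire less often than a token-level one.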
What's the relationship between the TranscriptionProgress callback and the TranscriptionSegment callback?
Would they fire at the same time? If so, they could be merged into one. If not, they should be separate.
I would prefer a separate callback that returns TranscriptionSegment structs.
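As a rough illustration of the separate-callback option, the sketch below runs a stand-in decoding loop that reports progress on every text token and reports a segment only when a closing timestamp arrives. Everything here is an assumption made for the example: SketchToken, SketchSegment, and decode(_:onProgress:onSegment:) are hypothetical and do not reflect WhisperKit's actual types or signatures.

```swift
// Hypothetical types for illustration only.
struct SketchSegment {
    let start: Double
    let end: Double
    let text: String
}

enum SketchToken {
    case timestamp(Double)
    case text(String)
}

/// Stand-in for a decoding loop with two independent callbacks:
/// onProgress fires per text token, onSegment fires per completed segment.
func decode(
    _ tokens: [SketchToken],
    onProgress: (String) -> Void,
    onSegment: (SketchSegment) -> Void
) {
    var runningText = ""
    var segmentStart: Double? = nil
    var segmentText = ""

    for token in tokens {
        switch token {
        case .text(let piece):
            runningText += piece
            segmentText += piece
            onProgress(runningText)
        case .timestamp(let time):
            // A timestamp that follows text closes the current segment.
            if let start = segmentStart, !segmentText.isEmpty {
                onSegment(SketchSegment(start: start, end: time, text: segmentText))
            }
            segmentStart = time
            segmentText = ""
        }
    }
}

var progressCalls = 0
var segments: [SketchSegment] = []

decode(
    [.timestamp(0.0), .text("Hello"), .text(" world"), .timestamp(1.5)],
    onProgress: { _ in progressCalls += 1 },
    onSegment: { segments.append($0) }
)
```

In this sketch the two callbacks fire at different times and carry different payloads, which is one argument for keeping them separate rather than merging them.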
During file transcription, it would be more convenient to get a callback with a TranscriptionSegment; the TranscriptionProgress is not that helpful.