Open marcverhagen opened 2 years ago
For the JSON default we want the output to look like this:
"transcript": [
  [
    "Hello, this is Jim Lehrer with the NewsHour on PBS.",
    5500,
    11467
  ],
  [
    "We have exciting news about the tomato & Florida.",
    12345,
    18987
  ]
]
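For illustration only, here is a minimal Python sketch of how a list of aligned segments could be turned into the shape above. The `segments` input and the `build_transcript` helper are hypothetical names, not something taken from the repository, and the sketch assumes each segment already carries its text plus a start and an end offset.

```python
import json

# Hypothetical input: (text, start, end) tuples, using the values from the example above.
segments = [
    ("Hello, this is Jim Lehrer with the NewsHour on PBS.", 5500, 11467),
    ("We have exciting news about the tomato & Florida.", 12345, 18987),
]


def build_transcript(segments):
    """Return a dict whose "transcript" value is a list of [text, start, end] lists."""
    return {"transcript": [[text, start, end] for text, start, end in segments]}


if __name__ == "__main__":
    # json.dumps leaves the ampersand as-is; it is not HTML-escaped.
    print(json.dumps(build_transcript(segments), indent=2))
```

Using nested lists rather than objects keeps each entry as compact as in the example, at the cost of relying on position to tell text, start and end apart.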
Much of this is done in e02c422, but:
Now the output just shows something like this
And for full Kaldi output we also get a bunch of timeframes:
But these timeframes are not connected to any text sequences (basically because they were found for the wrong reason, and the script assumes they are significant in the same way as the output of the segmenter).
Instead, we need something like
I want to introduce an option that governs the granularity at which we link. Above it is sentences, but if we have no sentence information (that is, no results from fastpunct and spaCy) we degrade to the token level, and if we don't have that either we cannot give any alignments and we end up with
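
Separately from the example output, the fallback order described in this comment (sentences, then tokens, then no alignment) could be sketched in Python as follows. The function name `choose_granularity`, the level names and the boolean flags are assumptions made for illustration. They are not an existing option of the script.

```python
# Hypothetical level names, ordered from coarsest to "no alignment possible".
LEVELS = ["sentence", "token", "none"]


def choose_granularity(requested, has_sentences, has_tokens):
    """Return the finest-supported level at or below the requested one.

    has_sentences: True if sentence boundaries are available
                   (e.g. results from fastpunct and spaCy).
    has_tokens:    True if token-level timings are available.
    """
    if requested not in LEVELS:
        raise ValueError(f"unknown granularity: {requested!r}")
    available = {"sentence": has_sentences, "token": has_tokens, "none": True}
    # Walk down from the requested level until one is supported.
    for level in LEVELS[LEVELS.index(requested):]:
        if available[level]:
            return level
    return "none"  # unreachable, since "none" is always available


if __name__ == "__main__":
    print(choose_granularity("sentence", has_sentences=False, has_tokens=True))
```

Returning the level rather than the final output keeps the decision separate from how each level is serialized.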