-
I'm creating a dataset and need each segment to be a complete sentence, with no cutoffs. I can do this with forced alignment in whisperx; is it possible to get it working with this version? The speed woul…
-
Currently, I am exploring how to use `faster-whisper` for performing forced-alignment between audio and ground-truth transcription texts. I found `WhisperModel.find_alignment` available for this purpo…
-
### 🚀 The feature
Consider onboarding the aligner from [Huang et al., Less Peaky and More Accurate CTC Forced Alignment by Label Priors](https://arxiv.org/abs/2406.02560) (@huangruizhe) to the existin…
-
Enabling vectorization (see https://github.com/nod-ai/iree-amd-aie/pull/789) for convolution results in numerical failure. The values are off only slightly (although they are definitely not correct, t…
-
I'd like to get timestamps for each word in my transcript. Is it possible with `speech_recognition`?
-
* could be done with a separate align.php script that calls the relevant aligner command if an aligner is installed
* has to take the language into account, so that the correct models are used if available
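
The two points above could be sketched roughly as follows. This is an illustration in Python rather than the proposed align.php, and everything named here is an assumption: the `aligner` executable, its `--model` flag, and the `MODELS` table are hypothetical placeholders, not the interface of any real aligner.

```python
import shutil

# Hypothetical mapping from language code to aligner model name.
MODELS = {"en": "english_model", "de": "german_model"}


def build_align_command(audio_path, transcript_path, language,
                        aligner="aligner", which=shutil.which):
    """Return the argv list for the aligner, or None if it cannot be run.

    `aligner` and the `--model` flag are placeholders for whatever
    command-line aligner is actually installed.
    """
    # Only proceed if the aligner executable is found on PATH.
    if which(aligner) is None:
        return None
    # Only proceed if a model exists for the requested language.
    model = MODELS.get(language)
    if model is None:
        return None
    return [aligner, "--model", model, audio_path, transcript_path]
```

A caller would pass the returned list to `subprocess.run` and skip alignment when the function returns `None`.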
-
How can I set parameters similar to `skip_special_tokens` when generating ASR results? Additionally, does it support timestamp-level ASR results?
-
ClangIR and Vanilla LLVM have the following diff in the generated IR:
```
// LLVM-LABEL: @literals
-// LLVM: global %struct.anon {
+// LLVM: global %struct.anon.1 {
// LLVM: [10 x i8] c"1\…
```
-
Thank you so much for the work you have done on your tacotron implementation. I have a question, if I may.
I have a speech corpus with time alignments. For each audio sample, I have a file that loo…