-
Hi @hirofumi0810,
I'm trying to replicate the results of the LAS model that you shared in this [table](https://github.com/hirofumi0810/neural_sp#csj-wer).
I'm using the default [run.sh](https:…
-
The new [MusicLM](https://arxiv.org/abs/2301.11325) relies on an audio CLIP named [MuLaN](https://arxiv.org/abs/2208.12415).
I will build out an initial implementation [here](https://github.com/luci…
-
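## Resources
+ Kaggle GPU/TPU
+ Colab Pro
+ Local machine (GTX 1070)
+ I mostly train the models on this machine: 15 h per fold
## Approach
Use all of the data and treat the task as classification with a CNN model.
## What I was doing at 0.62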
[Code](https://www.kaggle.com/teyosan1229/b…
-
Hi all,
Following up on the tutorials thread [here](https://github.com/lhotse-speech/lhotse/issues/618#issuecomment-1564641754), I've written a first draft of a tutorial for using Lhotse with PyTor…
-
Hi, @usimarit
In `datasets/asr_dataset.py`, line 141, you call into line 41 of `augmentations/augmentation.py`, which runs
self.signal_augmentations = self.parse(config.pop("signal_augment", {}))
…
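
For reference, here is a minimal sketch of what that `config.pop("signal_augment", {})` call does, assuming `config` behaves like a plain Python `dict`. The keys and values below are hypothetical placeholders, not the project's real configuration.

```python
# Hypothetical config; the real keys and values come from the user's own config file.
config = {
    "signal_augment": {"hypothetical_noise": {"snr": 10}},
    "other_option": True,
}

# pop() removes the key from the dict and returns its value.
signal_cfg = config.pop("signal_augment", {})

# The key is gone now, so a second pop() falls back to the default: an empty dict.
second = config.pop("signal_augment", {})

print(signal_cfg)
print(second)
print(config)
```

So if the same `config` object is reused, any later read of `signal_augment` sees an empty dict rather than the original settings.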
-
After a performant forced alignment pipeline is done, my next thought goes to how to add gender and accent recognition.
First of all, I will assume that each segment output by the forced aligner co…
-
### Description
We used the ["ASR with Transformer" colab notebook](https://colab.research.google.com/github/tensorflow/tensor2tensor/blob/master/tensor2tensor/notebooks/asr_transformer.ipynb) which …
-
I tried to reproduce the training of the fr-en simultaneous model. I followed the instructions to prepare the dataset and ran the script train.simul-s2st.sh.
The model training seems to go fine but the …
-
Hi Khaled,
Could you please point me to where normalization is applied to the inputs (for the esc50 case or any other case)?
I am talking about the per-channel mean and std, such as written in the code bel…
-
Currently, I am trying to build a transducer-stateless recipe for icefall based on Tedlium3. This is the PR: https://github.com/k2-fsa/icefall/pull/183. It shows the concrete code for processi…