Closed by LuluW8071 1 week ago
Network architecture: 1 Conv1D layer -> MLP -> 2 LSTM layers
The model started to decode after the MelSpec transform was changed to (sample_rate=16000, n_mels=128, hop_length=350, n_fft=1024)
Performed better with a learning rate of 3e-4
Changed to BiLSTM for faster convergence
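To make the MelSpec settings above concrete, here is a small arithmetic sketch of the feature shape they imply. It assumes a centered STFT (one frame per hop plus one), which is the usual default in PyTorch-style spectrogram transforms. The helper name `mel_output_shape` is hypothetical and not part of this repository.

```python
def mel_output_shape(n_samples, n_mels=128, hop_length=350, n_fft=1024):
    """Return (n_freq_bins, n_mels, n_frames) for a centered STFT.

    Hypothetical helper for illustration; the frame formula assumes
    center=True padding, i.e. n_frames = n_samples // hop_length + 1.
    """
    n_freq_bins = n_fft // 2 + 1            # one-sided FFT bins before the mel filterbank
    n_frames = n_samples // hop_length + 1  # frames produced along the time axis
    return n_freq_bins, n_mels, n_frames


sample_rate = 16000
bins, mels, frames = mel_output_shape(sample_rate)  # one second of audio

hop_ms = 1000 * 350 / sample_rate      # time step between frames, in ms
window_ms = 1000 * 1024 / sample_rate  # analysis window length, in ms

print(bins, mels, frames)  # 513 128 46
print(hop_ms, window_ms)   # 21.875 64.0
```

With these settings each second of 16 kHz audio becomes roughly 46 frames of 128 mel features, so the sequence the LSTM layers see is fairly short relative to the raw waveform.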
A Base Model for ASR
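The notes above describe the stack only at a high level, so the following is a minimal PyTorch sketch of a Conv1D -> MLP -> 2-layer BiLSTM model, not the repository's actual code. The class name `BaseASR`, the hidden size, kernel size, stride, activation functions, the 29-class output layer, and the choice of Adam are all assumptions made for illustration. Only `n_mels=128`, the two recurrent layers, the bidirectional LSTM, and the 3e-4 learning rate come from the notes.

```python
import torch
import torch.nn as nn


class BaseASR(nn.Module):
    """Illustrative Conv1D -> MLP -> 2x (Bi)LSTM stack; sizes are assumptions."""

    def __init__(self, n_mels=128, hidden=256, n_classes=29, bidirectional=True):
        super().__init__()
        # Single 1D convolution over time; stride 2 halves the frame count (assumed).
        self.conv = nn.Conv1d(n_mels, hidden, kernel_size=3, stride=2, padding=1)
        # Small per-frame MLP (assumed two linear layers with GELU).
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.GELU(),
            nn.Linear(hidden, hidden),
        )
        # Two stacked LSTM layers, bidirectional as in the notes.
        self.lstm = nn.LSTM(
            hidden, hidden, num_layers=2,
            batch_first=True, bidirectional=bidirectional,
        )
        out_dim = hidden * (2 if bidirectional else 1)
        # Per-frame classifier over an assumed character vocabulary.
        self.head = nn.Linear(out_dim, n_classes)

    def forward(self, x):
        # x: (batch, n_mels, time)
        x = torch.relu(self.conv(x))  # (batch, hidden, time')
        x = x.transpose(1, 2)         # (batch, time', hidden)
        x = self.mlp(x)
        x, _ = self.lstm(x)           # (batch, time', out_dim)
        return self.head(x)           # (batch, time', n_classes)


model = BaseASR()
# Learning rate from the notes; the optimizer choice is an assumption.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

mels = torch.randn(4, 128, 200)  # dummy batch: 4 clips, 128 mel bins, 200 frames
logits = model(mels)
print(tuple(logits.shape))       # (4, 100, 29)
```

Setting `bidirectional=False` gives the earlier unidirectional variant; the head width adjusts automatically because `out_dim` depends on that flag.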