yl4579 / AuxiliaryASR

Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)
MIT License
111 stars 30 forks

get error #7

Closed MMMMichaelzhang closed 2 years ago

MMMMichaelzhang commented 2 years ago

[train]: 24%|██▍ | 16/66 [00:04<00:15, 3.20it/s]
Traceback (most recent call last):
  File "/home/mike/PycharmProjects/AuxiliaryASR/train.py", line 116, in <module>
    main()
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/mike/PycharmProjects/AuxiliaryASR/train.py", line 98, in main
    train_results = trainer._train_epoch()
  File "/home/mike/PycharmProjects/AuxiliaryASR/trainer.py", line 186, in _train_epoch
    for train_steps_per_epoch, batch in enumerate(tqdm(self.train_dataloader, desc="[train]"), 1):
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
    return self._process_data(data)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/mike/PycharmProjects/AuxiliaryASR/meldataset.py", line 60, in __getitem__
    wave, text_tensor, speaker_id = self._load_tensor(data)
  File "/home/mike/PycharmProjects/AuxiliaryASR/meldataset.py", line 78, in _load_tensor
    speaker_id = int(speaker_id)
ValueError: invalid literal for int() with base 10: ''

my train_list:

/media/mike/yys/data_asr/SSB00800056.wav|wo men can jia guo xu duo zhong da huo dong de biao yan|0
/media/mike/yys/data_asr/SSB00050001.wav|guang zhou nv da xue sheng deng shan shi lian si tian jing fang zhao dao yi si nv shi|0
/media/mike/yys/data_asr/SSB00050002.wav|zhun zhong ke xue gui lv de yao qiu|0
/media/mike/yys/data_asr/SSB00050003.wav|qi lu wu ren shou piao|0
..

MMMMichaelzhang commented 2 years ago

meldataset.py:

#coding: utf-8

import os
import os.path as osp
import time
import random
import numpy as np
import random
import soundfile as sf

import torch
from torch import nn
import torch.nn.functional as F
import torchaudio
from torch.utils.data import DataLoader

from g2pM import G2pM

import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
from text_utils import TextCleaner
np.random.seed(1)
random.seed(1)
DEFAULT_DICT_PATH = osp.join(osp.dirname(__file__), 'word_index_dict.txt') #'kata_dict.csv')
#
SPECT_PARAMS = {
    "n_fft": 2048,
    "win_length": 1200,
    "hop_length": 300
}
MEL_PARAMS = {
    "n_mels": 80,
}

class MelDataset(torch.utils.data.Dataset):
    def __init__(self,
                 data_list,
                 dict_path=DEFAULT_DICT_PATH,
                 sr=24000
                ):

        spect_params = SPECT_PARAMS
        mel_params = MEL_PARAMS

        _data_list = [l[:-1].split('|') for l in data_list]
        self.data_list = [data if len(data) == 3 else (*data, 0) for data in _data_list]
        self.text_cleaner = TextCleaner(dict_path)
        self.sr = sr

        self.to_melspec = torchaudio.transforms.MelSpectrogram(**MEL_PARAMS)
        self.mean, self.std = -4, 4

        self.g2p = G2pM()

    def __len__(self):
        return len(self.data_list)

    def __getitem__(self, idx):
        data = self.data_list[idx]
        wave, text_tensor, speaker_id = self._load_tensor(data)
        wave_tensor = torch.from_numpy(wave).float()
        mel_tensor = self.to_melspec(wave_tensor)

        if (text_tensor.size(0)+1) >= (mel_tensor.size(1) // 3):
            mel_tensor = F.interpolate(
                mel_tensor.unsqueeze(0), size=(text_tensor.size(0)+1)*3, align_corners=False,
                mode='linear').squeeze(0)

        acoustic_feature = (torch.log(1e-5 + mel_tensor) - self.mean)/self.std

        length_feature = acoustic_feature.size(1)
        acoustic_feature = acoustic_feature[:, :(length_feature - length_feature % 2)]

        return wave_tensor, acoustic_feature, text_tensor, data[0]

    def _load_tensor(self, data):
        wave_path, text, speaker_id = data
        speaker_id = int(speaker_id)

        wave, sr = sf.read(wave_path)

        # phonemize the text
        ps = self.g2p(text.replace('-', ' '))
        if "'" in ps:
            ps.remove("'")
        text = self.text_cleaner(ps)
        blank_index = self.text_cleaner.word_index_dictionary[" "]
        text.insert(0, blank_index) # add a blank at the beginning (silence)
        text.append(blank_index) # add a blank at the end (silence)

        text = torch.LongTensor(text)

        return wave, text, speaker_id

class Collater(object):
    """
    Args:
      return_wave (bool): if true, will return the wave data along with spectrogram.
    """

    def __init__(self, return_wave=False):
        self.text_pad_index = 0
        self.return_wave = return_wave

    def __call__(self, batch):
        batch_size = len(batch)

        # sort by mel length
        lengths = [b[1].shape[1] for b in batch]
        batch_indexes = np.argsort(lengths)[::-1]
        batch = [batch[bid] for bid in batch_indexes]

        nmels = batch[0][1].size(0)
        max_mel_length = max([b[1].shape[1] for b in batch])
        max_text_length = max([b[2].shape[0] for b in batch])

        mels = torch.zeros((batch_size, nmels, max_mel_length)).float()
        texts = torch.zeros((batch_size, max_text_length)).long()
        input_lengths = torch.zeros(batch_size).long()
        output_lengths = torch.zeros(batch_size).long()
        paths = ['' for _ in range(batch_size)]
        for bid, (_, mel, text, path) in enumerate(batch):
            mel_size = mel.size(1)
            text_size = text.size(0)
            mels[bid, :, :mel_size] = mel
            texts[bid, :text_size] = text
            input_lengths[bid] = text_size
            output_lengths[bid] = mel_size
            paths[bid] = path
            assert(text_size < (mel_size//2))

        if self.return_wave:
            waves = [b[0] for b in batch]
            return texts, input_lengths, mels, output_lengths, paths, waves

        return texts, input_lengths, mels, output_lengths

def build_dataloader(path_list,
                     validation=False,
                     batch_size=4,
                     num_workers=1,
                     device='cpu',
                     collate_config={},
                     dataset_config={}):

    dataset = MelDataset(path_list, **dataset_config)
    collate_fn = Collater(**collate_config)
    data_loader = DataLoader(dataset,
                             batch_size=batch_size,
                             shuffle=(not validation),
                             num_workers=num_workers,
                             drop_last=(not validation),
                             collate_fn=collate_fn,
                             pin_memory=(device != 'cpu'))

    return data_loader

Kristopher-Chen commented 2 years ago

hi, what does your dict table for Mandarin look like?

MMMMichaelzhang commented 2 years ago

word_index_dict.txt @Kristopher-Chen

Kristopher-Chen commented 2 years ago

wo men can jia guo xu duo zhong da huo dong de biao yan

Thank you! BTW, if the input to G2pM is pinyin, it seems the output is also pinyin. How will it be converted to the phonemes in the dict?

MMMMichaelzhang commented 2 years ago

word_index_dict.txt maybe like this? @Kristopher-Chen

Kristopher-Chen commented 2 years ago

word_index_dict.txt maybe like this? @Kristopher-Chen

This may be one way to train, with both the input and the dict in pinyin format. I just wonder whether the encoder output would then be similar to the author's results, or to a model trained on Chinese characters. It seems that for Chinese ASR, Chinese characters, pinyin, and phonemes are all acceptable training targets. Maybe someone with more experience can explain the difference. @yl4579 Is it OK to train on pinyin, or is it better to train on phonemes? If phonemes, is there any tool that converts pinyin directly into phonemes?
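On the question of converting pinyin into smaller units: one common option is to split each syllable into its initial and final, which needs no external tool. The sketch below only illustrates that idea under stated assumptions. `split_pinyin`, `to_units`, and the `INITIALS` table are hypothetical names, this is not confirmed to be what the repository author used, and tone digits, if present, would need separate handling.

```python
# Pinyin initials; two-letter initials come first so that 'zh' wins over 'z'.
# 'y' and 'w' are treated as initials here for simplicity (an assumption).
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_pinyin(syllable):
    """Return [initial, final], or [syllable] when it has no initial."""
    for initial in INITIALS:
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return [initial, syllable[len(initial):]]
    return [syllable]


def to_units(text):
    """Split a space-separated pinyin transcript into initial/final units."""
    units = []
    for syllable in text.split():
        units.extend(split_pinyin(syllable))
    return units


units = to_units("wo men can jia")
```

With this kind of split, the dictionary file would need one entry per initial and per final instead of one entry per syllable.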

MMMMichaelzhang commented 2 years ago

How should I change meldataset.py? I still get the error. @Kristopher-Chen

Kristopher-Chen commented 2 years ago

speaker_id = int(speaker_id)

it seems there is something wrong with your speaker label in the train list...
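The `ValueError` comes from `int('')`, so at least one entry reaches `_load_tensor` with an empty speaker field. One way this can happen (an assumption, since the full train list is not shown in this thread) is that `l[:-1]` in `MelDataset.__init__` removes the last character of a line that has no trailing newline, which turns `...|0` into `...|`. A minimal sketch of a stricter parser follows; `parse_train_list` is a hypothetical helper, not part of the repository.

```python
def parse_train_list(lines, default_speaker=0):
    """Parse 'path|text|speaker' lines; return (entries, problems)."""
    entries, problems = [], []
    for lineno, raw in enumerate(lines, 1):
        # strip() removes '\n', '\r' and stray spaces; raw[:-1] removes
        # whatever the last character happens to be.
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split('|')
        if len(fields) == 2:
            fields.append(str(default_speaker))  # no speaker column given
        if len(fields) != 3:
            problems.append((lineno, raw))
            continue
        path, text, speaker = (f.strip() for f in fields)
        if not speaker.isdigit():
            problems.append((lineno, raw))  # empty or non-numeric speaker id
            continue
        entries.append((path, text, int(speaker)))
    return entries, problems


lines = [
    "a.wav|wo men|0\n",
    "b.wav|qi lu wu ren shou piao|0",  # last line without a trailing newline
    "c.wav|zhun zhong|\n",             # empty speaker field
]
# The slicing used in MelDataset drops the '0' from the second line:
sliced = [l[:-1].split('|') for l in lines]
entries, problems = parse_train_list(lines)
```

Printing `problems` for the real train list would show which line numbers carry an empty or malformed speaker ID.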

MMMMichaelzhang commented 2 years ago

I changed it like this:

def _load_tensor(self, data):
    wave_path, text, speaker_id = data
    speaker_id = 0
    wave, sr = sf.read(wave_path)

    # phonemize the text
    ps = text.split(" ")
    if "'" in ps:
        ps.remove("'")
    text = self.text_cleaner(ps)
    blank_index = self.text_cleaner.word_index_dictionary[" "]
    text.insert(0, blank_index)  # add a blank at the beginning (silence)
    text.append(blank_index)  # add a blank at the end (silence)
    text = torch.LongTensor(text)

    return wave, text, speaker_id

I didn't use g2p; I only split the text into an array, like this: ['zhi', 'ye', 'lian','sai',....]. Then I ran train.py and got lots of nan:
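Because the text is now split on spaces and passed straight to `TextCleaner`, every syllable has to exist as a key in `word_index_dict.txt`. A quick check is to collect the train-list tokens that are missing from the dictionary. This is a sketch under assumptions: `missing_tokens` is a hypothetical helper, and the dictionary is represented as a plain `dict` because the file's format is not shown in this thread. Since `nan` is itself a valid pinyin syllable, it may also be worth ruling out that the printed `nan` values are tokens rather than numeric NaNs.

```python
from collections import Counter


def missing_tokens(train_lines, word_index):
    """Count transcript tokens that have no entry in word_index."""
    missing = Counter()
    for line in train_lines:
        fields = line.strip().split('|')
        if len(fields) < 2:
            continue  # not a 'path|text|speaker' line
        for token in fields[1].split():
            if token not in word_index:
                missing[token] += 1
    return missing


# Toy data; a real run would load the train list and the dictionary from disk.
word_index = {" ": 0, "zhi": 1, "ye": 2, "lian": 3, "sai": 4}
train_lines = ["a.wav|zhi ye lian sai|0\n", "b.wav|nan zhi nan|0\n"]
missing = missing_tokens(train_lines, word_index)
```

An empty result means every token is covered; anything else lists the syllables to add to the dictionary.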

/home/mike/anaconda3/envs/asr/bin/python /home/mike/PycharmProjects/AuxiliaryASR/train.py
{'max_lr': 0.0005, 'pct_start': 0.0, 'epochs': 200, 'steps_per_epoch': 5}
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan
/home/mike/PycharmProjects/AuxiliaryASR/trainer.py:158: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  mel_input_length = mel_input_length // (2 ** self.model.n_down)
[train]: 100%|██████████| 5/5 [00:02<00:00, 2.19it/s]
[eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan
/home/mike/PycharmProjects/AuxiliaryASR/trainer.py:203: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
mel_input_length = mel_input_length // (2 self.model.n_down) [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.65it/s] --- epoch 1 --- train/loss : 75.0535 train/ctc : 69.0236 train/s2s : 6.0298 train/learning_rate: 0.0005 eval/ctc : 6.2143 eval/s2s : 5.7734 eval/loss : 11.9877 eval/wer : 0.9154 eval/acc : 0.0681 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.62it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s] --- epoch 2 --- train/loss : 11.0505 train/ctc : 5.3216 train/s2s : 5.7289 train/learning_rate: 0.0005 eval/ctc : 5.7357 eval/s2s : 5.4435 eval/loss : 11.1791 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.55it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 3 --- train/loss : 10.1859 train/ctc : 4.7313 train/s2s : 5.4546 train/learning_rate: 0.0005 eval/ctc : 4.5162 eval/s2s : 5.2624 eval/loss : 9.7787 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.61it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.89it/s] --- epoch 4 --- train/loss : 9.6173 train/ctc : 4.3016 train/s2s : 5.3157 train/learning_rate: 0.0005 eval/ctc : 4.5265 eval/s2s : 5.1128 eval/loss : 9.6394 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.23it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s] --- epoch 5 --- train/loss : 9.1252 train/ctc : 3.9722 train/s2s : 5.1530 train/learning_rate: 0.0005 eval/ctc : 4.1449 eval/s2s : 5.0081 eval/loss : 9.1530 eval/wer : 
0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.40it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 6 --- train/loss : 8.8360 train/ctc : 3.7944 train/s2s : 5.0416 train/learning_rate: 0.0005 eval/ctc : 3.9801 eval/s2s : 4.9309 eval/loss : 8.9109 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.47it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 7 --- train/loss : 8.5783 train/ctc : 3.6231 train/s2s : 4.9552 train/learning_rate: 0.0005 eval/ctc : 3.9631 eval/s2s : 4.8768 eval/loss : 8.8399 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s] --- epoch 8 --- train/loss : 8.3889 train/ctc : 3.5022 train/s2s : 4.8867 train/learning_rate: 0.0005 eval/ctc : 3.7899 eval/s2s : 4.8400 eval/loss : 8.6300 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.60it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 9 --- train/loss : 8.2826 train/ctc : 3.4358 train/s2s : 4.8468 train/learning_rate: 0.0005 eval/ctc : 3.7411 eval/s2s : 4.8128 eval/loss : 8.5539 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.38it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 
2.73it/s] --- epoch 10 --- train/loss : 8.2358 train/ctc : 3.4074 train/s2s : 4.8285 train/learning_rate: 0.0005 eval/ctc : 3.8239 eval/s2s : 4.7924 eval/loss : 8.6163 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.97it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s] --- epoch 11 --- train/loss : 8.1843 train/ctc : 3.3910 train/s2s : 4.7933 train/learning_rate: 0.0005 eval/ctc : 3.7246 eval/s2s : 4.7765 eval/loss : 8.5012 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.89it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.58it/s] --- epoch 12 --- train/loss : 8.1550 train/ctc : 3.3807 train/s2s : 4.7743 train/learning_rate: 0.0005 eval/ctc : 3.8298 eval/s2s : 4.7562 eval/loss : 8.5860 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.42it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s] --- epoch 13 --- train/loss : 8.1364 train/ctc : 3.3905 train/s2s : 4.7459 train/learning_rate: 0.0005 eval/ctc : 3.7493 eval/s2s : 4.7559 eval/loss : 8.5052 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.09it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 14 --- train/loss : 8.0515 train/ctc : 3.3432 train/s2s : 4.7083 train/learning_rate: 0.0005 eval/ctc : 3.7520 eval/s2s : 4.7024 eval/loss : 8.4544 eval/wer : 0.9156 eval/acc : 0.1737 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan 
nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s] --- epoch 15 --- train/loss : 8.0855 train/ctc : 3.3798 train/s2s : 4.7057 train/learning_rate: 0.0005 eval/ctc : 3.7498 eval/s2s : 4.6981 eval/loss : 8.4480 eval/wer : 0.9156 eval/acc : 0.1746 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.03it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 16 --- train/loss : 8.1962 train/ctc : 3.4610 train/s2s : 4.7351 train/learning_rate: 0.0005 eval/ctc : 3.6114 eval/s2s : 4.7435 eval/loss : 8.3549 eval/wer : 0.9156 eval/acc : 0.1761 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 17 --- train/loss : 8.1676 train/ctc : 3.4286 train/s2s : 4.7390 train/learning_rate: 0.0005 eval/ctc : 3.7043 eval/s2s : 4.7304 eval/loss : 8.4346 eval/wer : 0.9156 eval/acc : 0.1674 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 18 --- train/loss : 8.0775 train/ctc : 3.3791 train/s2s : 4.6983 train/learning_rate: 0.0005 eval/ctc : 3.8197 eval/s2s : 4.6896 eval/loss : 8.5093 eval/wer : 0.9156 eval/acc : 0.1685 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.69it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 19 --- train/loss : 7.9888 train/ctc : 3.3347 train/s2s : 4.6541 
train/learning_rate: 0.0005 eval/ctc : 3.8478 eval/s2s : 4.6834 eval/loss : 8.5312 eval/wer : 0.9156 eval/acc : 0.1761 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 20 --- train/loss : 7.9516 train/ctc : 3.3236 train/s2s : 4.6280 train/learning_rate: 0.0005 eval/ctc : 3.7614 eval/s2s : 4.6617 eval/loss : 8.4231 eval/wer : 0.9156 eval/acc : 0.1837 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.57it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s] --- epoch 21 --- train/loss : 7.9785 train/ctc : 3.3378 train/s2s : 4.6407 train/learning_rate: 0.0005 eval/ctc : 3.6597 eval/s2s : 4.7463 eval/loss : 8.4060 eval/wer : 0.9156 eval/acc : 0.1789 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s] --- epoch 22 --- train/loss : 7.9172 train/ctc : 3.3130 train/s2s : 4.6042 train/learning_rate: 0.0005 eval/ctc : 3.6189 eval/s2s : 4.6734 eval/loss : 8.2923 eval/wer : 0.9156 eval/acc : 0.1833 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 23 --- train/loss : 7.8904 train/ctc : 3.2974 train/s2s : 4.5929 train/learning_rate: 0.0005 eval/ctc : 3.6237 eval/s2s : 4.6622 eval/loss : 8.2859 eval/wer : 0.9156 eval/acc : 0.1892 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s] [eval]: 
0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 24 --- train/loss : 7.8784 train/ctc : 3.2912 train/s2s : 4.5872 train/learning_rate: 0.0005 eval/ctc : 3.5953 eval/s2s : 4.6667 eval/loss : 8.2620 eval/wer : 0.9156 eval/acc : 0.1817 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.77it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s] --- epoch 25 --- train/loss : 7.8672 train/ctc : 3.2971 train/s2s : 4.5701 train/learning_rate: 0.0005 eval/ctc : 3.6394 eval/s2s : 4.6323 eval/loss : 8.2717 eval/wer : 0.9156 eval/acc : 0.1903 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.98it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s] --- epoch 26 --- train/loss : 7.8857 train/ctc : 3.3115 train/s2s : 4.5742 train/learning_rate: 0.0005 eval/ctc : 3.7269 eval/s2s : 4.6432 eval/loss : 8.3700 eval/wer : 0.9156 eval/acc : 0.1866 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 27 --- train/loss : 7.8556 train/ctc : 3.3006 train/s2s : 4.5550 train/learning_rate: 0.0005 eval/ctc : 3.6610 eval/s2s : 4.6692 eval/loss : 8.3302 eval/wer : 0.9156 eval/acc : 0.1872 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.24it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s] --- epoch 28 --- train/loss : 7.8417 train/ctc : 3.3064 train/s2s : 4.5353 train/learning_rate: 0.0005 eval/ctc : 3.7343 eval/s2s : 4.8558 eval/loss : 8.5901 
eval/wer : 0.9156 eval/acc : 0.1804 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s] --- epoch 29 --- train/loss : 7.8900 train/ctc : 3.3365 train/s2s : 4.5535 train/learning_rate: 0.0005 eval/ctc : 3.6595 eval/s2s : 4.7935 eval/loss : 8.4530 eval/wer : 0.9156 eval/acc : 0.1841 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s] --- epoch 30 --- train/loss : 7.8213 train/ctc : 3.3108 train/s2s : 4.5105 train/learning_rate: 0.0005 eval/ctc : 3.6763 eval/s2s : 4.7761 eval/loss : 8.4524 eval/wer : 0.9156 eval/acc : 0.1862 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.83it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 31 --- train/loss : 7.7976 train/ctc : 3.2952 train/s2s : 4.5025 train/learning_rate: 0.0005 eval/ctc : 3.9837 eval/s2s : 4.7439 eval/loss : 8.7276 eval/wer : 0.9156 eval/acc : 0.1514 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s] --- epoch 32 --- train/loss : 7.9222 train/ctc : 3.3169 train/s2s : 4.6052 train/learning_rate: 0.0005 eval/ctc : 3.7498 eval/s2s : 4.9242 eval/loss : 8.6741 eval/wer : 0.9156 eval/acc : 0.1760 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.47it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s] 
--- epoch 33 --- train/loss : 7.9178 train/ctc : 3.3398 train/s2s : 4.5780 train/learning_rate: 0.0005 eval/ctc : 3.8901 eval/s2s : 4.6694 eval/loss : 8.5595 eval/wer : 0.9156 eval/acc : 0.1801 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 34 --- train/loss : 7.8355 train/ctc : 3.3211 train/s2s : 4.5144 train/learning_rate: 0.0005 eval/ctc : 3.6889 eval/s2s : 4.6956 eval/loss : 8.3845 eval/wer : 0.9156 eval/acc : 0.1798 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.09it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s] --- epoch 35 --- train/loss : 7.8038 train/ctc : 3.3026 train/s2s : 4.5013 train/learning_rate: 0.0005 eval/ctc : 3.8723 eval/s2s : 4.8571 eval/loss : 8.7294 eval/wer : 0.9156 eval/acc : 0.1779 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 36 --- train/loss : 7.8076 train/ctc : 3.3171 train/s2s : 4.4906 train/learning_rate: 0.0005 eval/ctc : 4.0461 eval/s2s : 4.8800 eval/loss : 8.9261 eval/wer : 0.9156 eval/acc : 0.1769 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s] --- epoch 37 --- train/loss : 7.8130 train/ctc : 3.3149 train/s2s : 4.4981 train/learning_rate: 0.0005 eval/ctc : 3.7968 eval/s2s : 4.7285 eval/loss : 8.5253 eval/wer : 0.9156 eval/acc : 0.1841 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan 
nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s] --- epoch 38 --- train/loss : 7.7021 train/ctc : 3.2687 train/s2s : 4.4333 train/learning_rate: 0.0005 eval/ctc : 3.9418 eval/s2s : 4.7848 eval/loss : 8.7266 eval/wer : 0.9156 eval/acc : 0.1815 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 39 --- train/loss : 7.6667 train/ctc : 3.2566 train/s2s : 4.4101 train/learning_rate: 0.0005 eval/ctc : 4.1770 eval/s2s : 4.9316 eval/loss : 9.1086 eval/wer : 0.9156 eval/acc : 0.1807 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 40 --- train/loss : 7.6706 train/ctc : 3.2393 train/s2s : 4.4312 train/learning_rate: 0.0005 eval/ctc : 4.1531 eval/s2s : 4.7729 eval/loss : 8.9260 eval/wer : 0.9156 eval/acc : 0.1825 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s] --- epoch 41 --- train/loss : 7.7245 train/ctc : 3.3017 train/s2s : 4.4228 train/learning_rate: 0.0004 eval/ctc : 3.8345 eval/s2s : 4.7370 eval/loss : 8.5714 eval/wer : 0.9156 eval/acc : 0.1828 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.25it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s] --- epoch 42 --- train/loss : 7.8338 train/ctc : 3.3723 train/s2s : 4.4615 train/learning_rate: 
0.0004 eval/ctc : 4.0506 eval/s2s : 4.8532 eval/loss : 8.9038 eval/wer : 0.9156 eval/acc : 0.1833 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.35it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 43 --- train/loss : 7.6717 train/ctc : 3.2835 train/s2s : 4.3882 train/learning_rate: 0.0004 eval/ctc : 4.1455 eval/s2s : 4.7469 eval/loss : 8.8924 eval/wer : 0.9156 eval/acc : 0.1846 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.83it/s] --- epoch 44 --- train/loss : 7.6108 train/ctc : 3.2429 train/s2s : 4.3679 train/learning_rate: 0.0004 eval/ctc : 3.9913 eval/s2s : 4.9418 eval/loss : 8.9331 eval/wer : 0.9156 eval/acc : 0.1800 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 45 --- train/loss : 7.5969 train/ctc : 3.2278 train/s2s : 4.3691 train/learning_rate: 0.0004 eval/ctc : 4.0990 eval/s2s : 4.8902 eval/loss : 8.9892 eval/wer : 0.9156 eval/acc : 0.1805 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s] --- epoch 46 --- train/loss : 7.6923 train/ctc : 3.2903 train/s2s : 4.4020 train/learning_rate: 0.0004 eval/ctc : 4.0601 eval/s2s : 4.9885 eval/loss : 9.0486 eval/wer : 0.9156 eval/acc : 0.1810 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s] [eval]: 0%| | 0/3 
[00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 47 --- train/loss : 7.6176 train/ctc : 3.2344 train/s2s : 4.3832 train/learning_rate: 0.0004 eval/ctc : 4.1990 eval/s2s : 4.9712 eval/loss : 9.1702 eval/wer : 0.9156 eval/acc : 0.1801 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.89it/s] --- epoch 48 --- train/loss : 7.7549 train/ctc : 3.3699 train/s2s : 4.3850 train/learning_rate: 0.0004 eval/ctc : 3.5383 eval/s2s : 4.6619 eval/loss : 8.2002 eval/wer : 0.9156 eval/acc : 0.1878 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 49 --- train/loss : 7.7524 train/ctc : 3.3666 train/s2s : 4.3858 train/learning_rate: 0.0004 eval/ctc : 3.7864 eval/s2s : 4.6288 eval/loss : 8.4152 eval/wer : 0.9156 eval/acc : 0.1858 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.57it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 50 --- train/loss : 7.6992 train/ctc : 3.3198 train/s2s : 4.3793 train/learning_rate: 0.0004 eval/ctc : 3.7948 eval/s2s : 4.7639 eval/loss : 8.5587 eval/wer : 0.9156 eval/acc : 0.1833 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.73it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s] --- epoch 51 --- train/loss : 7.6929 train/ctc : 3.3021 train/s2s : 4.3908 train/learning_rate: 0.0004 eval/ctc : 3.7772 eval/s2s : 4.6957 eval/loss : 8.4729 eval/wer : 0.9156 eval/acc : 
0.1862 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.72it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s] --- epoch 52 --- train/loss : 7.6130 train/ctc : 3.2714 train/s2s : 4.3417 train/learning_rate: 0.0004 eval/ctc : 3.8054 eval/s2s : 4.6372 eval/loss : 8.4425 eval/wer : 0.9156 eval/acc : 0.1883 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.54it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s] --- epoch 53 --- train/loss : 7.6414 train/ctc : 3.2906 train/s2s : 4.3508 train/learning_rate: 0.0004 eval/ctc : 3.9416 eval/s2s : 4.7329 eval/loss : 8.6745 eval/wer : 0.9156 eval/acc : 0.1862 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.37it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 54 --- train/loss : 7.5984 train/ctc : 3.2760 train/s2s : 4.3224 train/learning_rate: 0.0004 eval/ctc : 4.0414 eval/s2s : 4.7898 eval/loss : 8.8312 eval/wer : 0.9156 eval/acc : 0.1853 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.17it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.48it/s] --- epoch 55 --- train/loss : 7.6170 train/ctc : 3.2708 train/s2s : 4.3462 train/learning_rate: 0.0004 eval/ctc : 4.1231 eval/s2s : 4.8225 eval/loss : 8.9455 eval/wer : 0.9156 eval/acc : 0.1840 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.92it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.61it/s] --- 
epoch 56 --- train/loss : 7.5712 train/ctc : 3.2546 train/s2s : 4.3166 train/learning_rate: 0.0004 eval/ctc : 4.1421 eval/s2s : 4.8496 eval/loss : 8.9917 eval/wer : 0.9156 eval/acc : 0.1837 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.66it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s] --- epoch 57 --- train/loss : 7.5125 train/ctc : 3.2390 train/s2s : 4.2734 train/learning_rate: 0.0004 eval/ctc : 4.1720 eval/s2s : 4.8073 eval/loss : 8.9792 eval/wer : 0.9156 eval/acc : 0.1837 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.24it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 58 --- train/loss : 7.5394 train/ctc : 3.2512 train/s2s : 4.2881 train/learning_rate: 0.0004 eval/ctc : 4.2439 eval/s2s : 4.8307 eval/loss : 9.0746 eval/wer : 0.9156 eval/acc : 0.1851 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.48it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s] --- epoch 59 --- train/loss : 7.4915 train/ctc : 3.2168 train/s2s : 4.2747 train/learning_rate: 0.0004 eval/ctc : 4.2920 eval/s2s : 4.8120 eval/loss : 9.1040 eval/wer : 0.9156 eval/acc : 0.1844 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.15it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.66it/s] --- epoch 60 --- train/loss : 7.5041 train/ctc : 3.2265 train/s2s : 4.2776 train/learning_rate: 0.0004 eval/ctc : 4.0227 eval/s2s : 4.7344 eval/loss : 8.7571 eval/wer : 0.9156 eval/acc : 0.1866 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan 
[train]: 100%|██████████| 5/5 [00:01<00:00, 3.20it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.62it/s] --- epoch 61 --- train/loss : 7.4721 train/ctc : 3.1990 train/s2s : 4.2731 train/learning_rate: 0.0004 eval/ctc : 4.1542 eval/s2s : 4.7564 eval/loss : 8.9106 eval/wer : 0.9156 eval/acc : 0.1867 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.28it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s] --- epoch 62 --- train/loss : 7.4689 train/ctc : 3.2092 train/s2s : 4.2597 train/learning_rate: 0.0004 eval/ctc : 3.9684 eval/s2s : 4.8199 eval/loss : 8.7883 eval/wer : 0.9156 eval/acc : 0.1853 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.31it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s] --- epoch 63 --- train/loss : 7.4341 train/ctc : 3.1954 train/s2s : 4.2387 train/learning_rate: 0.0004 eval/ctc : 4.0277 eval/s2s : 4.7588 eval/loss : 8.7865 eval/wer : 0.9156 eval/acc : 0.1872 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.30it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.70it/s] --- epoch 64 --- train/loss : 7.5040 train/ctc : 3.2359 train/s2s : 4.2681 train/learning_rate: 0.0004 eval/ctc : 4.0127 eval/s2s : 4.7493 eval/loss : 8.7620 eval/wer : 0.9156 eval/acc : 0.1859 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.45it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 65 --- train/loss : 7.4346 train/ctc : 3.1880 train/s2s : 4.2466 train/learning_rate: 0.0004 
eval/ctc : 3.8882 eval/s2s : 4.7089 eval/loss : 8.5971 eval/wer : 0.9156 eval/acc : 0.1835 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 66 --- train/loss : 7.4244 train/ctc : 3.1936 train/s2s : 4.2309 train/learning_rate: 0.0004 eval/ctc : 4.0908 eval/s2s : 4.8293 eval/loss : 8.9200 eval/wer : 0.9156 eval/acc : 0.1856 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.60it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s] --- epoch 67 --- train/loss : 7.4008 train/ctc : 3.1989 train/s2s : 4.2019 train/learning_rate: 0.0004 eval/ctc : 3.8532 eval/s2s : 4.8491 eval/loss : 8.7024 eval/wer : 0.9156 eval/acc : 0.1860 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.55it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s] --- epoch 68 --- train/loss : 7.3786 train/ctc : 3.1684 train/s2s : 4.2103 train/learning_rate: 0.0004 eval/ctc : 4.1781 eval/s2s : 5.0866 eval/loss : 9.2648 eval/wer : 0.9156 eval/acc : 0.1822 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.51it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 69 --- train/loss : 7.3674 train/ctc : 3.1431 train/s2s : 4.2242 train/learning_rate: 0.0004 eval/ctc : 4.0825 eval/s2s : 4.7750 eval/loss : 8.8575 eval/wer : 0.9156 eval/acc : 0.1857 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.44it/s] [eval]: 0%| | 0/3 [00:00<?, 
?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 70 --- train/loss : 7.3628 train/ctc : 3.1575 train/s2s : 4.2053 train/learning_rate: 0.0004 eval/ctc : 4.0177 eval/s2s : 4.7277 eval/loss : 8.7454 eval/wer : 0.9156 eval/acc : 0.1857 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.06it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 71 --- train/loss : 7.3362 train/ctc : 3.1444 train/s2s : 4.1918 train/learning_rate: 0.0004 eval/ctc : 3.8727 eval/s2s : 4.6658 eval/loss : 8.5385 eval/wer : 0.9156 eval/acc : 0.1865 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 72 --- train/loss : 7.2773 train/ctc : 3.1114 train/s2s : 4.1659 train/learning_rate: 0.0004 eval/ctc : 3.6477 eval/s2s : 4.7479 eval/loss : 8.3956 eval/wer : 0.9156 eval/acc : 0.1881 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s] --- epoch 73 --- train/loss : 7.3971 train/ctc : 3.1863 train/s2s : 4.2108 train/learning_rate: 0.0004 eval/ctc : 4.3377 eval/s2s : 4.8245 eval/loss : 9.1622 eval/wer : 0.9156 eval/acc : 0.1815 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.59it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 74 --- train/loss : 7.2459 train/ctc : 3.1035 train/s2s : 4.1425 train/learning_rate: 0.0003 eval/ctc : 4.0785 eval/s2s : 4.7461 eval/loss : 8.8246 eval/wer : 0.9156 eval/acc 
: 0.1865 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.05it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 75 --- train/loss : 7.2715 train/ctc : 3.1017 train/s2s : 4.1698 train/learning_rate: 0.0003 eval/ctc : 3.9213 eval/s2s : 4.7599 eval/loss : 8.6812 eval/wer : 0.9156 eval/acc : 0.1847 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s] --- epoch 76 --- train/loss : 7.3272 train/ctc : 3.1439 train/s2s : 4.1833 train/learning_rate: 0.0003 eval/ctc : 4.1709 eval/s2s : 4.8088 eval/loss : 8.9798 eval/wer : 0.9156 eval/acc : 0.1832 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.88it/s] --- epoch 77 --- train/loss : 7.2035 train/ctc : 3.0685 train/s2s : 4.1350 train/learning_rate: 0.0003 eval/ctc : 4.6577 eval/s2s : 5.0092 eval/loss : 9.6669 eval/wer : 0.9156 eval/acc : 0.1787 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 78 --- train/loss : 7.2384 train/ctc : 3.0983 train/s2s : 4.1401 train/learning_rate: 0.0003 eval/ctc : 4.1970 eval/s2s : 4.9218 eval/loss : 9.1188 eval/wer : 0.9156 eval/acc : 0.1815 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.02it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.86it/s] --- epoch 79 --- 
train/loss : 7.1406 train/ctc : 3.0216 train/s2s : 4.1190 train/learning_rate: 0.0003 eval/ctc : 4.3503 eval/s2s : 4.9700 eval/loss : 9.3204 eval/wer : 0.9156 eval/acc : 0.1822 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.38it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s] --- epoch 80 --- train/loss : 7.2061 train/ctc : 3.0606 train/s2s : 4.1455 train/learning_rate: 0.0003 eval/ctc : 3.9002 eval/s2s : 4.7365 eval/loss : 8.6367 eval/wer : 0.9156 eval/acc : 0.1858 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.18it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 81 --- train/loss : 7.1768 train/ctc : 3.0608 train/s2s : 4.1159 train/learning_rate: 0.0003 eval/ctc : 4.6743 eval/s2s : 4.8976 eval/loss : 9.5719 eval/wer : 0.9156 eval/acc : 0.1828 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.83it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.70it/s] --- epoch 82 --- train/loss : 7.2171 train/ctc : 3.0902 train/s2s : 4.1269 train/learning_rate: 0.0003 eval/ctc : 3.8073 eval/s2s : 4.8600 eval/loss : 8.6673 eval/wer : 0.9156 eval/acc : 0.1843 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.24it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 83 --- train/loss : 7.2151 train/ctc : 3.0843 train/s2s : 4.1308 train/learning_rate: 0.0003 eval/ctc : 3.7982 eval/s2s : 4.7248 eval/loss : 8.5230 eval/wer : 0.9156 eval/acc : 0.1849 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan 
nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.44it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s] --- epoch 84 --- train/loss : 7.1503 train/ctc : 3.0318 train/s2s : 4.1185 train/learning_rate: 0.0003 eval/ctc : 3.9267 eval/s2s : 4.7152 eval/loss : 8.6419 eval/wer : 0.9156 eval/acc : 0.1840 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s] --- epoch 85 --- train/loss : 7.1685 train/ctc : 3.0486 train/s2s : 4.1199 train/learning_rate: 0.0003 eval/ctc : 4.0690 eval/s2s : 4.8683 eval/loss : 8.9373 eval/wer : 0.9156 eval/acc : 0.1845 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s] --- epoch 86 --- train/loss : 7.1133 train/ctc : 3.0121 train/s2s : 4.1013 train/learning_rate: 0.0003 eval/ctc : 4.0398 eval/s2s : 4.7599 eval/loss : 8.7997 eval/wer : 0.9156 eval/acc : 0.1828 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.51it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s] --- epoch 87 --- train/loss : 7.0757 train/ctc : 3.0049 train/s2s : 4.0708 train/learning_rate: 0.0003 eval/ctc : 4.2974 eval/s2s : 4.8235 eval/loss : 9.1209 eval/wer : 0.9156 eval/acc : 0.1864 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s] --- epoch 88 --- train/loss : 7.1118 train/ctc : 3.0216 train/s2s : 4.0902 train/learning_rate: 0.0003 
eval/ctc : 3.9977 eval/s2s : 4.7862 eval/loss : 8.7840 eval/wer : 0.9156 eval/acc : 0.1844 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 89 --- train/loss : 7.0431 train/ctc : 2.9789 train/s2s : 4.0642 train/learning_rate: 0.0003 eval/ctc : 4.2950 eval/s2s : 4.9520 eval/loss : 9.2470 eval/wer : 0.9156 eval/acc : 0.1828 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.73it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 90 --- train/loss : 7.0664 train/ctc : 2.9918 train/s2s : 4.0746 train/learning_rate: 0.0003 eval/ctc : 4.1555 eval/s2s : 4.8369 eval/loss : 8.9924 eval/wer : 0.9156 eval/acc : 0.1799 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s] --- epoch 91 --- train/loss : 7.0185 train/ctc : 2.9730 train/s2s : 4.0455 train/learning_rate: 0.0003 eval/ctc : 4.0850 eval/s2s : 4.7547 eval/loss : 8.8398 eval/wer : 0.9150 eval/acc : 0.1843 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.73it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 92 --- train/loss : 7.0077 train/ctc : 2.9500 train/s2s : 4.0577 train/learning_rate: 0.0003 eval/ctc : 4.0783 eval/s2s : 4.7477 eval/loss : 8.8260 eval/wer : 0.9156 eval/acc : 0.1851 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 
[00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 93 --- train/loss : 6.9128 train/ctc : 2.9149 train/s2s : 3.9978 train/learning_rate: 0.0003 eval/ctc : 4.5163 eval/s2s : 4.9429 eval/loss : 9.4592 eval/wer : 0.9116 eval/acc : 0.1814 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.54it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 94 --- train/loss : 7.1016 train/ctc : 3.0004 train/s2s : 4.1013 train/learning_rate: 0.0003 eval/ctc : 4.8001 eval/s2s : 4.9809 eval/loss : 9.7809 eval/wer : 0.9156 eval/acc : 0.1811 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s] --- epoch 95 --- train/loss : 6.9483 train/ctc : 2.9333 train/s2s : 4.0150 train/learning_rate: 0.0003 eval/ctc : 3.8254 eval/s2s : 4.6975 eval/loss : 8.5228 eval/wer : 0.9156 eval/acc : 0.1840 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s] --- epoch 96 --- train/loss : 7.0110 train/ctc : 2.9697 train/s2s : 4.0413 train/learning_rate: 0.0003 eval/ctc : 3.9616 eval/s2s : 4.8604 eval/loss : 8.8220 eval/wer : 0.9115 eval/acc : 0.1797 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.59it/s] --- epoch 97 --- train/loss : 7.0256 train/ctc : 2.9598 train/s2s : 4.0659 train/learning_rate: 0.0003 eval/ctc : 3.9609 eval/s2s : 4.9142 eval/loss : 8.8752 eval/wer : 0.9141 
eval/acc : 0.1820 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.92it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.57it/s] --- epoch 98 --- train/loss : 6.9884 train/ctc : 2.9417 train/s2s : 4.0467 train/learning_rate: 0.0003 eval/ctc : 4.1383 eval/s2s : 4.7269 eval/loss : 8.8652 eval/wer : 0.9149 eval/acc : 0.1886 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.60it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 99 --- train/loss : 6.9401 train/ctc : 2.9170 train/s2s : 4.0231 train/learning_rate: 0.0003 eval/ctc : 4.2011 eval/s2s : 4.7642 eval/loss : 8.9653 eval/wer : 0.9154 eval/acc : 0.1825 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.43it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.59it/s] --- epoch 100 --- train/loss : 6.9860 train/ctc : 2.9462 train/s2s : 4.0398 train/learning_rate: 0.0003 eval/ctc : 4.0909 eval/s2s : 4.8165 eval/loss : 8.9075 eval/wer : 0.9149 eval/acc : 0.1837 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.71it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s] --- epoch 101 --- train/loss : 6.8023 train/ctc : 2.8412 train/s2s : 3.9610 train/learning_rate: 0.0002 eval/ctc : 4.1164 eval/s2s : 4.7567 eval/loss : 8.8731 eval/wer : 0.9146 eval/acc : 0.1841 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.53it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- 
epoch 102 --- train/loss : 6.8663 train/ctc : 2.8736 train/s2s : 3.9927 train/learning_rate: 0.0002 eval/ctc : 3.9890 eval/s2s : 4.8344 eval/loss : 8.8234 eval/wer : 0.9051 eval/acc : 0.1835 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.60it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 103 --- train/loss : 6.8254 train/ctc : 2.8415 train/s2s : 3.9839 train/learning_rate: 0.0002 eval/ctc : 4.2138 eval/s2s : 4.8773 eval/loss : 9.0911 eval/wer : 0.9127 eval/acc : 0.1801 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 104 --- train/loss : 6.8487 train/ctc : 2.8615 train/s2s : 3.9872 train/learning_rate: 0.0002 eval/ctc : 4.0311 eval/s2s : 4.7282 eval/loss : 8.7593 eval/wer : 0.9094 eval/acc : 0.1862 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.69it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s] --- epoch 105 --- train/loss : 6.7640 train/ctc : 2.8293 train/s2s : 3.9347 train/learning_rate: 0.0002 eval/ctc : 4.2728 eval/s2s : 4.9799 eval/loss : 9.2526 eval/wer : 0.9094 eval/acc : 0.1806 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s] --- epoch 106 --- train/loss : 6.7261 train/ctc : 2.7815 train/s2s : 3.9446 train/learning_rate: 0.0002 eval/ctc : 4.1878 eval/s2s : 4.7721 eval/loss : 8.9600 eval/wer : 0.9144 eval/acc : 0.1836 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan 
nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.55it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 107 --- train/loss : 6.7471 train/ctc : 2.8097 train/s2s : 3.9374 train/learning_rate: 0.0002 eval/ctc : 4.1852 eval/s2s : 4.8745 eval/loss : 9.0597 eval/wer : 0.9112 eval/acc : 0.1803 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 108 --- train/loss : 6.7333 train/ctc : 2.7896 train/s2s : 3.9437 train/learning_rate: 0.0002 eval/ctc : 4.3646 eval/s2s : 4.9009 eval/loss : 9.2655 eval/wer : 0.9051 eval/acc : 0.1822 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.42it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s] --- epoch 109 --- train/loss : 6.7074 train/ctc : 2.7849 train/s2s : 3.9225 train/learning_rate: 0.0002 eval/ctc : 4.1075 eval/s2s : 4.8536 eval/loss : 8.9611 eval/wer : 0.9038 eval/acc : 0.1846 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.45it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s] --- epoch 110 --- train/loss : 6.7409 train/ctc : 2.8076 train/s2s : 3.9333 train/learning_rate: 0.0002 eval/ctc : 4.2799 eval/s2s : 4.8573 eval/loss : 9.1372 eval/wer : 0.9024 eval/acc : 0.1843 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.58it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 111 --- train/loss : 6.8031 train/ctc : 2.8327 train/s2s : 3.9703 
train/learning_rate: 0.0002 eval/ctc : 4.3046 eval/s2s : 4.8558 eval/loss : 9.1604 eval/wer : 0.9021 eval/acc : 0.1865 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 112 --- train/loss : 6.7090 train/ctc : 2.7768 train/s2s : 3.9322 train/learning_rate: 0.0002 eval/ctc : 4.3296 eval/s2s : 4.9961 eval/loss : 9.3257 eval/wer : 0.9033 eval/acc : 0.1840 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.57it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.66it/s] --- epoch 113 --- train/loss : 6.6278 train/ctc : 2.7155 train/s2s : 3.9123 train/learning_rate: 0.0002 eval/ctc : 4.4864 eval/s2s : 4.7846 eval/loss : 9.2710 eval/wer : 0.9092 eval/acc : 0.1850 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.66it/s] --- epoch 114 --- train/loss : 6.6054 train/ctc : 2.7000 train/s2s : 3.9054 train/learning_rate: 0.0002 eval/ctc : 4.5483 eval/s2s : 4.8994 eval/loss : 9.4477 eval/wer : 0.8950 eval/acc : 0.1827 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s] --- epoch 115 --- train/loss : 6.5971 train/ctc : 2.7080 train/s2s : 3.8892 train/learning_rate: 0.0002 eval/ctc : 4.6213 eval/s2s : 4.9159 eval/loss : 9.5372 eval/wer : 0.9038 eval/acc : 0.1838 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s] [eval]: 0%| | 
0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 116 --- train/loss : 6.6083 train/ctc : 2.7158 train/s2s : 3.8925 train/learning_rate: 0.0002 eval/ctc : 4.6672 eval/s2s : 4.8830 eval/loss : 9.5502 eval/wer : 0.8977 eval/acc : 0.1860 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.00it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 117 --- train/loss : 6.6253 train/ctc : 2.7059 train/s2s : 3.9195 train/learning_rate: 0.0002 eval/ctc : 4.5678 eval/s2s : 4.9646 eval/loss : 9.5324 eval/wer : 0.8970 eval/acc : 0.1839 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 118 --- train/loss : 6.6870 train/ctc : 2.7655 train/s2s : 3.9215 train/learning_rate: 0.0002 eval/ctc : 4.4199 eval/s2s : 4.9017 eval/loss : 9.3216 eval/wer : 0.8919 eval/acc : 0.1844 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.75it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 119 --- train/loss : 6.6171 train/ctc : 2.7096 train/s2s : 3.9075 train/learning_rate: 0.0002 eval/ctc : 4.7816 eval/s2s : 4.9494 eval/loss : 9.7310 eval/wer : 0.8923 eval/acc : 0.1836 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.68it/s] --- epoch 120 --- train/loss : 6.6390 train/ctc : 2.7281 train/s2s : 3.9109 train/learning_rate: 0.0002 eval/ctc : 4.5604 eval/s2s : 4.9981 eval/loss : 9.5585 eval/wer : 
0.8985 eval/acc : 0.1834 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s] --- epoch 121 --- train/loss : 6.5856 train/ctc : 2.6883 train/s2s : 3.8973 train/learning_rate: 0.0002 eval/ctc : 4.5273 eval/s2s : 4.8998 eval/loss : 9.4271 eval/wer : 0.8895 eval/acc : 0.1845 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 122 --- train/loss : 6.4885 train/ctc : 2.6342 train/s2s : 3.8543 train/learning_rate: 0.0002 eval/ctc : 4.7328 eval/s2s : 4.9184 eval/loss : 9.6512 eval/wer : 0.9001 eval/acc : 0.1851 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.63it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 123 --- train/loss : 6.5376 train/ctc : 2.6740 train/s2s : 3.8635 train/learning_rate: 0.0002 eval/ctc : 4.7663 eval/s2s : 4.9702 eval/loss : 9.7365 eval/wer : 0.8941 eval/acc : 0.1868 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.45it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 124 --- train/loss : 6.4988 train/ctc : 2.6389 train/s2s : 3.8599 train/learning_rate: 0.0002 eval/ctc : 4.8439 eval/s2s : 4.9807 eval/loss : 9.8246 eval/wer : 0.8911 eval/acc : 0.1854 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 
2.71it/s] --- epoch 125 --- train/loss : 6.4257 train/ctc : 2.5810 train/s2s : 3.8447 train/learning_rate: 0.0002 eval/ctc : 4.9985 eval/s2s : 5.0640 eval/loss : 10.0625 eval/wer : 0.8936 eval/acc : 0.1842 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.58it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s] --- epoch 126 --- train/loss : 6.4752 train/ctc : 2.6204 train/s2s : 3.8548 train/learning_rate: 0.0002 eval/ctc : 4.8503 eval/s2s : 4.9105 eval/loss : 9.7608 eval/wer : 0.8920 eval/acc : 0.1830 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.64it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.87it/s] --- epoch 127 --- train/loss : 6.4884 train/ctc : 2.6190 train/s2s : 3.8694 train/learning_rate: 0.0001 eval/ctc : 5.0363 eval/s2s : 5.1468 eval/loss : 10.1831 eval/wer : 0.8921 eval/acc : 0.1830 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.70it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.84it/s] --- epoch 128 --- train/loss : 6.4482 train/ctc : 2.5900 train/s2s : 3.8581 train/learning_rate: 0.0001 eval/ctc : 4.9648 eval/s2s : 4.8854 eval/loss : 9.8502 eval/wer : 0.8874 eval/acc : 0.1842 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.69it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 129 --- train/loss : 6.4149 train/ctc : 2.5781 train/s2s : 3.8368 train/learning_rate: 0.0001 eval/ctc : 4.9873 eval/s2s : 5.0764 eval/loss : 10.0637 eval/wer : 0.8910 eval/acc : 0.1827 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan 
nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.14it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 130 --- train/loss : 6.4227 train/ctc : 2.5821 train/s2s : 3.8406 train/learning_rate: 0.0001 eval/ctc : 5.0115 eval/s2s : 4.9145 eval/loss : 9.9260 eval/wer : 0.8866 eval/acc : 0.1811 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nannan

nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.37it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 131 --- train/loss : 6.3550 train/ctc : 2.5384 train/s2s : 3.8167 train/learning_rate: 0.0001 eval/ctc : 5.0433 eval/s2s : 5.0876 eval/loss : 10.1309 eval/wer : 0.8889 eval/acc : 0.1845 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.47it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 132 --- train/loss : 6.3446 train/ctc : 2.5334 train/s2s : 3.8112 train/learning_rate: 0.0001 eval/ctc : 4.9058 eval/s2s : 4.8962 eval/loss : 9.8020 eval/wer : 0.8830 eval/acc : 0.1824 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 133 --- train/loss : 6.3411 train/ctc : 2.5253 train/s2s : 3.8158 train/learning_rate: 0.0001 eval/ctc : 5.0717 eval/s2s : 5.0374 eval/loss : 10.1091 eval/wer : 0.8755 eval/acc : 0.1845 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nannan

nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 134 --- train/loss : 6.4452 train/ctc : 2.5937 train/s2s : 3.8515 train/learning_rate: 0.0001 eval/ctc : 5.0382 eval/s2s : 4.9168 eval/loss : 9.9550 eval/wer : 0.8833 eval/acc : 0.1862 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nannan

nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.46it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 135 --- train/loss : 6.3595 train/ctc : 2.5399 train/s2s : 3.8196 train/learning_rate: 0.0001 eval/ctc : 5.0823 eval/s2s : 5.0376 eval/loss : 10.1199 eval/wer : 0.8694 eval/acc : 0.1833 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.62it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 136 --- train/loss : 6.2757 train/ctc : 2.4839 train/s2s : 3.7918 train/learning_rate: 0.0001 eval/ctc : 4.9648 eval/s2s : 4.9231 eval/loss : 9.8879 eval/wer : 0.8760 eval/acc : 0.1840 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.51it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s] --- epoch 137 --- train/loss : 6.4195 train/ctc : 2.5827 train/s2s : 3.8367 train/learning_rate: 0.0001 eval/ctc : 5.2293 eval/s2s : 5.0484 eval/loss : 10.2776 eval/wer : 0.8782 eval/acc : 0.1866 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.57it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.72it/s] --- epoch 138 --- train/loss : 6.4159 train/ctc : 2.5802 train/s2s : 3.8358 train/learning_rate: 0.0001 eval/ctc : 5.2113 eval/s2s : 4.9180 eval/loss : 10.1293 eval/wer : 0.8747 eval/acc : 0.1859 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s] --- epoch 139 --- train/loss : 6.3295 train/ctc : 2.5188 
train/s2s : 3.8107 train/learning_rate: 0.0001 eval/ctc : 5.1996 eval/s2s : 5.0505 eval/loss : 10.2501 eval/wer : 0.8757 eval/acc : 0.1841 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.75it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.83it/s] --- epoch 140 --- train/loss : 6.2596 train/ctc : 2.4745 train/s2s : 3.7850 train/learning_rate: 0.0001 eval/ctc : 5.1043 eval/s2s : 5.1086 eval/loss : 10.2130 eval/wer : 0.8682 eval/acc : 0.1876 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.82it/s] --- epoch 141 --- train/loss : 6.4073 train/ctc : 2.5652 train/s2s : 3.8420 train/learning_rate: 0.0001 eval/ctc : 5.1560 eval/s2s : 4.9750 eval/loss : 10.1310 eval/wer : 0.8730 eval/acc : 0.1840 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.62it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 142 --- train/loss : 6.3672 train/ctc : 2.5420 train/s2s : 3.8252 train/learning_rate: 0.0001 eval/ctc : 5.1685 eval/s2s : 5.0811 eval/loss : 10.2496 eval/wer : 0.8706 eval/acc : 0.1831 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s] --- epoch 143 --- train/loss : 6.2984 train/ctc : 2.4938 train/s2s : 3.8045 train/learning_rate: 0.0001 eval/ctc : 5.1751 eval/s2s : 4.8938 eval/loss : 10.0690 eval/wer : 0.8697 eval/acc : 0.1864 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 
3.68it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 144 --- train/loss : 6.3082 train/ctc : 2.5022 train/s2s : 3.8060 train/learning_rate: 0.0001 eval/ctc : 5.2745 eval/s2s : 5.1016 eval/loss : 10.3761 eval/wer : 0.8844 eval/acc : 0.1816 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.66it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 145 --- train/loss : 6.3316 train/ctc : 2.5142 train/s2s : 3.8174 train/learning_rate: 0.0001 eval/ctc : 5.3757 eval/s2s : 4.9865 eval/loss : 10.3623 eval/wer : 0.8680 eval/acc : 0.1832 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.60it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.81it/s] --- epoch 146 --- train/loss : 6.3041 train/ctc : 2.4977 train/s2s : 3.8064 train/learning_rate: 0.0001 eval/ctc : 5.3912 eval/s2s : 4.9660 eval/loss : 10.3572 eval/wer : 0.8683 eval/acc : 0.1847 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s] --- epoch 147 --- train/loss : 6.1718 train/ctc : 2.4324 train/s2s : 3.7395 train/learning_rate: 0.0001 eval/ctc : 5.2637 eval/s2s : 4.9811 eval/loss : 10.2448 eval/wer : 0.8820 eval/acc : 0.1847 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.83it/s] --- epoch 148 --- train/loss : 6.2739 train/ctc : 2.4775 train/s2s : 3.7964 train/learning_rate: 0.0001 eval/ctc : 5.4498 
eval/s2s : 5.0704 eval/loss : 10.5201 eval/wer : 0.8704 eval/acc : 0.1841 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.55it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 149 --- train/loss : 6.2611 train/ctc : 2.4690 train/s2s : 3.7921 train/learning_rate: 0.0001 eval/ctc : 5.4955 eval/s2s : 5.0015 eval/loss : 10.4970 eval/wer : 0.8684 eval/acc : 0.1823 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.61it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.69it/s] --- epoch 150 --- train/loss : 6.2660 train/ctc : 2.4992 train/s2s : 3.7668 train/learning_rate: 0.0001 eval/ctc : 5.3163 eval/s2s : 5.0459 eval/loss : 10.3622 eval/wer : 0.8734 eval/acc : 0.1837 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.83it/s] --- epoch 151 --- train/loss : 6.2079 train/ctc : 2.4543 train/s2s : 3.7536 train/learning_rate: 0.0001 eval/ctc : 5.5699 eval/s2s : 5.1366 eval/loss : 10.7065 eval/wer : 0.8644 eval/acc : 0.1851 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.58it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.70it/s] --- epoch 152 --- train/loss : 6.2287 train/ctc : 2.4417 train/s2s : 3.7871 train/learning_rate: 0.0001 eval/ctc : 5.4731 eval/s2s : 4.9824 eval/loss : 10.4555 eval/wer : 0.8508 eval/acc : 0.1883 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.56it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan 
nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.73it/s] --- epoch 153 --- train/loss : 6.2143 train/ctc : 2.4430 train/s2s : 3.7714 train/learning_rate: 0.0001 eval/ctc : 5.4926 eval/s2s : 5.0861 eval/loss : 10.5787 eval/wer : 0.8609 eval/acc : 0.1871 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 2.87it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.85it/s] --- epoch 154 --- train/loss : 6.2405 train/ctc : 2.4724 train/s2s : 3.7681 train/learning_rate: 0.0001 eval/ctc : 5.5326 eval/s2s : 5.1088 eval/loss : 10.6414 eval/wer : 0.8641 eval/acc : 0.1834 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.50it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 155 --- train/loss : 6.2183 train/ctc : 2.4464 train/s2s : 3.7719 train/learning_rate: 0.0001 eval/ctc : 5.4821 eval/s2s : 5.0116 eval/loss : 10.4937 eval/wer : 0.8540 eval/acc : 0.1877 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.42it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 156 --- train/loss : 6.2605 train/ctc : 2.4798 train/s2s : 3.7807 train/learning_rate: 0.0001 eval/ctc : 5.3754 eval/s2s : 4.9878 eval/loss : 10.3632 eval/wer : 0.8588 eval/acc : 0.1845 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s] --- epoch 157 --- train/loss : 6.2181 train/ctc : 2.4409 train/s2s : 3.7772 train/learning_rate: 0.0001 eval/ctc : 5.3325 eval/s2s : 5.0767 eval/loss : 10.4092 eval/wer : 0.8625 eval/acc : 0.1862 
[train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.68it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.77it/s] --- epoch 158 --- train/loss : 6.2230 train/ctc : 2.4478 train/s2s : 3.7752 train/learning_rate: 0.0001 eval/ctc : 5.4948 eval/s2s : 5.0612 eval/loss : 10.5560 eval/wer : 0.8647 eval/acc : 0.1858 [train]: 0%| | 0/5 [00:00<?, ?it/s]nannan

nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.78it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 159 --- train/loss : 6.2061 train/ctc : 2.4407 train/s2s : 3.7654 train/learning_rate: 0.0001 eval/ctc : 5.6033 eval/s2s : 5.0781 eval/loss : 10.6813 eval/wer : 0.8548 eval/acc : 0.1801 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.76it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 160 --- train/loss : 6.1916 train/ctc : 2.4349 train/s2s : 3.7567 train/learning_rate: 0.0001 eval/ctc : 5.5541 eval/s2s : 5.1141 eval/loss : 10.6682 eval/wer : 0.8493 eval/acc : 0.1833 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.62it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.80it/s] --- epoch 161 --- train/loss : 6.0618 train/ctc : 2.3397 train/s2s : 3.7220 train/learning_rate: 0.0000 eval/ctc : 5.4812 eval/s2s : 5.1022 eval/loss : 10.5834 eval/wer : 0.8570 eval/acc : 0.1870 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.52it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 162 --- train/loss : 6.1683 train/ctc : 2.4256 train/s2s : 3.7427 train/learning_rate: 0.0000 eval/ctc : 5.4861 eval/s2s : 4.9905 eval/loss : 10.4766 eval/wer : 0.8572 eval/acc : 0.1869 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.55it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.76it/s] --- epoch 163 --- train/loss : 6.2493 train/ctc : 2.4704 train/s2s 
: 3.7789 train/learning_rate: 0.0000 eval/ctc : 5.7146 eval/s2s : 5.0813 eval/loss : 10.7960 eval/wer : 0.8580 eval/acc : 0.1854 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.65it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.78it/s] --- epoch 164 --- train/loss : 6.2610 train/ctc : 2.4721 train/s2s : 3.7889 train/learning_rate: 0.0000 eval/ctc : 5.7172 eval/s2s : 5.1034 eval/loss : 10.8207 eval/wer : 0.8577 eval/acc : 0.1790 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.67it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 165 --- train/loss : 6.0859 train/ctc : 2.3641 train/s2s : 3.7219 train/learning_rate: 0.0000 eval/ctc : 5.6258 eval/s2s : 5.0231 eval/loss : 10.6490 eval/wer : 0.8590 eval/acc : 0.1820 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.49it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.66it/s] --- epoch 166 --- train/loss : 6.2488 train/ctc : 2.4680 train/s2s : 3.7808 train/learning_rate: 0.0000 eval/ctc : 5.4833 eval/s2s : 5.0314 eval/loss : 10.5147 eval/wer : 0.8558 eval/acc : 0.1870 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.45it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.71it/s] --- epoch 167 --- train/loss : 6.1344 train/ctc : 2.3942 train/s2s : 3.7402 train/learning_rate: 0.0000 eval/ctc : 5.5201 eval/s2s : 5.1154 eval/loss : 10.6355 eval/wer : 0.8567 eval/acc : 0.1853 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 
3.67it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.68it/s] --- epoch 168 --- train/loss : 6.1516 train/ctc : 2.4127 train/s2s : 3.7388 train/learning_rate: 0.0000 eval/ctc : 5.5528 eval/s2s : 5.0850 eval/loss : 10.6378 eval/wer : 0.8559 eval/acc : 0.1838 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.35it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.74it/s] --- epoch 169 --- train/loss : 6.1128 train/ctc : 2.3775 train/s2s : 3.7353 train/learning_rate: 0.0000 eval/ctc : 5.5691 eval/s2s : 5.0203 eval/loss : 10.5895 eval/wer : 0.8553 eval/acc : 0.1857 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.03it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.75it/s] --- epoch 170 --- train/loss : 6.1213 train/ctc : 2.3781 train/s2s : 3.7432 train/learning_rate: 0.0000 eval/ctc : 5.6908 eval/s2s : 5.0534 eval/loss : 10.7443 eval/wer : 0.8602 eval/acc : 0.1846 [train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.33it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 171 --- train/loss : 6.1483 train/ctc : 2.4037 train/s2s : 3.7445 train/learning_rate: 0.0000 eval/ctc : 5.6797 eval/s2s : 5.0921 eval/loss : 10.7718 eval/wer : 0.8606 eval/acc : 0.1831

yl4579 commented 2 years ago

I believe this error means that one of the lines in your train_list.txt does not have a speaker number. It may look like filename.wav|text| instead of filename.wav|text|0
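A quick way to locate such lines is to scan the list file before training. The following is a minimal sketch, assuming each entry has the three `|`-separated fields shown above (path, text, speaker number); the function name `find_bad_lines` and the sample entries are hypothetical and not part of the repository.

```python
def find_bad_lines(lines):
    """Return (line_number, line) pairs whose speaker field is not an integer."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split("|")
        # Expect exactly three fields: path | text | speaker number
        if len(fields) != 3:
            bad.append((number, line))
            continue
        try:
            int(fields[2])  # same conversion that fails in the traceback
        except ValueError:
            bad.append((number, line))
    return bad


# Hypothetical sample entries for illustration only
sample = [
    "a.wav|some text|0",
    "b.wav|some text|",  # empty speaker field, int('') raises ValueError
    "c.wav|some text",   # third field missing entirely
]
for number, line in find_bad_lines(sample):
    print(f"line {number}: {line!r}")
```

To check a real file, pass the result of `open("train_list.txt", encoding="utf-8").readlines()` to the function.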

MMMMichaelzhang commented 2 years ago

train_list.txt val_list.txt

I changed "speaker_id = int(speaker_id)" to "speaker_id = 0", and when I train I get:

[train]: 0%| | 0/5 [00:00<?, ?it/s]nan nan nan nan nan nan nan nan nan nan nan nan [train]: 100%|██████████| 5/5 [00:01<00:00, 3.33it/s] [eval]: 0%| | 0/3 [00:00<?, ?it/s]nan nan nan nan [eval]: 100%|██████████| 3/3 [00:01<00:00, 2.79it/s] --- epoch 171 --- train/loss : 6.1483 train/ctc : 2.4037 train/s2s : 3.7445 train/learning_rate: 0.0005 eval/ctc : 5.6797 eval/s2s : 5.0921 eval/loss : 10.7718 eval/wer : 0.8606 eval/acc : 0.1831

Here is my train.log: train.log

How do I test the model? Thank you very much. @yl4579

yl4579 commented 2 years ago

I believe something is wrong with your labels. The loss should not be NaN and the WER should not be this high after 170 epochs of training. Could you discuss it with @Charlottecuc? It looks like she could train on this dataset with no problem. It also looks like you have created a very large number of tokens (420), and they are syllables rather than phonemes.
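To see at a glance whether the WER is moving, the per-epoch summaries can be pulled out of the log text. This is a rough sketch that assumes the summary format seen in the log pasted in this thread (`--- epoch N ---` followed by `eval/wer : X`); the helper name `wer_by_epoch` is made up for illustration.

```python
import re

# Matches an epoch header and the first eval/wer value that follows it.
EPOCH_RE = re.compile(r"--- epoch (\d+) ---.*?eval/wer\s*:\s*([0-9.]+)", re.S)


def wer_by_epoch(log_text):
    """Return a list of (epoch, wer) pairs found in the log text."""
    return [(int(epoch), float(wer)) for epoch, wer in EPOCH_RE.findall(log_text)]


# Two summaries copied from the log above, shortened to the relevant fields
sample_log = (
    "--- epoch 170 --- train/loss : 6.1213 eval/loss : 10.7443 "
    "eval/wer : 0.8602 eval/acc : 0.1846 "
    "--- epoch 171 --- train/loss : 6.1483 eval/loss : 10.7718 "
    "eval/wer : 0.8606 eval/acc : 0.1831"
)
for epoch, wer in wer_by_epoch(sample_log):
    print(epoch, wer)
```

A WER that stays near 0.86 across dozens of epochs, as in this log, points to a problem in the labels rather than to a lack of training time.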

Charlottecuc commented 2 years ago

@MMMMichaelzhang The WER should be lower than 0.2 after about 20 epochs. Could you print your final text tensors in meldataset.py? The text tensor and the corresponding indices should match the sentences in your training data file, so you can check whether something is wrong in your preprocessing steps. Besides, it seems that your word.dict file is not correct. The dict file should cover all possible Mandarin phonemes (e.g. …… …… ta1 t a1 ta2 t a2 ta3 t a3 ta4 t a4 ta5 t a5 tai1 t ai1 tai2 t ai2 tai3 t ai3 tai4 t ai4 tai5 t ai5 tan1 t an1 tan2 t an2 tan3 t an3 tan4 t an4 tan5 t an5 tang1 t ang1 tang2 t ang2 tang3 t ang3 …… ……). I suggest you add tones because the WER will be higher if you delete them.
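One way to run this check is a round trip: encode a sentence into indices, decode the indices back, and compare with the original tokens. The sketch below uses a tiny in-memory dictionary; the `vocab` contents and the `encode`/`decode` helpers are hypothetical stand-ins, not the repository's own text cleaner or dictionary file.

```python
def encode(tokens, token_to_index):
    """Map tokens to indices, failing loudly on tokens the dictionary lacks."""
    missing = [t for t in tokens if t not in token_to_index]
    if missing:
        raise KeyError(f"tokens missing from dictionary: {missing}")
    return [token_to_index[t] for t in tokens]


def decode(indices, token_to_index):
    """Map indices back to tokens using the inverted dictionary."""
    index_to_token = {i: t for t, i in token_to_index.items()}
    return [index_to_token[i] for i in indices]


# Hypothetical four-entry dictionary for illustration only
vocab = {"<pad>": 0, "t": 1, "a1": 2, "ang2": 3}
sentence = "t a1 t ang2".split()

indices = encode(sentence, vocab)
print(indices)                 # the values that would go into the text tensor
print(decode(indices, vocab))  # should reproduce the original tokens
```

If the decoded tokens differ from the sentence in the list file, or `encode` reports missing tokens, the preprocessing or the dictionary is the likely culprit.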

Charlottecuc commented 2 years ago

@MMMMichaelzhang The WER should be lower than 0.2 after about 20 epochs. Could you print your final text tensors in meldataset.py? The text tensor and the corresponding indices should match the sentences in your training data file, so you can check whether something is wrong in your preprocessing steps. Besides, it seems that your word.dict file is not correct. The dict file should cover all possible Mandarin pinyins (e.g. …… …… ta1 t a1 ta2 t a2 ta3 t a3 ta4 t a4 ta5 t a5 tai1 t ai1 tai2 t ai2 tai3 t ai3 tai4 t ai4 tai5 t ai5 tan1 t an1 tan2 t an2 tan3 t an3 tan4 t an4 tan5 t an5 tang1 t ang1 tang2 t ang2 tang3 t ang3 …… ……). I suggest you add tones because the WER will be higher if you delete them.

However, if you add every possible Mandarin pinyin syllable as its own token, there will be too many tokens to learn. A good choice is therefore to split each pinyin syllable into phonemes.
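A simple way to do the split is to peel a known initial off the front of each syllable and keep the rest, with its tone digit, as the final. This is only a sketch under stated assumptions: the `INITIALS` list, the choice to treat `y` and `w` as initials, and the function names are illustrative, and rare syllabic readings such as `ng2` are not handled.

```python
# Two-letter initials come first so that "zh" is tried before "z".
INITIALS = (
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
    "j", "q", "x", "r", "z", "c", "s", "y", "w",
)


def split_syllable(syllable):
    """Split one toned pinyin syllable into [initial, final] or [final]."""
    for initial in INITIALS:
        rest = syllable[len(initial):]
        # Only split when a real final (starting with a letter) remains.
        if syllable.startswith(initial) and rest and rest[0].isalpha():
            return [initial, rest]
    return [syllable]  # zero-initial syllables such as "a1" or "er2"


def split_pinyin(text):
    """Split a space-separated pinyin sentence into a flat phoneme list."""
    phonemes = []
    for syllable in text.split():
        phonemes.extend(split_syllable(syllable))
    return phonemes


print(" ".join(split_pinyin("ta1 tang2 zhong1 a1")))
```

With this scheme the token set is the initials plus the toned finals, which is far smaller than one token per toned syllable.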

Kristopher-Chen commented 2 years ago

@MMMMichaelzhang, is there a tool to convert pinyin to phonemes?

Kristopher-Chen commented 2 years ago

Is this format suitable? image

MMMMichaelzhang commented 2 years ago

Thanks for your reply. It helps a lot. I am trying to set it up again. @Charlottecuc @yl4579

MMMMichaelzhang commented 2 years ago

@Kristopher-Chen I didn't find any tool to convert pinyin to phonemes. I just split each syllable into an array, for example like this, with speaker_id set to 0:

/media/mike/yys/data_asr/SSB00800056.wav|w o3 m en1 c an1 j ia1 g uo2 x u3 d uo1 zh ong1 d a4 h uo3 d ong1|0

MMMMichaelzhang commented 2 years ago

Screenshot from 2022-06-18 08-27-07 word_index_dict.txt train_list.txt val_list.txt train.log

My training loss became negative and I don't know why. @yl4579 @Charlottecuc

yl4579 commented 2 years ago

@MMMMichaelzhang This is expected, see #4

Kristopher-Chen commented 2 years ago

@yl4579 Something seems off with the eval loss, and although acc is quite high, the WER is almost 45% in my case. I used the dict with tones (1-5). image

yl4579 commented 2 years ago

@Kristopher-Chen Your model overfits very badly: your evaluation loss starts to increase after the 40th epoch. You may want to add more data or use data augmentation. An ideal training curve should look like the one in the reply above yours.
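The turning point can also be found programmatically from the recorded evaluation losses. The sketch below picks the epoch with the lowest evaluation loss and applies a patience-based stopping check. Early stopping is a different technique from the data-side fixes suggested above and is not something proposed in this thread; the function names and the loss values are hypothetical.

```python
def best_epoch(eval_losses):
    """Return (epoch, loss) for the lowest evaluation loss; epochs are 1-based."""
    index = min(range(len(eval_losses)), key=eval_losses.__getitem__)
    return index + 1, eval_losses[index]


def should_stop(eval_losses, patience):
    """True if none of the last `patience` epochs beat the earlier best.

    Assumes patience >= 1.
    """
    if len(eval_losses) <= patience:
        return False
    best_before = min(eval_losses[:-patience])
    return min(eval_losses[-patience:]) >= best_before


# Hypothetical evaluation losses: improving at first, then rising again
losses = [3.0, 2.0, 1.5, 1.6, 1.8, 2.1]
print(best_epoch(losses))
print(should_stop(losses, patience=3))
```

Keeping the checkpoint from the best epoch limits the damage from overfitting, but it does not replace more data or augmentation.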

Kristopher-Chen commented 2 years ago

@MMMMichaelzhang how many hours of data did you use?

MMMMichaelzhang commented 2 years ago

about 20 hours @Kristopher-Chen

Kristopher-Chen commented 2 years ago

It seems that too little data was used. LibriTTS includes over 500 hours of data.

Charlottecuc commented 2 years ago

More training data is needed. I used around 400 hours of data and the WER can reach about 0.08 after epoch 80.

Kristopher-Chen commented 2 years ago

Yes, thank you! I'm trying to use more training data. BTW, which open-source dataset are you using?

superhg commented 2 years ago

@Charlottecuc Did you add a space token between pinyin syllables, like this: ['b', 'iao1', ' ', 'g', 'an1', ' ', 'f', 'ang2', ' ', 'q', 'i3', ' ', 'b', 'i4', ' ', 'r', 'an2', ' ', 't', 'iao2', ' ', 'zh', 'eng3', ' ', 'sh', 'iii4', ' ', 'ch', 'ang3', ' ', 'zh', 'an4', ' ', 'l', 've4']