pythonlessons / mltu

Machine Learning Training Utilities (for TensorFlow and PyTorch)
MIT License

train.py giving error on custom dataset #36

Closed salman1851 closed 8 months ago

salman1851 commented 8 months ago

Hi! I am trying to fine-tune the wav2vec2 model from your "10_wav2vec2_torch" tutorial. As far as I know, my dataset is in a similar format to the LJ Speech Dataset that you are using as an example. There is a 'wavs' folder which contains the audio files, and a 'metadata.csv' file that has rows of pipe-separated transcriptions. I have been able to successfully run the train.py script on the default dataset (LJ Speech Dataset), but when I use my own dataset, I get this output on the terminal. Am I missing something?

Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2ForCTC: ['wav2vec2.encoder.pos_conv_embed.conv.weight_v', 'wav2vec2.encoder.pos_conv_embed.conv.weight_g']
- This IS expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized because the shapes did not match:
- lm_head.bias: found shape torch.Size([32]) in the checkpoint and torch.Size([29]) in the model instantiated
- lm_head.weight: found shape torch.Size([32, 768]) in the checkpoint and torch.Size([29, 768]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Cuda Device Available.
INFO:WarmupCosineDecay:Epoch 1 - Learning Rate: 1e-08
  0%|                                                                                                                  | 0/18 [00:00<?, ?it/s]/home/ee/anaconda3/lib/python3.9/site-packages/mltu/transformers.py:234: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  return padded_audios, np.array(label)
Epoch 1 - loss: 25.1576 - CER: 4.2681 - WER: 1.0000: 100%|████████████████████████████████████████████████████| 18/18 [00:08<00:00,  2.06it/s]
  0%|                                                                                                                   | 0/2 [00:00<?, ?it/s]
Exception in thread Thread-14 (the identical traceback is raised in several worker threads: Thread-14, Thread-15, Thread-16, Thread-18, Thread-19, Thread-21, Thread-22, Thread-23):
Traceback (most recent call last):
  File "/home/ee/anaconda3/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/ee/anaconda3/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ee/anaconda3/lib/python3.9/site-packages/mltu/torch/dataProvider.py", line 245, in worker_function
    result = self.function(data_index)
  File "/home/ee/anaconda3/lib/python3.9/site-packages/mltu/dataProvider.py", line 287, in __getitem__
    batch_data, batch_annotations = batch_postprocessor(batch_data, batch_annotations)
  File "/home/ee/anaconda3/lib/python3.9/site-packages/mltu/transformers.py", line 222, in __call__
    max_len = max([len(a) for a in audio])
ValueError: max() arg is an empty sequence
salman1851 commented 8 months ago

metadata.csv

This is what my labels file looks like.
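For reference, the format follows the LJ Speech convention: pipe-separated rows with three fields (file name without the .wav extension, raw transcription, normalized transcription). The rows below are made-up placeholders just to show the layout, not my actual data:

audio_0001|Hello, this is a sample.|hello this is a sample
audio_0002|Another short utterance.|another short utterance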

salman1851 commented 8 months ago

I increased the size of my dataset by a factor of 30; I now have around 4000 examples. I figured the code was not getting enough examples in the training pipeline. The terminal is not showing the same error, but it is still stuck at the first epoch. Here's the output.

Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2ForCTC: ['wav2vec2.encoder.pos_conv_embed.conv.weight_v', 'wav2vec2.encoder.pos_conv_embed.conv.weight_g']
- This IS expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized because the shapes did not match:
- lm_head.bias: found shape torch.Size([32]) in the checkpoint and torch.Size([29]) in the model instantiated
- lm_head.weight: found shape torch.Size([32, 768]) in the checkpoint and torch.Size([29, 768]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Cuda Device Available.
INFO:WarmupCosineDecay:Epoch 1 - Learning Rate: 1e-08
  0%|                                                   | 0/527 [00:00<?, ?it/s]/home/ee/anaconda3/lib/python3.9/site-packages/mltu/transformers.py:234: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  return padded_audios, np.array(label)
Epoch 1 - loss: 27.2745 - CER: 3.4600 - WER: 1.0620:   1%| | 6/527 [00:07<05:46,

I would really appreciate your help.

pythonlessons commented 8 months ago

Hey, it seems that it started training. If my example works, this should also work for you. I assume you're training on a GPU? Try decreasing batch_size to 2 and check whether it trains; maybe it's just really slow.

salman1851 commented 8 months ago

I changed the batch_size to 2, and it seems to have reduced the total time per epoch, but the issue is still there. The examples in the first epoch run for the first few seconds or so, and then the pipeline gets stuck. Yes, I am training on an RTX 3090; the CUDA device gets detected by PyTorch in the code. And your example worked without a hitch.

pythonlessons commented 8 months ago

Try running something like this in debug mode:

for data in tqdm(data_provider):
    pass  # or inspect each batch here

There might be a problem where the librosa library hangs while trying to read your audio. What OS do you use? Try to check whether you can read your audio files or not.
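A minimal check could look something like the sketch below (the wavs path is a placeholder; point it at your dataset's wavs folder):

import os
from tqdm import tqdm
import librosa

wavs_path = "Datasets/comcast_xfinity/wavs"  # placeholder; use your own path

for file_name in tqdm(sorted(os.listdir(wavs_path))):
    if not file_name.endswith(".wav"):
        continue
    path = os.path.join(wavs_path, file_name)
    # librosa.load returns (samples, sample_rate); a hang or exception
    # here identifies the offending file
    audio, sample_rate = librosa.load(path, sr=16000)
    if len(audio) == 0:
        print("empty audio:", path)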

salman1851 commented 8 months ago

When I run the tqdm code you mentioned above, the console throws a TypeError.

import tqdm
for data in tqdm(data_provider):
    print(data)

Traceback (most recent call last):

  File "/tmp/ipykernel_148805/4044221648.py", line 1, in <module>
    for data in tqdm(data_provider):

TypeError: 'module' object is not callable

I use Ubuntu 20.04. Here are the specifics:

NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

I tried reading all the audio files in the example dataset and my own dataset with the librosa library in debug mode and the system has no problem reading the files.

pythonlessons commented 8 months ago

Ubuntu should be fine. You need to do the following:

from tqdm import tqdm
for data in tqdm(data_provider):
    print(data)

If you can read data from the data provider, we need to investigate further (best in debug mode, to find the cause).

salman1851 commented 8 months ago

So, I decreased the batch size to 1 on the original dataset (150 training examples) and it's working now. I think there's an optimal batch size that needs to be set for different dataset sizes. Anyway, thanks for your help!

pythonlessons commented 8 months ago

That's strange, because I also trained on an RTX 3090 with a batch_size of 8 and everything was fine. Run the nvtop command in a terminal and check how much GPU RAM is consumed during training.

salman1851 commented 8 months ago

This is what the nvtop output looks like when I run the training script on my larger dataset (4560 examples) at a batch size of 8. The console output gets stuck after a few examples in the first epoch.

[screenshot: nvtop output during training]

salman1851 commented 8 months ago

Actually, I mistyped earlier. My GPU model is 2080, not 3090.

pythonlessons commented 8 months ago

Yes, it seems that the CPU is idle and not doing anything for some reason...

It would be really interesting to find where the problem is. Maybe you could upload your training script with part of the dataset, so I could check it out?

salman1851 commented 8 months ago

This is the training script. I only changed the path of the dataset.

import os
import tarfile
import pandas as pd
from tqdm import tqdm
from io import BytesIO
from urllib.request import urlopen

import torch
from torch import nn
from transformers import Wav2Vec2ForCTC
import torch.nn.functional as F

from mltu.torch.model import Model
from mltu.torch.losses import CTCLoss
from mltu.torch.dataProvider import DataProvider
from mltu.torch.metrics import CERMetric, WERMetric
from mltu.torch.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard, Model2onnx, WarmupCosineDecay
from mltu.augmentors import RandomAudioNoise, RandomAudioPitchShift, RandomAudioTimeStretch

from mltu.preprocessors import AudioReader
from mltu.transformers import LabelIndexer, LabelPadding, AudioPadding

from configs import ModelConfigs

configs = ModelConfigs()

def download_and_unzip(url, extract_to="Datasets", chunk_size=1024*1024):
    http_response = urlopen(url)

    data = b""
    iterations = http_response.length // chunk_size + 1
    for _ in tqdm(range(iterations)):
        data += http_response.read(chunk_size)

    tarFile = tarfile.open(fileobj=BytesIO(data), mode="r|bz2")
    tarFile.extractall(path=extract_to)
    tarFile.close()

# dataset_path = os.path.join("Datasets", "LJSpeech-1.1")
# if not os.path.exists(dataset_path):
#     download_and_unzip("https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2", extract_to="Datasets")

# dataset_path = "Datasets/comcast_xfinity"
dataset_path = "/media/ee/New Volume/mltu/Tutorials/10_wav2vec2_torch/Datasets/comcast_xfinity_rep"
metadata_path = dataset_path + "/metadata.csv"
wavs_path = dataset_path + "/wavs/"

# Read metadata file and parse it
metadata_df = pd.read_csv(metadata_path, sep="|", header=None, quoting=3)
dataset = []
vocab = [' ', "'", 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
for file_name, transcription, normalized_transcription in metadata_df.values.tolist():
    # path = f"Datasets/comcast_xfinity/wavs/{file_name}.wav"
    path = f"/media/ee/New Volume/mltu/Tutorials/10_wav2vec2_torch/Datasets/comcast_xfinity_rep/wavs/{file_name}.wav"
    new_label = "".join([l for l in normalized_transcription.lower() if l in vocab])
    dataset.append([path, new_label])

# Create a data provider for the dataset
data_provider = DataProvider(
    dataset=dataset,
    skip_validation=True,
    # batch_size=configs.batch_size,
    batch_size=8,
    data_preprocessors=[
        AudioReader(sample_rate=16000),
        ],
    transformers=[
        LabelIndexer(vocab),
        ],
    use_cache=False,
    batch_postprocessors=[
        AudioPadding(max_audio_length=configs.max_audio_length, padding_value=0, use_on_batch=True),
        LabelPadding(padding_value=len(vocab), use_on_batch=True),
    ],
    # batch_postprocessors=[
    #     AudioPadding(max_audio_length=246000, padding_value=0, use_on_batch=True),
    #     LabelPadding(padding_value=len(vocab), use_on_batch=True),
    # ],
    use_multiprocessing=True,
    max_queue_size=10,
    workers=configs.train_workers,
    # workers=20,
)
train_dataProvider, test_dataProvider = data_provider.split(split=0.9)

# for data in tqdm(data_provider):
#     print(data)

# train_dataProvider.augmentors = [
#         RandomAudioNoise(), 
#         RandomAudioPitchShift(), 
#         RandomAudioTimeStretch()
#     ]

vocab = sorted(vocab)
configs.vocab = vocab
configs.save()

class CustomWav2Vec2Model(nn.Module):
    def __init__(self, hidden_states, dropout_rate=0.2, **kwargs):
        super(CustomWav2Vec2Model, self).__init__( **kwargs)
        pretrained_name = "facebook/wav2vec2-base-960h"
        self.model = Wav2Vec2ForCTC.from_pretrained(pretrained_name, vocab_size=hidden_states, ignore_mismatched_sizes=True)
        self.model.freeze_feature_encoder() # this part does not need to be fine-tuned

    def forward(self, inputs):
        output = self.model(inputs, attention_mask=None).logits
        # Apply softmax
        output = F.log_softmax(output, -1)
        return output

custom_model = CustomWav2Vec2Model(hidden_states = len(vocab)+1)

# put on cuda device if available
if torch.cuda.is_available():
    print('Cuda Device Available.')
    custom_model = custom_model.cuda()

# create callbacks
warmupCosineDecay = WarmupCosineDecay(
    lr_after_warmup=configs.lr_after_warmup,
    warmup_epochs=configs.warmup_epochs,
    decay_epochs=configs.decay_epochs,
    final_lr=configs.final_lr,
    initial_lr=configs.init_lr,
    verbose=True,
)
tb_callback = TensorBoard(configs.model_path + "/logs")
earlyStopping = EarlyStopping(monitor="val_CER", patience=16, mode="min", verbose=1)
modelCheckpoint = ModelCheckpoint(configs.model_path + "/model.pt", monitor="val_CER", mode="min", save_best_only=True, verbose=1)
model2onnx = Model2onnx(
    saved_model_path=configs.model_path + "/model.pt",
    input_shape=(1, configs.max_audio_length), 
    verbose=1,
    metadata={"vocab": configs.vocab},
    dynamic_axes={"input": {0: "batch_size", 1: "sequence_length"}, "output": {0: "batch_size", 1: "sequence_length"}}
)

# create model object that will handle training and testing of the network
model = Model(
    custom_model, 
    loss = CTCLoss(blank=len(configs.vocab), zero_infinity=True),
    optimizer = torch.optim.AdamW(custom_model.parameters(), lr=configs.init_lr, weight_decay=configs.weight_decay),
    metrics=[
        CERMetric(configs.vocab), 
        WERMetric(configs.vocab)
    ],
    mixed_precision=configs.mixed_precision,
)

# Save training and validation datasets as csv files
train_dataProvider.to_csv(os.path.join(configs.model_path, "train.csv"))
test_dataProvider.to_csv(os.path.join(configs.model_path, "val.csv"))

model.fit(
    train_dataProvider, 
    test_dataProvider, 
    epochs=configs.train_epochs, 
    callbacks=[
        warmupCosineDecay, 
        tb_callback, 
        earlyStopping,
        modelCheckpoint, 
        model2onnx
    ]
)

Here is a sample of the dataset.

sample_dataset.zip

pythonlessons commented 8 months ago

Hey, I tested it and it seems there is some issue related to librosa. When using multiprocessing it doesn't log an error, which is why it was freezing for you. I'll make a fix and release a version with the bug fix. I'll let you know when you're good to go.
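As a general idea (a hypothetical guard, not the exact fix I made): wrapping the read in a try/except with explicit logging makes a bad file visible instead of silently freezing a worker:

import logging
import librosa

def safe_audio_read(path, sample_rate=16000):
    # Hypothetical guard, not mltu's actual code: log the failure and
    # return None so the caller can skip the sample instead of hanging
    try:
        return librosa.load(path, sr=sample_rate)[0]
    except Exception:
        logging.exception("Failed to read audio: %s", path)
        return None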

pythonlessons commented 8 months ago

Try pip install mltu==1.1.6 and let me know if everything is working.

salman1851 commented 8 months ago

I installed all the requirements in a new conda environment (Python 3.8) with mltu==1.1.6 and ran the training script. The pipeline is still getting stuck, this time at the validation step. Here's the console output.

Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2ForCTC: ['wav2vec2.encoder.pos_conv_embed.conv.weight_v', 'wav2vec2.encoder.pos_conv_embed.conv.weight_g']
- This IS expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized because the shapes did not match:
- lm_head.bias: found shape torch.Size([32]) in the checkpoint and torch.Size([29]) in the model instantiated
- lm_head.weight: found shape torch.Size([32, 768]) in the checkpoint and torch.Size([29, 768]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Cuda Device Available.
INFO:WarmupCosineDecay:Epoch 1 - Learning Rate: 1e-08
Epoch 1 - loss: 26.0389 - CER: 4.3012 - WER: 1.0554: 100%|█| 18/18 [00:10<00:00,
  0%|                                                     | 0/2 [00:00<?, ?it/s]
Exception in thread Thread-14 (the identical traceback is raised, interleaved, in worker threads Thread-14 through Thread-17 and Thread-20 through Thread-23):
Traceback (most recent call last):
  File "/home/ee/anaconda3/envs/mltu/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/ee/anaconda3/envs/mltu/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ee/anaconda3/envs/mltu/lib/python3.8/site-packages/mltu/torch/dataProvider.py", line 245, in worker_function
    result = self.function(data_index)
  File "/home/ee/anaconda3/envs/mltu/lib/python3.8/site-packages/mltu/dataProvider.py", line 287, in __getitem__
    batch_data, batch_annotations = batch_postprocessor(batch_data, batch_annotations)
  File "/home/ee/anaconda3/envs/mltu/lib/python3.8/site-packages/mltu/transformers.py", line 227, in __call__
    max_len = max([len(a) for a in audio])
ValueError: max() arg is an empty sequence
          val_loss: 24.1364 - val_CER: 1.7218 - val_WER: 1.0000: 100%|█| 2/2 [00
pythonlessons commented 8 months ago

Thanks, there was another bug in my code; you received this error because of the small validation dataset. Now if you pip install mltu==1.1.7, this should be solved. I appreciate that you revealed these cases to me :)
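To illustrate the failure mode with a hypothetical sketch (not the actual mltu batching code): with a tiny validation split there are more worker threads than batches, and any worker handed an out-of-range batch index slices an empty batch, so the padding step's max() raises exactly this error:

dataset = list(range(15))  # e.g. the 10% validation split of 150 samples
batch_size = 8             # gives only 2 real batches (8 + 7 samples)

for worker_index in range(10):  # say 10 worker threads each request a batch
    batch = dataset[worker_index * batch_size:(worker_index + 1) * batch_size]
    print(worker_index, len(batch))
    # workers 2..9 receive an empty slice here; in the padding step,
    # max([len(a) for a in batch]) on that empty list raises
    # "ValueError: max() arg is an empty sequence"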

salman1851 commented 8 months ago

I upgraded to mltu==1.1.7 and everything is working perfectly, for both small and large datasets, with the default batch size. Thank you for taking the time to fix the bug.

pythonlessons commented 8 months ago

Thank you for showing me these bugs; because of this, others won't run into the same issues :)