jonatasgrosman / huggingsound

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
MIT License
430 stars 42 forks

CUDA error #95

Open Symfomany opened 1 year ago

Symfomany commented 1 year ago

Hello,

During fine-tuning I get the following error:

RuntimeError: CUDA error: no kernel image is available for execution on the device CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Here is my code:

```python
import torch
import shutil
import os

os.environ['CUDA_LAUNCH_BLOCKING'] = "1"

from huggingsound import TrainingArguments, ModelArguments, SpeechRecognitionModel, TokenSet

device = "cuda" if torch.cuda.is_available() else "cpu"

model = SpeechRecognitionModel("jonatasgrosman/wav2vec2-large-xlsr-53-french", device=device)
output_dir = "/content/drive/MyDrive/wav-example/output2"

# empty the output directory before training
for filename in os.listdir(output_dir):
    file_path = os.path.join(output_dir, filename)
    try:
        if os.path.isfile(file_path) or os.path.islink(file_path):
            os.unlink(file_path)
        elif os.path.isdir(file_path):
            shutil.rmtree(file_path)
    except Exception as e:
        print(f"Failed to delete {file_path}. Reason: {e}")

# first of all, you need to define your model's token set
# however, the token set is only needed for non-finetuned models
# if you pass a new token set for an already finetuned model, it'll be ignored during training
# Note that adding these tokens is crucial, because their absence could hurt
# the model's performance or even cause errors during training or inference.
tokens = [
    "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
    "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
    "'", "", "|", "", "", "",
]
token_set = TokenSet(tokens)

# define your train/eval data
train_data = [
    {"path": "/content/drive/MyDrive/wav-example/audio4.wav", "transcription": "bonjour je m'appelle Manuel je développe sous Androïd en Kotlin je fais des applications mobiles pour la société forestière je travaille dans la classification et reconnaissance vocale dans les essences et dans le domaine de la foresterie merci"},
]
eval_data = [
    {"path": "/content/drive/MyDrive/wav-example/audio5.wav", "transcription": "je m'appelle Julien je développe sous Androïd fullstack pour la société forestière"},
]

# the lines below will load the training and model arguments objects,
# you can check the source code (huggingsound.trainer.TrainingArguments and
# huggingsound.trainer.ModelArguments) to see all the available arguments
training_args = TrainingArguments(
    learning_rate=3e-4,
    max_steps=1000,
    eval_steps=200,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
)
model_args = ModelArguments(
    activation_dropout=0.1,
    hidden_dropout=0.1,
)

evaluation = model.evaluate(eval_data)

print(evaluation)

# and finally, fine-tune your model
model.finetune(
    output_dir,
    train_data=train_data,
    eval_data=eval_data,  # the eval_data is optional
    token_set=token_set,
    training_args=training_args,
    model_args=model_args,
)
```

This is running on Google Colab Pro+, on an NVIDIA A100 GPU with CUDA.


Symfomany commented 1 year ago

The problem is definitely specific to the A100 GPU, because on a V100 everything works fine!
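One possible explanation, offered as an assumption rather than something confirmed in this thread, is that the installed PyTorch build does not ship kernels for the A100's compute capability (8.0), while it does for the V100 (7.0). A minimal diagnostic sketch follows. `parse_arch` and `device_is_covered` are hypothetical helper names introduced here; only `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` are real PyTorch calls. The coverage rule is a simplification: an `sm_XY` binary is treated as usable on a device with the same major version and an equal or higher minor version, and a `compute_XY` (PTX) entry as usable on any equal or newer device.

```python
def parse_arch(arch):
    """Split an arch string such as 'sm_80' into ('sm', 8, 0).

    Returns None if the string does not look like '<kind>_<digits>'.
    """
    kind, _, suffix = arch.partition("_")
    digits = "".join(ch for ch in suffix if ch.isdigit())
    if len(digits) < 2:
        return None
    # The last digit is the minor version, everything before it the major.
    return kind, int(digits[:-1]), int(digits[-1])


def device_is_covered(capability, arch_list):
    """Rough check: does arch_list contain code usable on this device?

    capability is a (major, minor) tuple, arch_list a list of strings
    in the format returned by torch.cuda.get_arch_list().
    """
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue
        kind, a_major, a_minor = parsed
        # Compiled binaries: same major version, minor not above the device's.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX can be JIT-compiled for any equal or newer architecture.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("torch:", torch.__version__, "| CUDA build:", torch.version.cuda)
        print("device capability:", cap)
        print("compiled architectures:", archs)
        print("device covered:", device_is_covered(cap, archs))
    else:
        print("PyTorch with a visible CUDA device is required for the live check.")
```

If the last line prints `False` on the A100 runtime, installing a PyTorch build whose architecture list includes `sm_80` would be the first thing to try. If it prints `True`, the cause lies elsewhere and this check does not explain the error.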

Symfomany commented 1 year ago

Any ideas?

Symfomany commented 1 year ago

On the A100 GPU, sorry.