Open sully90 opened 2 years ago
When trying to run the notebook https://github.com/huggingface/blog/blob/main/notebooks/17_fine_tune_wav2vec2_for_english_asr.ipynb on a GCP Notebook instance, I get the error below when calling trainer.train(). CUDA is enabled and the model is successfully loaded onto the GPU. Appreciate any help!
cc @patrickvonplaten
Hey @sully90,
The notebook works on the colab for me, but I haven't tested it on GCP.
From the error message, it looks like there is a problem with fp16. Could you, as a first step, maybe try to disable fp16? E.g., remove the fp16=True statement?
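For reference, a minimal sketch of that change, assuming the notebook's usual TrainingArguments setup (the other argument values here are only illustrative):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wav2vec2-base-timit-demo",  # hypothetical output/repo name
    per_device_train_batch_size=32,
    evaluation_strategy="steps",
    num_train_epochs=30,
    # fp16=True,  # disabled for debugging: train in full fp32 precision instead
    learning_rate=1e-4,
)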
I converted the notebook to a .py file and am facing the same issue. I tried removing fp16=True but the issue persists. @patrickvonplaten please help to solve this issue.
@sully90 Did you solve this issue?
Could you guys make me a reproducible colab so that I can reproduce the error? :-) This would be great!
I have the same issue. It worked a couple of days ago with no changes done to the code.
Getting the same issue on Colab without any changes to the notebook, i.e. the issue occurs on the original notebook. Sharing the notebook: https://colab.research.google.com/drive/18uGFjmoTVEKDI-2Nd9kgoSwQG4H-0Pzx?usp=sharing
Hey, getting the same issue when running the standard notebook.
Please assist.
I can reproduce now! Thanks for telling me!
Not 100% sure what the error is for now; will take a look tomorrow!
Should be fixed now: https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Fine_tuning_Wav2Vec2_for_English_ASR.ipynb
Can you try it out? :-)
Works now; thanks so much @patrickvonplaten. Your contribution to the open-source ASR community is outstanding!
Hey, could you explain exactly what the problem was? I suddenly received the same error with no changes to my code, so I am wondering whether the same fix applies to my wav2vec project as well. I'm not sure if this was caused by a recent update.
The problem was that the Transformers version being used was too old. I didn't dive super deep into it though. Maybe updating your Transformers version will do the trick, @jovan3600?
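For example, from a Colab cell (a standard pip upgrade, nothing notebook-specific):

!pip install --upgrade transformers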
Yeah I tried that but unfortunately it didn't change anything. I'm not sure what else could be the problem. All notebooks I made that use wav2vec have the same error now :(
Hmmm, not really sure what could be the problem. In Transformers versions > 4.17, whenever the runtime is set to GPU (which can be checked with torch.cuda.is_available()), the Trainer should automatically put both the inputs and the model on the GPU. Could you maybe add torch.cuda.is_available() statements before the bug and see what they return?
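Something along these lines (a hypothetical debugging cell placed just before the failing call; `model` and `trainer` are the objects defined in the notebook):

import torch

print("CUDA available:", torch.cuda.is_available())
print("Model device:", next(model.parameters()).device)  # where the model weights currently live

trainer.train()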
Hi Patrick. I have the same problem. I tried updating both libraries, transformers and datasets, to the latest versions and tried adding an "if torch.cuda.is_available()" statement, but I still receive the same error (the check passes, since CUDA is available):

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

Is there a way to put the "input" onto CUDA?
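For reference, tensors can be moved to the GPU manually like this (a generic sketch with hypothetical names; in normal use the Trainer handles this placement itself):

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch = data_collator([train_dataset[0]])            # hypothetical: collate a single example
batch = {k: v.to(device) for k, v in batch.items()}  # move every tensor in the batch to the GPU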
Gently ping @sanchit-gandhi
Hey @sofidipace! Could you please share the output of:

!transformers-cli env

(run from a Colab cell)?

Hey @sanchit-gandhi, of course! The versions I installed (the latest ones, but I got the same error with previous versions as well): transformers 4.26.0, datasets 2.8.0. Here is the code: https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/speech_recognition.ipynb#scrollTo=tborvC9hx88e

Here is !transformers-cli env:

Here is the snippet:
import torch
import numpy as np
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union

from datasets import load_metric
from transformers import Wav2Vec2Processor

# `processor`, `model_checkpoint`, `repo_name` and `timit` are defined in earlier notebook cells.


@dataclass
class DataCollatorCTCWithPadding:
    processor: Wav2Vec2Processor
    padding: Union[bool, str] = True
    max_length: Optional[int] = None
    max_length_labels: Optional[int] = None
    pad_to_multiple_of: Optional[int] = None
    pad_to_multiple_of_labels: Optional[int] = None

    def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:
        # split inputs and labels since they have to be of different lengths and need
        # different padding methods
        input_features = [{"input_values": feature["input_values"]} for feature in features]
        label_features = [{"input_ids": feature["labels"]} for feature in features]

        batch = self.processor.pad(
            input_features,
            padding=self.padding,
            max_length=self.max_length,
            pad_to_multiple_of=self.pad_to_multiple_of,
            return_tensors="pt",
        )
        with self.processor.as_target_processor():
            labels_batch = self.processor.pad(
                label_features,
                padding=self.padding,
                max_length=self.max_length_labels,
                pad_to_multiple_of=self.pad_to_multiple_of_labels,
                return_tensors="pt",
            )

        # replace padding with -100 to ignore loss correctly
        labels = labels_batch["input_ids"].masked_fill(labels_batch.attention_mask.ne(1), -100)
        batch["labels"] = labels
        return batch


data_collator = DataCollatorCTCWithPadding(processor=processor, padding=True)

wer_metric = load_metric("wer")


def compute_metrics(pred):
    pred_logits = pred.predictions
    pred_ids = np.argmax(pred_logits, axis=-1)

    pred.label_ids[pred.label_ids == -100] = processor.tokenizer.pad_token_id

    pred_str = processor.batch_decode(pred_ids)
    # we do not want to group tokens when computing the metrics
    label_str = processor.batch_decode(pred.label_ids, group_tokens=False)

    wer = wer_metric.compute(predictions=pred_str, references=label_str)
    return {"wer": wer}


from transformers import AutoModelForCTC

model = AutoModelForCTC.from_pretrained(
    model_checkpoint,
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
)

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=repo_name,
    group_by_length=True,
    per_device_train_batch_size=32,
    evaluation_strategy="steps",
    num_train_epochs=30,
    fp16=True,
    gradient_checkpointing=True,
    save_steps=500,
    eval_steps=500,
    logging_steps=500,
    learning_rate=1e-4,
    weight_decay=0.005,
    warmup_steps=1000,
    save_total_limit=2,
    push_to_hub=True,
)

from transformers import Trainer

trainer = Trainer(
    model=model,
    data_collator=data_collator,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=timit["train"],
    eval_dataset=timit["test"],
    tokenizer=processor.feature_extractor,
)

if torch.cuda.is_available():
    trainer.train()  # <------ ERROR
I just got it to run. I commented out the pinned versions, i.e.:

# !pip install datasets==1.14
# !pip install transformers==4.11.3

and got myself a Hugging Face write-role token.

And FYI, you now have to download TIMIT manually ;)
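For reference, once you have obtained TIMIT (e.g. from the LDC) and unpacked it locally, you can point load_dataset at the local copy; the path below is a hypothetical placeholder:

from datasets import load_dataset

# "timit_asr" requires a manual download; data_dir points at the local unpacked copy
timit = load_dataset("timit_asr", data_dir="/path/to/TIMIT")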
Hey @sofidipace, thanks for sharing your code! Can you confirm that you are able to run the notebook by commenting out the pinned transformers/datasets versions?
Well, I ran into the problem with downloading TIMIT. That's why I switched to https://colab.research.google.com/github/patrickvonplaten/notebooks/blob/master/Fine_Tune_XLS_R_on_Common_Voice.ipynb#scrollTo=9fRr9TG5pGBl
Cool! There are over 150 datasets on the Hub you can use for ASR: https://huggingface.co/datasets?task_categories=task_categories:automatic-speech-recognition&sort=downloads
You can just change the dataset id in the load_dataset function to whichever dataset you prefer 🚀
I would personally recommend Common Voice 11: it builds on the original Common Voice corpus with more data and speakers per language.

You just need to agree to the terms of use on the Hub: https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0

And add use_auth_token=True to load_dataset:
common_voice_train = load_dataset("mozilla-foundation/common_voice_11_0", "tr", split="train+validation", use_auth_token=True)
common_voice_test = load_dataset("mozilla-foundation/common_voice_11_0", "tr", split="test", use_auth_token=True)
Thank you very much @sanchit-gandhi
FYI, regarding:

"And add use_auth_token=True to load_dataset:"

This is not required anymore; the token is retrieved automatically if you have logged in with huggingface-cli login.
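In a notebook, a sketch of what logging in looks like (using the standard huggingface_hub helper, which stores the token for subsequent calls):

from huggingface_hub import notebook_login

notebook_login()  # prompts for a token; equivalent to `huggingface-cli login` in a terminal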