suno-ai / bark

🔊 Text-Prompted Generative Audio Model
MIT License

Trying to run with half precision gives error "RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'" #532

Open jferments opened 6 months ago

jferments commented 6 months ago

I am trying to follow the instructions for Bark in the Hugging Face Bark docs, which say that I can reduce the memory footprint by running in half precision, like this:

model = BarkModel.from_pretrained("suno/bark-small", torch_dtype=torch.float16).to('cpu')

However, when I do this, it crashes with the following error:

File ~/anaconda3/lib/python3.11/site-packages/transformers/models/bark/modeling_bark.py:195, in BarkLayerNorm.forward(self, input)
    194 def forward(self, input):
--> 195     return F.layer_norm(input, self.weight.shape, self.weight, self.bias, eps=1e-5)

File ~/anaconda3/lib/python3.11/site-packages/torch/nn/functional.py:2546, in layer_norm(input, normalized_shape, weight, bias, eps)
   2542 if has_torch_function_variadic(input, weight, bias):
   2543     return handle_torch_function(
   2544         layer_norm, (input, weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
   2545     )
-> 2546 return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)

RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

What is causing this? Is there something else I need to add to my code?

swj0418 commented 5 months ago

In model = BarkModel.from_pretrained("suno/bark-small", torch_dtype=torch.float16).to('cpu'), you are setting the data type to half-precision float. Try loading it in full precision instead: model = BarkModel.from_pretrained("suno/bark-small", torch_dtype=torch.float32).to('cpu').
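
If the same script has to run on both GPU and CPU machines, one option is to choose the dtype from the target device instead of hard-coding it. The sketch below is only an illustration: pick_dtype_name and load_bark are hypothetical helper names, not part of transformers or bark, and it assumes float16 is only worth requesting on a CUDA device. The error in the traceback shows that the CPU LayerNorm kernel in that PyTorch build has no 'Half' implementation; other PyTorch builds may behave differently.

```python
def pick_dtype_name(device: str) -> str:
    """Return the name of the torch dtype to request for a device string.

    Hypothetical helper: requests half precision only for CUDA devices and
    falls back to float32 everywhere else (for example 'cpu'), since the
    traceback above shows no 'Half' LayerNorm kernel on CPU.
    """
    return "float16" if device.startswith("cuda") else "float32"


def load_bark(device: str, checkpoint: str = "suno/bark-small"):
    """Load a Bark checkpoint with a dtype that matches the device.

    Hypothetical wrapper around BarkModel.from_pretrained. The heavy imports
    are done lazily so pick_dtype_name can be used on its own.
    """
    import torch
    from transformers import BarkModel

    dtype = getattr(torch, pick_dtype_name(device))  # torch.float16 or torch.float32
    return BarkModel.from_pretrained(checkpoint, torch_dtype=dtype).to(device)


# Example call (downloads the checkpoint on first use):
#   import torch
#   model = load_bark("cuda" if torch.cuda.is_available() else "cpu")
```

With this, load_bark("cpu") requests float32 and so avoids the LayerNormKernelImpl error, while load_bark("cuda") still gets the reduced memory footprint described in the docs.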