I was running the Colab notebook mentioned in the README. It encountered an error while running the code below:
```python
from IPython.display import Audio, display
from fam.llm.fast_inference import TTS

tts = TTS()
```
The output:
```
using dtype=float16
Fetching 6 files: 100% 6/6 [00:00<00:00, 204.76it/s]
number of parameters: 14.07M
2024-05-13 12:17:21 | INFO | DF | Loading model settings of DeepFilterNet3
2024-05-13 12:17:21 | INFO | DF | Using DeepFilterNet3 model at /root/.cache/DeepFilterNet/DeepFilterNet3
2024-05-13 12:17:21 | INFO | DF | Initializing model deepfilternet3
2024-05-13 12:17:21 | INFO | DF | Found checkpoint /root/.cache/DeepFilterNet/DeepFilterNet3/checkpoints/model_120.ckpt.best with epoch 120
2024-05-13 12:17:21 | INFO | DF | Running on device cuda:0
2024-05-13 12:17:21 | INFO | DF | Model loaded
Using device=cuda
Loading model ...
using dtype=float16
Time to load model: 19.70 seconds
Compiling...Can take up to 2 mins.

TorchRuntimeError                         Traceback (most recent call last)
<cell line: 4>()
      2 from fam.llm.fast_inference import TTS
      3
----> 4 tts = TTS()

88 frames
/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py in run_node(tracer, node, args, kwargs, nnmodule)
   1569     try:
   1570         if op == "call_function":
-> 1571             return node.target(*args, **kwargs)
   1572         elif op == "call_method":
   1573             return getattr(args[0], node.target)(*args[1:], **kwargs)

TorchRuntimeError: Failed running call_function (*(FakeTensor(..., device='cuda:0', size=(2, 16, s0, 128)), FakeTensor(..., device='cuda:0', size=(2, 16, 2048, 128), dtype=torch.float16), FakeTensor(..., device='cuda:0', size=(2, 16, 2048, 128), dtype=torch.float16)), **{'attn_mask': FakeTensor(..., device='cuda:0', size=(1, 1, s0, 2048), dtype=torch.bool), 'dropout_p': 0.0}):
Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: c10::Half and value.dtype: c10::Half instead.

from user code:
  File "/content/metavoice-src/fam/llm/fast_inference_utils.py", line 131, in prefill
    logits = model(x, spk_emb, input_pos)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 160, in forward
    x = layer(x, input_pos, mask)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 179, in forward
    h = x + self.attention(self.attention_norm(x), mask, input_pos)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/metavoice-src/fam/llm/fast_model.py", line 222, in forward
    y = F.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0)

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
```
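For context, the failure boils down to `F.scaled_dot_product_attention` requiring query, key, and value to share a dtype: here the query arrives in float32 while the KV cache is float16. Below is a minimal standalone sketch of the same mismatch and the obvious cast workaround; it is illustrative only (not the repo's code), and it uses bfloat16 so it also runs on CPU, whereas the traceback shows float16 on CUDA:

```python
import torch
import torch.nn.functional as F

# Query in float32, key/value in a half-precision dtype, mirroring the
# mismatch from the traceback (bfloat16 here so the sketch runs on CPU).
q = torch.randn(2, 16, 8, 128)
k = torch.randn(2, 16, 8, 128, dtype=torch.bfloat16)
v = torch.randn(2, 16, 8, 128, dtype=torch.bfloat16)

try:
    F.scaled_dot_product_attention(q, k, v)
except RuntimeError as e:
    # "Expected query, key, and value to have the same dtype, ..."
    print(e)

# Casting the query to the key/value dtype resolves the mismatch:
y = F.scaled_dot_product_attention(q.to(k.dtype), k, v)
print(y.dtype)  # torch.bfloat16
```

Whether the proper fix belongs in `fast_model.py` (e.g. ensuring `q` stays in the model's half-precision dtype before the attention call) is for the maintainers to say; the sketch only demonstrates the constraint the error message describes.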