coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0
35.66k stars, 4.37k forks

[Bug] AttributeError: 'int' object has no attribute 'device' #3996

Open CrackerHax opened 2 months ago

CrackerHax commented 2 months ago

Describe the bug

The example streaming code raises an error when saving the output.

To Reproduce

import os
import time
import torch
import torchaudio
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts

print("Loading model...")
config = XttsConfig()
config.load_json("/path/to/xtts/config.json")
model = Xtts.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="/path/to/xtts/", use_deepspeed=True)
model.cuda()

print("Computing speaker latents...")
gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(audio_path=["reference.wav"])

print("Inference...")
t0 = time.time()
chunks = model.inference_stream(
    "It took me quite a long time to develop a voice and now that I have it I am not going to be silent.",
    "en",
    gpt_cond_latent,
    speaker_embedding
)

wav_chunks = []
for i, chunk in enumerate(chunks):
    if i == 0:
        print(f"Time to first chunk: {time.time() - t0}")
    print(f"Received chunk {i} of audio length {chunk.shape[-1]}")
    wav_chunks.append(chunk)
wav = torch.cat(wav_chunks, dim=0)
torchaudio.save("xtts_streaming.wav", wav.squeeze().unsqueeze(0).cpu(), 24000)

Expected behavior

I expect it to save a WAV file.

Logs

Traceback (most recent call last):

    if elements.device.type == "mps" and not is_torch_greater_or_equal_than_2_4:
AttributeError: 'int' object has no attribute 'device'
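
The traceback above is truncated, so the exact call site is unknown. As a rough illustration only, the following sketch reproduces the same kind of failure in plain Python: a check that reads `elements.device.type` works for a tensor-like object but fails when it is handed a bare integer. `FakeTensor`, `uses_mps` and `describe_failure` are hypothetical names invented for this example; they are not part of TTS, torch, or transformers.

```python
from types import SimpleNamespace


class FakeTensor:
    """Hypothetical stand-in for a tensor: it only carries a device type."""

    def __init__(self, device_type):
        self.device = SimpleNamespace(type=device_type)


def uses_mps(elements):
    # Same attribute access as the line shown in the traceback.
    return elements.device.type == "mps"


def describe_failure(value):
    # Report either the result of the check or the error it raised.
    try:
        return str(uses_mps(value))
    except AttributeError as exc:
        return f"AttributeError: {exc}"


print(describe_failure(FakeTensor("cuda")))  # tensor-like input: check succeeds
print(describe_failure(1))                   # plain int: same error as in the log
```

If this is what happens in the real code path, some caller is passing an integer where a tensor is expected, which the full traceback would confirm.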

Environment

{
    "CUDA": {
        "GPU": [
            "NVIDIA GeForce RTX 3080",
            "NVIDIA GeForce RTX 3080"
        ],
        "available": true,
        "version": "12.4"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.4.1+cu124",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.10.14",
        "version": "#1 SMP Thu Jan 11 04:09:03 UTC 2024"
    }
}

Additional context

AttributeError: 'int' object has no attribute 'device'

eginhard commented 2 months ago

Try using our fork (available via pip install coqui-tts). This repo is not updated anymore, and the streaming code here doesn't work with recent versions of transformers. That's probably the reason, but it's hard to tell because you didn't include the full error log.
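
To make a report like this easier to diagnose, a small helper along the following lines could collect the installed package versions and the complete traceback. This is a minimal sketch using only the standard library; `package_versions` and `run_and_capture` are hypothetical helper names, and the package list is an assumption based on the names mentioned in this thread.

```python
import traceback
from importlib import metadata


def package_versions(names=("TTS", "coqui-tts", "transformers", "torch")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def run_and_capture(fn):
    """Call fn(); return the full traceback text if it raises, else None."""
    try:
        fn()
    except Exception:
        return traceback.format_exc()
    return None


if __name__ == "__main__":
    print(package_versions())
    # Replace the lambda with the failing call, e.g. the streaming loop.
    print(run_and_capture(lambda: (1).device))
```

Pasting both outputs into the issue shows which transformers version is installed and the whole call stack rather than only its last frame.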

stale[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.