Step 1: /v1/speaker/create
{
"name":"Mabtoic",
"gender":"female",
"describe":"Mabtoic-女",
"seed":2322536682
}
This request succeeds and the speaker is created.
Step 2: /v1/audio/speech
{
"voice":"Mabtoic",
"input":"ChatTTS-Forge是一个伟大的项目"
}
This request fails with the following error:
Exception in thread Thread-7 (generate):
Traceback (most recent call last):
File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/root/ChatTTS-Forge/modules/core/pipeline/generate/BatchGenerate.py", line 47, in generate
self.generate_batch(batch)
File "/root/ChatTTS-Forge/modules/core/pipeline/generate/BatchGenerate.py", line 59, in generate_batch
results = model.generate_batch(segments=segments, context=self.context)
File "/root/ChatTTS-Forge/modules/core/models/tts/ChatTtsModel.py", line 44, in generate_batch
return self.generate_batch_base(segments, context, stream=False)
File "/root/ChatTTS-Forge/modules/core/models/tts/ChatTtsModel.py", line 166, in generate_batch_base
results = infer.generate_audio(
File "/root/ChatTTS-Forge/modules/core/models/zoo/ChatTTSInfer.py", line 320, in generate_audio
data = self._generate_audio(
File "/root/ChatTTS-Forge/modules/core/models/zoo/ChatTTSInfer.py", line 297, in _generate_audio
return self.infer(
File "/root/ChatTTS-Forge/modules/core/models/zoo/ChatTTSInfer.py", line 88, in infer
return next(res_gen)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 36, in generator_context
response = gen.send(None)
File "/root/ChatTTS-Forge/modules/core/models/zoo/ChatTTSInfer.py", line 136, in _infer
for result in self.instance._infer_code(
File "/root/ChatTTS-Forge/modules/ChatTTS/ChatTTS/core.py", line 624, in _infer_code
self._apply_spk_emb(emb, params.spk_emb, input_ids, len(text))
File "/root/ChatTTS-Forge/modules/ChatTTS/ChatTTS/core.py", line 564, in _apply_spk_emb
.expand(emb.shape)
RuntimeError: The expanded size of the tensor (768) must match the existing size (384) at non-singleton dimension 2. Target sizes: [1, 10, 768]. Tensor sizes: [1, 1, 384]
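The final RuntimeError follows from the shape rule that `torch.Tensor.expand` enforces: a dimension can only be expanded if its current size is 1, otherwise it must already equal the target size. The sketch below re-implements only that shape check in plain Python and applies it to the shapes from the traceback. The helper name `expand_error` is hypothetical and is not part of PyTorch or ChatTTS-Forge. The result suggests that the tensor derived from `params.spk_emb` has width 384 while `emb` has width 768; the traceback alone does not show why the two differ.

```python
from typing import Optional, Sequence


def expand_error(src: Sequence[int], target: Sequence[int]) -> Optional[str]:
    """Return None if `src` can be expanded to `target`, else a short reason.

    Mirrors the shape rule of torch.Tensor.expand: shapes are aligned from the
    last dimension, and each source dimension must be 1 or equal the target.
    """
    if len(src) > len(target):
        return "target has fewer dimensions than source"
    offset = len(target) - len(src)  # leading target dims are new dimensions
    for i, size in enumerate(src):
        want = target[offset + i]
        if size != 1 and size != want:
            return f"dimension {offset + i}: existing size {size} != target size {want}"
    return None


# Shapes taken from the traceback above.
spk_shape = (1, 1, 384)   # tensor built from params.spk_emb
emb_shape = (1, 10, 768)  # emb.shape passed to .expand(...)

print(expand_error(spk_shape, emb_shape))    # dimension 2: existing size 384 != target size 768
print(expand_error((1, 1, 768), emb_shape))  # None
```

Dimension 1 (size 1 to 10) expands without problems; only the last dimension (384 vs. 768) violates the rule, which matches "non-singleton dimension 2" in the error message.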
Checklist
Forge commit or tag
master
Python version
3.10
PyTorch version
2.3.3
Operating system
ubuntu 12.4
Bug description
A speaker was created with /v1/speaker/create; calling /v1/audio/speech with that speaker then raises an error.
Bug endpoint
/v1/audio/speech
Reproduction parameters
Same as the two requests and the traceback listed at the top of this report.
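For convenience, the two requests can be replayed with a short script. This is only a sketch using the Python standard library: the request bodies are copied from this report, while the use of POST with a JSON body, the helper names `build_request` and `reproduce`, and whatever base URL is passed in are assumptions that may need adjusting for a given deployment.

```python
import json
import urllib.error
import urllib.request

# The two request bodies, copied from this report, in the order they were sent.
STEPS = [
    ("/v1/speaker/create", {
        "name": "Mabtoic",
        "gender": "female",
        "describe": "Mabtoic-女",
        "seed": 2322536682,
    }),
    ("/v1/audio/speech", {
        "voice": "Mabtoic",
        "input": "ChatTTS-Forge是一个伟大的项目",
    }),
]


def build_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one endpoint (nothing is sent here)."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: both endpoints accept POST with a JSON body
    )


def reproduce(base_url: str) -> None:
    """Send both requests in order and print the HTTP status of each."""
    for path, payload in STEPS:
        req = build_request(base_url, path, payload)
        try:
            with urllib.request.urlopen(req) as resp:
                print(path, resp.status, len(resp.read()), "bytes")
        except urllib.error.HTTPError as err:
            # The server may answer with an error status when generation fails.
            print(path, "failed with HTTP", err.code)
```

Calling `reproduce(...)` with the address of the running API server sends step 1 and then step 2; the traceback itself appears in the server log, not in the HTTP response.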
Expected result
Speech audio is generated.
Actual result
No audio is generated; the request fails with the RuntimeError shown in the traceback above.
Error message
No response