openvpi / DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Apache License 2.0

AttributeError on ReFlow #181

Closed colstone closed 7 months ago

colstone commented 7 months ago

Python version: 3.10.13

torch version: 2.2.2+cu121 (now) / 1.13.1+cu117 (before)

Traceback:

Traceback (most recent call last):
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/scripts/train.py", line 31, in <module>
    run_task()
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/scripts/train.py", line 27, in run_task
    task_cls.start()
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/basics/base_task.py", line 467, in start
    trainer.fit(task, ckpt_path=get_latest_checkpoint_path(work_dir))
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 520, in fit
    call._call_and_handle_interrupt(
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 559, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 935, in _run
    results = self._run_stage()
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 976, in _run_stage
    self._run_sanity_check()
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py", line 1005, in _run_sanity_check
    val_loop.run()
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/loops/utilities.py", line 177, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 115, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 375, in _evaluation_step
    output = call._call_strategy_hook(trainer, hook_name, *step_kwargs.values())
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 288, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 378, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/basics/base_task.py", line 272, in validation_step
    losses, weight = self._validation_step(sample, batch_idx)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/training/acoustic_task.py", line 158, in _validation_step
    mel_out: ShallowDiffusionOutput = self.run_model(sample, infer=True)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/training/acoustic_task.py", line 122, in run_model
    output: ShallowDiffusionOutput = self.model(
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/modules/toplevel.py", line 105, in forward
    mel_pred = self.diffusion(condition, src_spec=src_mel, infer=True)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/modules/diffusion/RectifiedFlow.py", line 91, in forward
    x = self.inference(cond, b=b, x_start=spec, device=device)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/modules/diffusion/RectifiedFlow.py", line 222, in inference
    x, _ = algorithm_fn(x,t_start+ i*dt, dt, cond, model_fn=self.denoise_fn)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/modules/diffusion/RectifiedFlow.py", line 102, in sample_rk4
    k_1 = model_fn(x, self.timesteps * t, cond)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/modules/diffusion/wavenet.py", line 103, in forward
    diffusion_step = self.diffusion_embedding(diffusion_step)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/linjl/anaconda3/envs/diffv2-2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/linjl/diffsinger-Reflow/DiffSinger-RectifiedFlow/modules/diffusion/wavenet.py", line 23, in forward
    device = x.device
AttributeError: 'float' object has no attribute 'device'

Acoustic config:

base_config: configs/acoustic.yaml

raw_data_dir:
  - /home/linjl/DiffSinger/data/A/raw
  - /home/linjl/DiffSinger/data/B/raw
  - /home/linjl/DiffSinger/data/C/raw
  - /home/linjl/DiffSinger/data/D/raw
  - /home/linjl/DiffSinger/data/E/raw
  - /home/linjl/DiffSinger/data/F/raw
speakers:
  - A
  - B
  - C
  - D
  - E
  - F
spk_ids: []
test_prefixes:
  - 0:dayu_1_Track_seg000
  - 0:dayu_1_Track_seg042
  - 0:shanheling_1_shanheling1_seg021
  - 0:shanheling_2_shanheling2_seg017
  - 1:bahuiyipinhaogeini_1
  - 1:bulaomeng_1
  - 1:changanguniang_15
  - 1:chiling_10
  - 2:blue_10
  - 2:dayu_15
  - 2:ruobani_0
  - 2:yunzhixia_6
  - 3:002_04
  - 3:004_01
  - 3:005_05
  - 3:012_03
  - 4:bainiaoguohetanMixdown1_1_21
  - 4:baiwuyouMixdown1_1_10
  - 4:jieyueMixdown1_1_9
  - 4:bulaomengzhuyin_1_18
  - 5:huibuzoudeyinghuo_13
  - 5:nimingdehaoyou_16
  - 5:qidaiai_22
  - 5:ringring_20
dictionary: dictionaries/opencpop-extension.txt
binary_data_dir: data/multi_langs_Chinese_Only_ReFlow/binary
binarization_args:
  num_workers: 0

use_spk_id: true
num_spk: 6
#use_energy_embed: false
#use_breathiness_embed: true
use_key_shift_embed: true
use_speed_embed: true
#use_voicing_embed: true
#use_tension_embed: true

augmentation_args:
  random_pitch_shifting:
    enabled: true
    range: [-5., 5.]
    scale: 0.5
  random_time_stretching:
    enabled: true
    range: [0.5, 2.]
    domain: log  # or linear
    scale: 0.5

residual_channels: 512
residual_layers: 20

optimizer_args:
  lr: 0.0008
lr_scheduler_args:
  scheduler_cls: torch.optim.lr_scheduler.StepLR
  step_size: 50000
  gamma: 0.5
max_batch_frames: 48000
max_batch_size: 48
max_updates: 500000

num_valid_plots: 40
val_with_vocoder: true
val_check_interval: 2000
num_ckpt_keep: 5
permanent_ckpt_start: 200000
permanent_ckpt_interval: 40000
pl_trainer_devices: 'auto'
pl_trainer_precision: '16-mixed'
nccl_p2p: true

diff_loss_type: l2_rf_norm
diffusion_type: 'RectifiedFlow'
diff_accelerator: 'rk4'
diff_speedup: 100
timestep_type: 'continuous'
vocoder_ckpt: /home/linjl/DiffSinger/checkpoints/nsf_hifigan/model

yqzhishen commented 7 months ago

Should be fixed in d786ed1. Thanks for reporting.
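
For readers who hit the same error: the last frames of the traceback show `sample_rk4` calling `model_fn(x, self.timesteps * t, cond)`, and the step-embedding module then reading `x.device`. If `t` is a plain Python number on that code path (which the error message suggests), the product is a `float` and has no `.device` attribute. The sketch below reproduces that failure mode in isolation and shows one possible way around it. It is not the change made in d786ed1; `SinusoidalPosEmb` here is a simplified stand-in written for illustration, `as_step_tensor` is a hypothetical helper, and `timesteps = 1000` is an assumed value.

```python
import math

import torch
import torch.nn as nn


class SinusoidalPosEmb(nn.Module):
    """Simplified stand-in for a step-embedding module (illustrative only)."""

    def __init__(self, dim):
        super().__init__()
        self.dim = dim

    def forward(self, x):
        device = x.device  # fails if x is a Python float instead of a tensor
        half_dim = self.dim // 2
        scale = math.log(10000) / (half_dim - 1)
        freqs = torch.exp(torch.arange(half_dim, device=device) * -scale)
        emb = x[:, None] * freqs[None, :]
        return torch.cat((emb.sin(), emb.cos()), dim=-1)


def as_step_tensor(t, batch_size, device):
    """Hypothetical helper: broadcast a scalar step to a (batch_size,) tensor."""
    return torch.full((batch_size,), float(t), dtype=torch.float32, device=device)


emb = SinusoidalPosEmb(8)
timesteps = 1000  # assumed value, for illustration
t = 0.25          # scalar time, as a solver loop might produce it

# Passing the bare float reproduces the error from the traceback.
try:
    emb(timesteps * t)
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Wrapping the scalar in a tensor first avoids it.
step = as_step_tensor(timesteps * t, batch_size=2, device=torch.device("cpu"))
fixed = emb(step)
print(tuple(fixed.shape))
```

Converting the scalar once, before the solver loop calls the network, keeps the embedding module unchanged and makes the device explicit, which also matters when training on GPU.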