sizhelee / Diff-BGM

official code for CVPR'24 paper Diff-BGM

About Code Integrity? #4

Closed zhanghongyong123456 closed 1 week ago

zhanghongyong123456 commented 3 months ago
  1. Data: data/musicalion_solo_piano_4_bin_pnt

  2. Pretrained model: pretrained/a2s/a2s-stage3a.pt

  3. Pretrained model: pretrained/pnotree_20/train_20-last-model.pt
     I can't find any of the three paths above.

  4. When I run training with python diffbgm/main.py --model ldm_chd8bar --output_dir ./result/symmv, I get the following error:

     (diff-bgm) D:\2023project\project\0win_os\13Audio\Diff-BGM>python diffbgm/main.py --model ldm_chd8bar --output_dir ./result/symmv
     load train valid set with: {'use_track': [0, 1, 2]}
     D:\2023project\project\0win_os\13Audio\Diff-BGM\diffbgm\data\dataset.py:99: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
       self.visual = torch.tensor(self.visual[idx_ls, :])
     Dataloader ready: batch_size=32, num_workers=4, pin_memory=True, {'use_track': [0, 1, 2]}
     Total parameters: 41476098
     Creating new log folder as ./result/symmv/07-24_141153
     {
       "attention_levels": [2, 3],
       "batch_size": 32,
       "channel_multipliers": [1, 2, 4, 4],
       "channels": 64,
       "chd_hidden_dim": 512,
       "chd_input_dim": 36,
       "chd_n_step": 32,
       "chd_z_dim": 512,
       "chd_z_input_dim": 512,
       "cond_type": "visual",
       "d_cond": 512,
       "fp16": true,
       "img_h": 128,
       "img_w": 128,
       "in_channels": 2,
       "latent_scaling_factor": 0.18215,
       "learning_rate": 5e-05,
       "linear_end": 0.012,
       "linear_start": 0.00085,
       "max_epoch": 100,
       "max_grad_norm": 10,
       "n_heads": 4,
       "n_res_blocks": 2,
       "n_steps": 1000,
       "num_workers": 4,
       "out_channels": 2,
       "pin_memory": true,
       "tf_layers": 1,
       "use_enc": true
     }
     Epoch 0: 0%| | 0/655 [00:11<?, ?it/s]
     Traceback (most recent call last):
       File "D:\2023project\project\0win_os\13Audio\Diff-BGM\diffbgm\main.py", line 71, in <module>
         config.train()
       File "D:\2023project\project\0win_os\13Audio\Diff-BGM\diffbgm\train\__init__.py", line 49, in train
         learner.train(max_epoch=self.params.max_epoch)
       File "D:\2023project\project\0win_os\13Audio\Diff-BGM\diffbgm\learner.py", line 133, in train
         for _step, batch in enumerate(
       File "H:\Anaconda3\envs\diff-bgm\lib\site-packages\tqdm\std.py", line 1195, in __iter__
         for obj in iterable:
       File "H:\Anaconda3\envs\diff-bgm\lib\site-packages\torch\utils\data\dataloader.py", line 441, in __iter__
         return self._get_iterator()
       File "H:\Anaconda3\envs\diff-bgm\lib\site-packages\torch\utils\data\dataloader.py", line 388, in _get_iterator
         return _MultiProcessingDataLoaderIter(self)
       File "H:\Anaconda3\envs\diff-bgm\lib\site-packages\torch\utils\data\dataloader.py", line 1042, in __init__
         w.start()
       File "H:\Anaconda3\envs\diff-bgm\lib\multiprocessing\process.py", line 121, in start
         self._popen = self._Popen(self)
       File "H:\Anaconda3\envs\diff-bgm\lib\multiprocessing\context.py", line 224, in _Popen
         return _default_context.get_context().Process._Popen(process_obj)
       File "H:\Anaconda3\envs\diff-bgm\lib\multiprocessing\context.py", line 327, in _Popen
         return Popen(process_obj)
       File "H:\Anaconda3\envs\diff-bgm\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
         reduction.dump(process_obj, to_child)
       File "H:\Anaconda3\envs\diff-bgm\lib\multiprocessing\reduction.py", line 60, in dump
         ForkingPickler(file, protocol).dump(obj)
     AttributeError: Can't pickle local object 'get_train_val_dataloaders.<locals>.<lambda>'
     Traceback (most recent call last):
       File "<string>", line 1, in <module>
       File "H:\Anaconda3\envs\diff-bgm\lib\multiprocessing\spawn.py", line 116, in spawn_main
         exitcode = _main(fd, parent_sentinel)
       File "H:\Anaconda3\envs\diff-bgm\lib\multiprocessing\spawn.py", line 126, in _main
         self = reduction.pickle.load(from_parent)
     EOFError: Ran out of input

  5. When I run inference using the three models, I get this error:

     D:\2023project\project\0win_os\13Audio\Diff-BGM>python diffbgm/inference_sdf.py --model_dir=checkpoints/Diff-BGM_weights_best.pt --uncond_scale=5.
     Traceback (most recent call last):
       File "D:\2023project\project\0win_os\13Audio\Diff-BGM\diffbgm\inference_sdf.py", line 53, in <module>
         from models.visual_encoder import VisualEncoder
     ModuleNotFoundError: No module named 'models.visual_encoder'

Please point out any mistakes on my side, or any code or data that is missing. Thank you.

sizhelee commented 1 week ago
  1. You can find the data for the original POP909 dataset and the pretrained models in the Polyffusion repository.
  2. For the ModuleNotFoundError, we have updated the 'inference_sdf.py' file. Please refer to the latest version.
  3. For the dataloader error, we re-split the dataset for our paper (the split is not the same as the original POP909 one). The new split file has been updated.
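
A note for anyone who hits the same training traceback on Windows: the "Can't pickle local object" AttributeError is raised while multiprocessing pickles objects for a spawned DataLoader worker (the traceback goes through popen_spawn_win32.py and ForkingPickler). A function or lambda defined inside another function cannot be pickled. The snippet below is only a minimal sketch of that behaviour and of one common workaround, functools.partial over a module-level function. The names collate_batch, make_collate_with_lambda, make_collate_with_partial and the shift argument are hypothetical and are not taken from the Diff-BGM code, so how the real get_train_val_dataloaders builds its loaders is an assumption here.

```python
import pickle
from functools import partial


def collate_batch(batch, shift=False):
    """Hypothetical module-level collate function (stand-in for the real one)."""
    return [(item, shift) for item in batch]


def make_collate_with_lambda(shift):
    # A lambda created inside a function is a "local object": pickle cannot
    # look it up by its qualified name, so it cannot be sent to a spawned worker.
    return lambda batch: collate_batch(batch, shift)


def make_collate_with_partial(shift):
    # functools.partial wrapping a module-level function can be pickled,
    # because pickle stores a reference to collate_batch plus the bound arguments.
    return partial(collate_batch, shift=shift)


def is_picklable(obj):
    """Return True if pickle can serialise obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


if __name__ == "__main__":
    print("lambda picklable: ", is_picklable(make_collate_with_lambda(True)))
    print("partial picklable:", is_picklable(make_collate_with_partial(True)))
```

If editing the loader code is not an option, creating the DataLoader with num_workers=0 (the log above shows num_workers=4) should also avoid the error, since no worker processes are started and nothing has to be pickled, at the cost of slower data loading.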