wlsdml1114 / diff-svc

Singing Voice Conversion via diffusion model
GNU Affero General Public License v3.0

Error occurs after running the training code #58

Open wownrns opened 1 year ago

wownrns commented 1 year ago

(diff-svc) C:\diff-svc-main\diff-svc-main>python run.py --config training/config_nsf.yaml --exp_name test --reset
| Hparams chains: ['training/config_nsf.yaml']
| Hparams:
K_step: 1000, accumulate_grad_batches: 1, audio_num_mel_bins: 128, audio_sample_rate: 44100, binarization_args: {'shuffle': False, 'with_align': True, 'with_f0': True, 'with_hubert': True, 'with_spk_embed': False, 'with_wav': False}, binarizer_cls: preprocessing.SVCpre.SVCBinarizer, binary_data_dir: data/binary/atri, check_val_every_n_epoch: 10, choose_test_manually: False, clip_grad_norm: 1, config_path: training/config_nsf.yaml, content_cond_steps: [], cwt_add_f0_loss: False, cwt_hidden_size: 128, cwt_layers: 2, cwt_loss: l1, cwt_std_scale: 0.8, datasets: ['opencpop'], debug: False, dec_ffn_kernel_size: 9, dec_layers: 4, decay_steps: 40000, decoder_type: fft, dict_dir: , diff_decoder_type: wavenet, diff_loss_type: l2, dilation_cycle_length: 4, dropout: 0.1, ds_workers: 4, dur_enc_hidden_stride_kernel: ['0,2,3', '0,2,3', '0,1,3'], dur_loss: mse, dur_predictor_kernel: 3, dur_predictor_layers: 5, enc_ffn_kernel_size: 9, enc_layers: 4, encoder_K: 8, encoder_type: fft, endless_ds: False, f0_bin: 256, f0_max: 1100.0, f0_min: 40.0, ffn_act: gelu, ffn_padding: SAME, fft_size: 2048, fmax: 16000, fmin: 40, fs2_ckpt: , gaussian_start: True, gen_dir_name: , gen_tgt_spk_id: -1, hidden_size: 256, hop_size: 512, hubert_gpu: True, hubert_path: checkpoints/hubert/hubert_soft.pt, infer: False, keep_bins: 128, lambda_commit: 0.25, lambda_energy: 0.0, lambda_f0: 1.0, lambda_ph_dur: 0.3, lambda_sent_dur: 1.0, lambda_uv: 1.0, lambda_word_dur: 1.0, load_ckpt: , log_interval: 100, loud_norm: False, lr: 0.0008, max_beta: 0.02, max_epochs: 3000, max_eval_sentences: 1, max_eval_tokens: 60000, max_frames: 42000, max_input_tokens: 60000, max_sentences: 32, max_tokens: 128000, max_updates: 1000000, mel_loss: ssim:0.5|l1:0.5, mel_vmax: 1.5, mel_vmin: -6.0, min_level_db: -120, no_fs2: True, norm_type: gn, num_ckpt_keep: 10, 
num_heads: 2, num_sanity_val_steps: 1, num_spk: 1, num_test_samples: 0, num_valid_plots: 10, optimizer_adam_beta1: 0.9, optimizer_adam_beta2: 0.98, out_wav_norm: False, pe_ckpt: checkpoints/0102_xiaoma_pe/model_ckpt_steps_60000.ckpt, pe_enable: False, perform_enhance: True, pitch_ar: False, pitch_enc_hidden_stride_kernel: ['0,2,5', '0,2,5', '0,2,5'], pitch_extractor: parselmouth, pitch_loss: l2, pitch_norm: log, pitch_type: frame, pndm_speedup: 10, pre_align_args: {'allow_no_txt': False, 'denoise': False, 'forced_align': 'mfa', 'txt_processor': 'zh_g2pM', 'use_sox': True, 'use_tone': False}, pre_align_cls: data_gen.singing.pre_align.SingingPreAlign, predictor_dropout: 0.5, predictor_grad: 0.1, predictor_hidden: -1, predictor_kernel: 5, predictor_layers: 5, prenet_dropout: 0.5, prenet_hidden_size: 256, pretrain_fs_ckpt: , processed_data_dir: xxx, profile_infer: False, raw_data_dir: data/test, ref_norm_layer: bn, rel_pos: True, reset_phone_dict: True, residual_channels: 384, residual_layers: 20, save_best: False, save_ckpt: True, save_codes: ['configs', 'modules', 'src', 'utils'], save_f0: True, save_gt: False, schedule_type: linear, seed: 1234, sort_by_len: True, speaker_id: test, spec_max: [0.29894399642944336, 0.009016108699142933, -0.09283751994371414, 0.18668203055858612, 0.3821677267551422, 0.6378556489944458, 0.9323092103004456, 0.9430548548698425, 0.8731685280799866, 0.8604497909545898, 0.9244481921195984, 0.754655659198761, 0.90962815284729, 0.8096907138824463, 0.7326040267944336, 0.9681972861289978, 1.1001042127609253, 1.0115244388580322, 0.9521523118019104, 1.0759587287902832, 1.0344291925430298, 0.924071192741394, 1.1161046028137207, 0.9410622715950012, 0.8933808207511902, 1.0794308185577393, 1.0275195837020874, 0.8603822588920593, 1.087681770324707, 0.9405238628387451, 1.0123140811920166, 1.0991350412368774, 1.0025107860565186, 0.9227388501167297, 1.0448148250579834, 1.0738403797149658, 0.9112507104873657, 0.7928240895271301, 0.8650215864181519, 
0.9054545164108276, 0.9477485418319702, 0.8715988993644714, 0.8439128994941711, 0.8672019243240356, 0.943199634552002, 0.9969291090965271, 1.018958568572998, 1.0282155275344849, 0.783663272857666, 0.8329223990440369, 0.9797446727752686, 0.9026461243629456, 0.9607696533203125, 0.93161940574646, 0.8593463897705078, 0.8714385032653809, 0.8949925303459167, 0.7425702214241028, 0.6792765259742737, 0.7918699979782104, 0.736993670463562, 0.8197360634803772, 0.7117428183555603, 0.6571913957595825, 0.663261353969574, 0.641767680644989, 0.6014213562011719, 0.4743628203868866, 0.5256550312042236, 0.6463994979858398, 0.5646763443946838, 0.44836029410362244, 0.47972768545150757, 0.3843696117401123, 0.325629323720932, 0.3660951852798462, 0.27881261706352234, 0.38427066802978516, 0.2954651415348053, 0.18855006992816925, 0.2017236351966858, 0.16352631151676178, 0.26764190196990967, 0.3713037967681885, 0.2891384959220886, 0.17770640552043915, 0.04930056259036064, 0.150832861661911, 0.1590312421321869, -0.011039091274142265, -0.06748926639556885, -0.1539364755153656, -0.14397849142551422, -0.2033642828464508, -0.25857800245285034, -0.21972408890724182, -0.09430843591690063, -0.061663251370191574, -0.20216906070709229, -0.1817052960395813, -0.12838633358478546, -0.2429756224155426, -0.30017372965812683, -0.30349302291870117, -0.2779577672481537, -0.3013155162334442, -0.27614471316337585, -0.3125353455543518, -0.2821618318557739, -0.4400882422924042, -0.19444836676120758, -0.15245670080184937, -0.05549808591604233, -0.25638583302497864, -0.27548524737358093, -0.3761153519153595, -0.41992488503456116, -0.45843884348869324, -0.5251815319061279, -0.49916592240333557, -0.4737579822540283, -0.44021207094192505, -0.4433860182762146, -0.4223054349422455, -0.43037497997283936, -0.4300977289676666, -0.5727364420890808, -0.684354305267334], spec_min: [-4.133910655975342, -4.355753421783447, -4.049771785736084, -3.9115467071533203, -4.379791259765625, -4.146684169769287, -3.918229103088379, 
-4.094448566436768, -4.103419303894043, -3.9473819732666016, -4.422009468078613, -4.31467342376709, -4.269201755523682, -4.0959601402282715, -4.310314178466797, -4.2953033447265625, -4.520478248596191, -4.250340938568115, -4.217737674713135, -4.457752227783203, -4.222395896911621, -4.21950101852417, -4.040409088134766, -4.18532657623291, -4.377020835876465, -4.3079915046691895, -4.29104471206665, -4.407296657562256, -4.165337085723877, -4.101104259490967, -4.361326217651367, -4.378291130065918, -4.187614917755127, -4.435909271240234, -4.243113994598389, -4.409623622894287, -4.2475972175598145, -4.027139186859131, -4.430020809173584, -4.253938674926758, -4.44111442565918, -4.154654502868652, -4.2105817794799805, -4.254512786865234, -4.2504191398620605, -4.414350509643555, -4.307179927825928, -4.239838123321533, -4.425566673278809, -4.247255325317383, -4.141347885131836, -4.263878345489502, -4.157517433166504, -3.9599316120147705, -4.2462849617004395, -4.296030044555664, -4.233526706695557, -4.198806285858154, -4.186741828918457, -4.304462432861328, -4.3174896240234375, -4.588046073913574, -4.526463985443115, -4.4504594802856445, -4.290297031402588, -4.627730846405029, -4.656083106994629, -4.459161281585693, -4.7355570793151855, -4.807618141174316, -4.630912780761719, -4.247494697570801, -4.727241516113281, -4.541767120361328, -4.318716049194336, -4.192019462585449, -4.268681526184082, -4.077663421630859, -4.189309120178223, -4.224360942840576, -4.092624187469482, -4.096431732177734, -4.2374043464660645, -4.140985012054443, -4.137366771697998, -4.237246036529541, -4.189384937286377, -4.245174884796143, -4.837279796600342, -4.458308219909668, -4.835829734802246, -4.858468055725098, -4.999994277954102, -4.9773454666137695, -4.846689701080322, -4.999994277954102, -4.976007461547852, -4.999994277954102, -4.857632637023926, -4.758609294891357, -4.999994277954102, -4.596436500549316, -4.783600807189941, -4.448403358459473, -4.230667591094971, -4.5516581535339355, 
-4.4043779373168945, -4.5930562019348145, -4.633569717407227, -4.776383876800537, -4.661365509033203, -4.633496284484863, -4.707913875579834, -4.318541526794434, -4.529041290283203, -4.461437702178955, -4.459624767303467, -4.7762675285339355, -4.5529866218566895, -4.656703948974609, -4.281631946563721, -4.303481101989746, -4.373120307922363, -4.48301887512207, -4.362737655639648, -4.56112003326416, -4.3292155265808105, -4.350776195526123], spk_cond_steps: [], stop_token_weight: 5.0, task_cls: training.task.SVC_task.SVCTask, test_ids: [], test_input_dir: , test_num: 0, test_prefixes: ['test'], test_set_name: test, timesteps: 1000, train_set_name: train, use_amp: True, use_crepe: True, use_denoise: False, use_energy_embed: False, use_gt_dur: False, use_gt_f0: False, use_midi: False, use_nsf: True, use_pitch_embed: True, use_pos_embed: True, use_spk_embed: False, use_spk_id: False, use_split_spk_id: False, use_uv: False, use_var_enc: False, use_vec: False, val_check_interval: 2000, valid_num: 0, valid_set_name: valid, validate: False, vocoder: network.vocoders.nsf_hifigan.NsfHifiGAN, vocoder_ckpt: checkpoints/nsf_hifigan/model, warmup_updates: 2000, wav2spec_eps: 1e-6, weight_decay: 0, win_size: 2048, work_dir: checkpoints/test,
| Mel losses: {'ssim': 0.5, 'l1': 0.5}
| Load HifiGAN: checkpoints/nsf_hifigan/model
Traceback (most recent call last):
  File "C:\diff-svc-main\diff-svc-main\run.py", line 15, in <module>
    run_task()
  File "C:\diff-svc-main\diff-svc-main\run.py", line 11, in run_task
    task_cls.start()
  File "C:\diff-svc-main\diff-svc-main\training\task\base_task.py", line 197, in start
    task = cls()
  File "C:\diff-svc-main\diff-svc-main\training\task\SVC_task.py", line 36, in __init__
    self.vocoder: BaseVocoder = get_vocoder_cls(hparams)()
  File "C:\diff-svc-main\diff-svc-main\network\vocoders\nsf_hifigan.py", line 17, in __init__
    self.model, self.h = load_model(model_path, device=self.device)
  File "C:\diff-svc-main\diff-svc-main\modules\nsf_hifigan\models.py", line 25, in load_model
    cp_dict = torch.load(model_path)
  File "C:\Users\ksjs7\anaconda3\envs\diff-svc\lib\site-packages\torch\serialization.py", line 771, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "C:\Users\ksjs7\anaconda3\envs\diff-svc\lib\site-packages\torch\serialization.py", line 270, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "C:\Users\ksjs7\anaconda3\envs\diff-svc\lib\site-packages\torch\serialization.py", line 251, in __init__
    super(_open_file, self).__init__(open(name, mode))
PermissionError: [Errno 13] Permission denied: 'checkpoints/nsf_hifigan/model'

**The error above occurred when I ran the command listed for GPUs with 6 GB of memory or more. I have already set up the hubert and nsf_hifigan folders under C:\diff-svc-main\diff-svc-main\checkpoints.

Regarding the PermissionError: [Errno 13] Permission denied: 'checkpoints/nsf_hifigan/model' part of the error above: I placed the model file from the nsf_hifigan archive inside the checkpoints/nsf_hifigan/model path.

I really can't figure out what the problem is or how to fix it, so I'm posting here.**
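
One possible cause, not confirmed in this thread: the post says the model file was placed inside the checkpoints/nsf_hifigan/model path, which suggests `model` may be a folder rather than the checkpoint file. On Windows, calling Python's `open()` on a directory raises PermissionError: [Errno 13], the same error that `torch.load` reports in the traceback. A minimal sketch for checking what the path actually is; the helper name `describe_checkpoint_path` is hypothetical and not part of diff-svc:

```python
import os


def describe_checkpoint_path(path):
    """Classify a path as 'missing', 'directory', 'unreadable file' or 'readable file'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        # open() on a directory fails; on Windows it surfaces as PermissionError.
        return "directory"
    try:
        # Try to read one byte, the same kind of access torch.load needs.
        with open(path, "rb") as f:
            f.read(1)
    except PermissionError:
        return "unreadable file"
    return "readable file"


if __name__ == "__main__":
    # Path taken from the vocoder_ckpt value in the log above;
    # run this from the repository root so the relative path resolves.
    print(describe_checkpoint_path("checkpoints/nsf_hifigan/model"))
```

If this reports `directory`, the `vocoder_ckpt` setting would presumably need to point at the checkpoint file itself, or the file would need to be moved up one level so that `checkpoints/nsf_hifigan/model` is the file. If it reports `unreadable file`, the problem is more likely the file's permissions.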

wownrns commented 1 year ago

After running cmd as administrator:

Attempt 1 (error): cd C:\diff-svc-main\diff-svc-main > python run.py --config training/config_nsf.yaml --exp_name test --reset > SystemError: initialization of _internal failed without raising an exception

Before this attempt I was getting ModuleNotFoundError: No module named 'librosa' and ModuleNotFoundError: No module named 'pycwt'. After installing those two modules, I got SystemError: initialization of _internal failed without raising an exception.

Attempt 2 (same error as in the original post): cd C:\diff-svc-main\diff-svc-main > conda activate diff-svc > python run.py --config training/config_nsf.yaml --exp_name test --reset > PermissionError: [Errno 13] Permission denied: 'checkpoints/nsf_hifigan/model'

Is there anything else I can try?
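
The first attempt above ran without `conda activate diff-svc`, so it may have used a different Python environment from the second attempt. That would be consistent with librosa and pycwt appearing to be missing. The SystemError about `_internal` is also commonly reported when the installed numba and numpy versions are incompatible, though nothing in this thread confirms that here. A minimal sketch for checking which interpreter is active and which package versions it sees; the package list is an assumption based on the modules named in this thread, and `installed_versions` is a hypothetical helper:

```python
import sys
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The interpreter path shows whether the conda env or the base install is in use.
    print("interpreter:", sys.executable)
    packages = ["numpy", "numba", "librosa", "pycwt", "torch"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this once before and once after `conda activate diff-svc` shows whether the two attempts were using the same set of packages.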

ParkGiBum commented 1 year ago

https://answers.microsoft.com/ko-kr/windows/forum/all/%ED%8F%B4%EB%8D%94-%EB%B0%8F-%ED%8C%8C%EC%9D%BC/4baabedd-0b29-48e1-84c7-02ff8906efd3

Could you try changing the file's permissions using the method described at that link?