SayanoAI / Comfy-RVC

ComfyUI custom nodes for RVC related inference and image generation
MIT License

Failed to Train Model #11

Open nux1111 opened 2 weeks ago

nux1111 commented 2 weeks ago

  File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\Comfy-RVC\custom_nodes\rvc_nodes.py", line 482, in train_model
    assert os.path.isfile(model_path), f"Failed to train model {model_path}..."
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Failed to train model D:\ComfyUI_windows_portable\ComfyUI\models\RVC\Encyclopedia_40k.pth...

SayanoAI commented 2 weeks ago

What does the line above that say? The assertion error occurs when one of the previous steps has failed.
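
An assertion that fires only because an earlier step died quietly can be avoided by checking the exit code of each step as it finishes. The following is a minimal sketch of that idea, assuming each step runs as a child process; `run_step` and the inline commands are hypothetical and are not taken from Comfy-RVC.

```python
import subprocess
import sys


def run_step(args):
    """Run one pipeline step as a child process and fail loudly on error."""
    result = subprocess.run(args, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the child's own stderr instead of a later, unrelated assertion.
        raise RuntimeError(
            f"step exited with code {result.returncode}:\n{result.stderr.strip()}"
        )
    return result.stdout


# A step that succeeds returns its captured output.
ok_output = run_step([sys.executable, "-c", "print('done')"])

# A step that crashes raises immediately, with the real cause attached.
try:
    run_step([sys.executable, "-c", "raise ValueError('boom')"])
    failure = None
except RuntimeError as exc:
    failure = str(exc)
```

With a check like this, the message a user sees is the child's own traceback, so there is no need to scroll back through the log to find the step that actually failed.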

nux1111 commented 1 week ago

@SayanoAI this is most of the log: 2024-09-10 01:45:15.740247 before remix: shape=(48000,), max=0.949999988079071, min=-0.41162481904029846, mean=5.2129737014183775e-05 sr=16000 2024-09-10 01:45:15.741252 after remix: shape=(48000,), max=0.949999988079071, min=-0.41162481904029846, mean=5.2129737014183775e-05, sr=16000 2024-09-10 01:45:15.741252 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\25_0.wav' audio.ndim=1 audio.max()=0.95 audio.min()=-0.41162482 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.744927 before remix: shape=(29888,), max=0.6256073713302612, min=-0.4884093105792999, mean=7.481988723156974e-05 sr=16000 2024-09-10 01:45:15.745459 after remix: shape=(29888,), max=0.6256073713302612, min=-0.4884093105792999, mean=7.481988723156974e-05, sr=16000 2024-09-10 01:45:15.745983 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\24_3.wav' audio.ndim=1 audio.max()=0.6256074 audio.min()=-0.4884093 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.765909 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:15.790671 before remix: shape=(48000,), max=0.6823728084564209, min=-0.4320821762084961, mean=-2.433846384519711e-05 sr=16000 2024-09-10 01:45:15.791175 after remix: shape=(48000,), max=0.6823728084564209, min=-0.4320821762084961, mean=-2.433846384519711e-05, sr=16000 2024-09-10 01:45:15.791703 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\23_0.wav' audio.ndim=1 audio.max()=0.6823728 audio.min()=-0.43208218 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.794322 before remix: shape=(46891,), max=0.5688475370407104, min=-0.31961020827293396, mean=8.274466381408274e-06 sr=16000 2024-09-10 01:45:15.794844 after remix: shape=(46891,), max=0.5688475370407104, min=-0.31961020827293396, mean=8.274466381408274e-06, sr=16000 2024-09-10 
01:45:15.795372 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\21_3.wav' audio.ndim=1 audio.max()=0.56884754 audio.min()=-0.3196102 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.820398 get_f0 rmvpe+ unused params: {}2024-09-10 01:45:15.822605 get_f0 rmvpe+ unused params: {}

2024-09-10 01:45:15.849219 before remix: shape=(48000,), max=0.780711829662323, min=-0.48020273447036743, mean=-5.822262755827978e-05 sr=16000 2024-09-10 01:45:15.850217 after remix: shape=(48000,), max=0.780711829662323, min=-0.48020273447036743, mean=-5.822262755827978e-05, sr=16000 2024-09-10 01:45:15.850217 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\25_1.wav' audio.ndim=1 audio.max()=0.7807118 audio.min()=-0.48020273 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.861821 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:15.863390 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:15.876667 before remix: shape=(48000,), max=0.6691827774047852, min=-0.35113775730133057, mean=0.00011518812243593857 sr=16000 2024-09-10 01:45:15.877172 after remix: shape=(48000,), max=0.6691827774047852, min=-0.35113775730133057, mean=0.00011518812243593857, sr=16000 2024-09-10 01:45:15.877710 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\26_1.wav' audio.ndim=1 audio.max()=0.6691828 audio.min()=-0.35113776 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.898557 before remix: shape=(47545,), max=0.7302079200744629, min=-0.3348364233970642, mean=9.757435464052833e-07 sr=16000 2024-09-10 01:45:15.900120 after remix: shape=(47545,), max=0.7302079200744629, min=-0.3348364233970642, mean=9.757435464052833e-07, sr=160002024-09-10 01:45:15.900120 get_f0 rmvpe+ unused params: {}

2024-09-10 01:45:15.900638 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\22_3.wav' audio.ndim=1 audio.max()=0.7302079 audio.min()=-0.33483642 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.929396 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:15.942345 before remix: shape=(48000,), max=0.6909680962562561, min=-0.42908695340156555, mean=7.113168862815655e-07 sr=16000 2024-09-10 01:45:15.944483 after remix: shape=(48000,), max=0.6909680962562561, min=-0.42908695340156555, mean=7.113168862815655e-07, sr=16000 2024-09-10 01:45:15.947061 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\22_1.wav' audio.ndim=1 audio.max()=0.6909681 audio.min()=-0.42908695 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.966268 before remix: shape=(46632,), max=0.6486619710922241, min=-0.27166059613227844, mean=-5.155384133104235e-05 sr=16000 2024-09-10 01:45:15.966788 after remix: shape=(46632,), max=0.6486619710922241, min=-0.27166059613227844, mean=-5.155384133104235e-05, sr=16000 2024-09-10 01:45:15.967839 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\23_3.wav' audio.ndim=1 audio.max()=0.648662 audio.min()=-0.2716606 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:15.970985 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:15.992779 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:16.006462 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:16.008018 before remix: shape=(45410,), max=0.587701678276062, min=-0.33347395062446594, mean=-2.629904520290438e-05 sr=16000 2024-09-10 01:45:16.008018 after remix: shape=(45410,), max=0.587701678276062, min=-0.33347395062446594, mean=-2.629904520290438e-05, sr=16000 2024-09-10 01:45:16.008538 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\25_3.wav' 
audio.ndim=1 audio.max()=0.5877017 audio.min()=-0.33347395 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:16.035608 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:16.076581 before remix: shape=(46464,), max=0.5337653160095215, min=-0.28030532598495483, mean=-8.611386874690652e-05 sr=16000 2024-09-10 01:45:16.076581 after remix: shape=(46464,), max=0.5337653160095215, min=-0.28030532598495483, mean=-8.611386874690652e-05, sr=16000 2024-09-10 01:45:16.077095 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\26_3.wav' audio.ndim=1 audio.max()=0.5337653 audio.min()=-0.28030533 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:16.098785 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:16.138704 before remix: shape=(41344,), max=0.7884654998779297, min=-0.34653040766716003, mean=-9.784330359252635e-06 sr=16000 2024-09-10 01:45:16.138704 after remix: shape=(41344,), max=0.7884654998779297, min=-0.34653040766716003, mean=-9.784330359252635e-06, sr=16000 2024-09-10 01:45:16.139218 loading sound fname='D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\1_16k_wavs\27_1.wav' audio.ndim=1 audio.max()=0.7884655 audio.min()=-0.3465304 audio.dtype=dtype('float32') sr=16000 2024-09-10 01:45:16.156230 get_f0 rmvpe+ unused params: {} 2024-09-10 01:45:16.228934 Successfully extracted features using rmvpe+ 2024-09-10 01:45:16.231495 write filelist done 2024-09-10 01:45:19.448355 successfully downloaded: D:\ComfyUI_windows_portable\ComfyUI\models\pretrained_v2\G48k.pth 2024-09-10 01:45:51.562381 successfully downloaded: D:\ComfyUI_windows_portable\ComfyUI\models\pretrained_v2\f0Ov2Super40kD.pth Loading faiss with AVX2 support. Could not load library with AVX2 support due to: ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'") Loading faiss. Successfully loaded faiss. 
2024-09-10 01:45:51.993591 big_npy.shape=(5663, 768) n_ivf=145 2024-09-10 01:45:51.994113 training index 2024-09-10 01:45:52.162631 adding index 2024-09-10 01:45:52.188331 saved index file to D:\ComfyUI_windows_portable\ComfyUI\models\RVC.index\encylopedia_v2_32k_8edfffef4005fb6e2f456e4869495935.index 2024-09-10 01:45:52.188846 Starting training with model path: D:\ComfyUI_windows_portable\ComfyUI\models\RVC\encylopedia_32k.pth [START] Security scan WARNING: Ignoring invalid distribution ~orch (D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages) [DONE] Security scan Failed to execute startup-script: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Manager\prestartup_script.py / module 'main' has no attribute 'file'

Prestartup times for custom nodes: 0.0 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\rgthree-comfy 0.0 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Easy-Use 4.1 seconds (PRESTARTUP FAILED): D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Manager

Total VRAM 24563 MB, total RAM 32506 MB pytorch version: 2.4.1+cu124 Set vram state to: NORMAL_VRAM Device: cuda:0 NVIDIA GeForce RTX 4090 : native Using pytorch cross attention unknown args: ['--windows-standalone-build', '--listen', '--enable-cors-header', '--disable-auto-launch', '--max-upload-size', '100', '--use-pytorch-cross-attention', '--fast'] Found GPU NVIDIA GeForce RTX 4090 please download ffmpeg-static and export to FFMPEG_PATH. For example: export FFMPEG_PATH=/musetalk/ffmpeg-4.4-amd64-static name='Comfy-RVC.training_cli' {'train': {'log_interval': 200, 'seed': 1234, 'epochs': 20000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 4, 'fp16_run': True, 'lr_decay': 0.999875, 'segment_size': 12800, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45.0, 'c_kl': 1.0, 'num_workers': 1, 'c_fm': 2.0, 'c_mfcc': 0.0, 'c_gp': 0.0, 'c_lfcc': 0.0, 'c_hd': 0.0, 'c_sts': 0.0}, 'data': {'max_wav_value': 32768.0, 'sampling_rate': 32000, 'filter_length': 1024, 'hop_length': 320, 'win_length': 1024, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': None, 'training_files': 'D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568\filelist.txt'}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [10, 8, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [20, 16, 4, 4], 'use_spectral_norm': False, 'gin_channels': 256, 'spk_embed_dim': 109}, 'experiment_dir': 'D:\ComfyUI_windows_portable\ComfyUI\output\dataset\5a85cabf537b7c5b85003cb6c2e6d568', 'model_dir': 'D:\ComfyUI_windows_portable\ComfyUI\output\logs\329eb0e6dc6a51427cd477d318cc183f', 'save_every_epoch': 0, 'name': 'encylopedia', 'total_epoch': 100, 'pretrainG': 
'D:\ComfyUI_windows_portable\ComfyUI\models\pretrained_v2\G48k.pth', 'pretrainD': 'D:\ComfyUI_windows_portable\ComfyUI\models\pretrained_v2\f0Ov2Super40kD.pth', 'version': 'v2', 'gpus': '0', 'sample_rate': '32k', 'if_f0': True, 'if_latest': True, 'save_every_weights': False, 'if_cache_data_in_gpu': True, 'save_best_model': True, 'best_model_threshold': 50, 'log_every_epoch': 1.0, 'model_path': 'D:\ComfyUI_windows_portable\ComfyUI\models\RVC\encylopedia_32k.pth'} Process Process-1: Traceback (most recent call last): File "multiprocessing\process.py", line 314, in _bootstrap File "multiprocessing\process.py", line 108, in run File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\Comfy-RVC\training_cli.py", line 155, in run dist.init_process_group( File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\distributed\c10d_logger.py", line 79, in wrapper return func(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\distributed\c10d_logger.py", line 93, in wrapper func_return = func(args, kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\distributed\distributed_c10d.py", line 1361, in init_process_group store, rank, world_size = next(rendezvous_iterator) ^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\distributed\rendezvous.py", line 258, in _env_rendezvous_handler store = _create_c10d_store(master_addr, master_port, rank, world_size, timeout, use_libuv) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\distributed\rendezvous.py", line 185, in _create_c10d_store return TCPStore( ^^^^^^^^^ RuntimeError: use_libuv was requested but PyTorch was build without libuv support 2024-09-10 01:46:09.636670 Training completed. 
Model saved at: D:\ComfyUI_windows_portable\ComfyUI\models\RVC\encylopedia_32k.pth 2024-09-10 01:46:09.636670 Model file does not exist: D:\ComfyUI_windows_portable\ComfyUI\models\RVC\encylopedia_32k.pth !!! Exception during processing !!! Failed to train model D:\ComfyUI_windows_portable\ComfyUI\models\RVC\encylopedia_32k.pth... Traceback (most recent call last): File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 317, in execute output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 192, in get_output_data return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 169, in _map_node_over_list process_inputs(input_dict, i) File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 158, in process_inputs results.append(getattr(obj, func)(inputs)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\Comfy-RVC\custom_nodes\rvc_nodes.py", line 497, in train_model assert os.path.isfile(model_path), f"Failed to train model {model_path}..." ^^^^^^^^^^^^^^^^^^^^^^^^^^ AssertionError: Failed to train model D:\ComfyUI_windows_portable\ComfyUI\models\RVC\encylopedia_32k.pth...

Prompt executed in 70.55 seconds

SayanoAI commented 1 week ago

Looks like you ran into this issue: https://github.com/RVC-Boss/GPT-SoVITS/issues/1357
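
The traceback above ends in `RuntimeError: use_libuv was requested but PyTorch was build without libuv support`, raised while `dist.init_process_group` creates its `TCPStore`. A commonly reported workaround is to set the `USE_LIBUV` environment variable to `0` before that call, so that the `env://` rendezvous falls back to the non-libuv store. The sketch below assumes that behavior, which applies to PyTorch 2.4-era builds and may differ in other versions. `disable_libuv`, `init_single_gpu`, the `gloo` backend choice, and the address and port values are illustrative assumptions, not Comfy-RVC's actual code.

```python
import os


def disable_libuv(env=os.environ):
    """Request the non-libuv TCPStore backend via the USE_LIBUV variable."""
    env["USE_LIBUV"] = "0"
    return env


def init_single_gpu(rank=0, world_size=1):
    """Hypothetical single-process init; torch is imported lazily on purpose."""
    import torch.distributed as dist

    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")  # arbitrary port (assumption)
    dist.init_process_group(
        backend="gloo",  # NCCL is not available on Windows
        init_method="env://",
        rank=rank,
        world_size=world_size,
    )


# Must run before init_process_group is called in the training process.
disable_libuv()
```

Child processes inherit the parent's environment, so setting the variable at the top of the training script or in the shell that launches ComfyUI (`set USE_LIBUV=0` on Windows) should both reach the training process.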