gitmylo / audio-webui

A webui for different audio related Neural Networks
MIT License

The size of tensor a (28) must match the size of tensor b (33) at non-singleton dimension 2 error #116

Closed by diogoalmeida1991 1 year ago

diogoalmeida1991 commented 1 year ago

Describe the bug When I start training, the error "The size of tensor a (28) must match the size of tensor b (33) at non-singleton dimension 2" occurs.
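For context, this message is PyTorch's broadcasting error: an elementwise operation received two tensors whose sizes differ (28 vs 33) along dimension 2, and neither size is 1. The rule can be sketched in plain Python without PyTorch. The function name `broadcast_mismatches` and the example shapes below are hypothetical; the actual tensor shapes inside the RVC model are not stated in this issue.

```python
def broadcast_mismatches(shape_a, shape_b):
    """List (dim, size_a, size_b) for every dimension that cannot broadcast."""
    ndim = max(len(shape_a), len(shape_b))
    # Broadcasting aligns trailing dimensions, so left-pad the shorter shape with 1s.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    # Two sizes are compatible when they are equal or when either one is 1.
    return [
        (dim, size_a, size_b)
        for dim, (size_a, size_b) in enumerate(zip(a, b))
        if size_a != size_b and 1 not in (size_a, size_b)
    ]


# Hypothetical shapes chosen only to mirror the numbers in the error message.
print(broadcast_mismatches((1, 192, 28), (1, 192, 33)))  # [(2, 28, 33)]
print(broadcast_mismatches((1, 192, 28), (1, 1, 28)))    # []
```

A non-empty result means the two shapes cannot be combined elementwise, which is the situation the error reports for sizes 28 and 33.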

To Reproduce Steps to reproduce the behavior:

  1. Go to Train.
  2. Create Workspace v1 48 and Workspace v2 48.
  3. Click on resample and split dataset.
  4. After it finishes, choose harvest and filter radius 3.
  5. Click on Extract Pitches.
  6. Click on create index file.
  7. Click on Train.
  8. Choose 100 epochs to train.
  9. Set save every n epochs to 1.


Additional context The variable xs on line 511 of the file "webui/modules/implementations/rvc/infer_pack/models.py" was producing NaN. https://github.com/gitmylo/audio-webui/blob/0703fac37dd1d297defe78dad3a7a3f9381d8c45/webui/modules/implementations/rvc/infer_pack/models.py#L511C25-L511C25

The wav file used has the following info:

Duration: 00:33:27.26, bitrate: 1536 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 2 channels, s16, 1536 kb/s

[STREAM]
index=0
codec_name=pcm_s16le
codec_long_name=PCM signed 16-bit little-endian
profile=unknown
codec_type=audio
codec_tag_string=[1][0][0][0]
codec_tag=0x0001
sample_fmt=s16
sample_rate=48 KHz
channels=2
channel_layout=unknown
bits_per_sample=16
initial_padding=0
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/48000
start_pts=N/A
start_time=N/A
duration_ts=96348672
duration=0:33:27.264000
bit_rate=1.536000 Mbit/s
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=N/A
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=0
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
DISPOSITION:captions=0
DISPOSITION:descriptions=0
DISPOSITION:metadata=0
DISPOSITION:dependent=0
DISPOSITION:still_image=0
[/STREAM]

[FORMAT]
filename=C:\Users\User\Music\x\x.wav
nb_streams=1
nb_programs=0
format_name=wav
format_long_name=WAV / WAVE (Waveform Audio)
start_time=N/A
duration=0:33:27.264000
size=367.541058 Mibyte
bit_rate=1.536000 Mbit/s
probe_score=99
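The same header fields (sample rate, channel count, duration) can also be read with Python's standard `wave` module, which handles uncompressed PCM WAV files like this one. This is only a minimal sketch for confirming a file's actual sample rate before picking a workspace; it does not explain why training failed. The helper names `wav_info` and `write_silence` are hypothetical, and the demo uses a short generated file, not the reporter's recording.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Read basic header fields from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": frames / rate,
        }


def write_silence(path, rate, channels, seconds):
    """Write a 16-bit PCM WAV file of silence (demo helper only)."""
    frames = int(rate * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * frames)


if __name__ == "__main__":
    # Generate a half-second stereo file at 48000 Hz and inspect it.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silence(demo_path, 48000, 2, 0.5)
    print(wav_info(demo_path))
```

For the file described above, `duration_ts=96348672` frames at 48000 Hz corresponds to 2007.264 seconds, which agrees with the reported duration of 33:27.264.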

diogoalmeida1991 commented 1 year ago

Solved on my end: the file is 48k, but using 40k works. Strange, but it works!