We provide a PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers", in which we present a new method for separating a mixed audio sequence where multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while keeping the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.
RuntimeError: Calculated padded input size per channel: (0). Kernel size: (8). Kernel size can't be greater than actual input size #102
Training the model on the sample data provided in the repo fails with the error below.
If I change the L values in config.py, it also crashes.
/media/wesee/nv_ssd/meta/svoice/train.py:115: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path="conf", config_name='config.yaml')
/home/wesee/miniforge3/lib/python3.10/site-packages/hydra/_internal/defaults_list.py:251: UserWarning: In 'config.yaml': Defaults list is missing _self_. See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/default_composition_order for more information
warnings.warn(msg, UserWarning)
/home/wesee/miniforge3/lib/python3.10/site-packages/hydra/_internal/defaults_list.py:415: UserWarning: In config.yaml: Invalid overriding of hydra/job_logging:
Default list overrides requires 'override' keyword.
See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/defaults_list_override for more information.
deprecation_warning(msg)
/home/wesee/miniforge3/lib/python3.10/site-packages/hydra/_internal/defaults_list.py:415: UserWarning: In config.yaml: Invalid overriding of hydra/hydra_logging:
Default list overrides requires 'override' keyword.
See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/defaults_list_override for more information.
deprecation_warning(msg)
/home/wesee/miniforge3/lib/python3.10/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
[2024-09-05 14:35:31,039][main][INFO] - For logs, checkpoints and samples check /media/wesee/nvssd/meta/svoice/outputs/exp
[2024-09-05 14:35:34,635][main][INFO] - Running on host ubuntu
[2024-09-05 14:35:36,478][svoice.solver][INFO] - ----------------------------------------------------------------------
[2024-09-05 14:35:36,479][svoice.solver][INFO] - Training...
Input shape to Encoder: torch.Size([4, 32000])
[2024-09-05 14:35:44,215][svoice.solver][INFO] - Train | Epoch 1 | 1/2 | 0.3 it/sec | Loss 24.78669
Input shape to Encoder: torch.Size([4, 32000])
[2024-09-05 14:35:49,408][svoice.solver][INFO] - Train | Epoch 1 | 2/2 | 0.2 it/sec | Loss 20.88289
[2024-09-05 14:35:49,412][svoice.solver][INFO] - Train Summary | End of Epoch 1 | Time 12.93s | Train Loss 20.88289
[2024-09-05 14:35:49,413][svoice.solver][INFO] - ----------------------------------------------------------------------
[2024-09-05 14:35:49,414][svoice.solver][INFO] - Cross validation...
Input shape to Encoder: torch.Size([1, 0])
[2024-09-05 14:35:49,591][main][ERROR] - Some error happened
Traceback (most recent call last):
File "/media/wesee/nv_ssd/meta/svoice/train.py", line 118, in main
_main(args)
File "/media/wesee/nv_ssd/meta/svoice/train.py", line 112, in _main
run(args)
File "/media/wesee/nv_ssd/meta/svoice/train.py", line 93, in run
solver.train()
File "/media/wesee/nv_ssd/meta/svoice/svoice/solver.py", line 131, in train
valid_loss = self._run_one_epoch(epoch, cross_valid=True)
File "/media/wesee/nv_ssd/meta/svoice/svoice/solver.py", line 195, in _run_one_epoch
estimate_source = self.dmodel(mixture)
File "/home/wesee/miniforge3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/wesee/miniforge3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/media/wesee/nv_ssd/meta/svoice/svoice/models/swave.py", line 252, in forward
mixture_w = self.encoder(mixture)
File "/home/wesee/miniforge3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/wesee/miniforge3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/media/wesee/nv_ssd/meta/svoice/svoice/models/swave.py", line 281, in forward
mixture_w = F.relu(self.conv(mixture))
File "/home/wesee/miniforge3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/wesee/miniforge3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/wesee/miniforge3/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 310, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/wesee/miniforge3/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 306, in _conv_forward
return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Calculated padded input size per channel: (0). Kernel size: (8). Kernel size can't be greater than actual input size
How should I deal with this?
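For reference, the log shows that the training batches have shape [4, 32000] while the cross-validation input has shape [1, 0], so the encoder convolution (kernel size 8) receives zero samples. Below is a minimal sketch of that length condition in plain Python. The function names, the file names, and the stride of 4 are assumptions made for illustration and are not taken from the svoice code; only the lengths 32000 and 0 and the kernel size 8 come from the log above.

```python
def conv1d_output_length(n_samples, kernel_size, stride=1, padding=0):
    """Number of output frames of a 1-D convolution with dilation 1.

    Returns 0 when the padded input is shorter than the kernel, which is
    the situation the RuntimeError in the traceback complains about.
    """
    padded = n_samples + 2 * padding
    if padded < kernel_size:
        return 0
    return (padded - kernel_size) // stride + 1


def too_short(lengths, kernel_size):
    """Return the names of entries whose sample count cannot fit the kernel."""
    return [name for name, n in lengths.items() if n < kernel_size]


KERNEL_SIZE = 8  # from the error message
STRIDE = 4       # assumption: half the kernel size

# Hypothetical file names; the sample counts are the ones printed in the log.
lengths = {"train_mix.wav": 32000, "valid_mix.wav": 0}

for name, n in lengths.items():
    frames = conv1d_output_length(n, KERNEL_SIZE, stride=STRIDE)
    print(f"{name}: {n} samples, {frames} encoder frames")

print("too short for the encoder:", too_short(lengths, KERNEL_SIZE))
```

Running a check like this over the lengths of the validation files before training would show whether any of them end up empty after loading or segmenting.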