bshall / knn-vc

Voice Conversion With Just Nearest Neighbors
https://bshall.github.io/knn-vc/

prematch_dataset runs very slowly #15

Closed fangg2000 closed 1 year ago

fangg2000 commented 1 year ago

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.105.01   Driver Version: 515.105.01   CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:1C:00.0  On |                  N/A |
|  0%   46C    P8    18W / 170W |   3902MiB / 12288MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1362      G   /usr/lib/xorg/Xorg                 53MiB |
|    0   N/A  N/A      2014      G   /usr/lib/xorg/Xorg                119MiB |
|    0   N/A  N/A      2144      G   /usr/bin/gnome-shell               54MiB |
|    0   N/A  N/A    384873      C   python                           3659MiB |
+-----------------------------------------------------------------------------+
```

[Screenshot: 2023-07-06 22-42-03]

The process takes up GPU memory, but the GPU does not seem to be doing any work: utilization sits at 2% and the power draw has not increased. Is there a problem?

fangg2000 commented 1 year ago

(knn) root@fangg-MS-7B78:/home/fangg/tts/knn-vc-master# python prematch_dataset.py --librispeech_path /home/fangg/tts/save_voice/tmp --out_path data_splits/voise --topk 4 --matching_layer 6 --synthesis_layer 6
Matching weightings: tensor([0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], device='cuda:0')
Synthesis weightings: tensor([0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], device='cuda:0')
[LIBRISPEECH] Computing folders ['train-clean-100', 'dev-clean']
Loading wavlm.
WavLM-Large loaded with 315,453,120 parameters.
Feature has shape: torch.Size([355, 1024])--------------------------------| 0.00% [0/1008 00:00<00:00]
Done 0/1,008
Feature has shape: torch.Size([54, 1024])-------------| 0.10% [1/1008 01:02<17:28:24 train-clean-100/voise/pre/voise-pre-11911.flac]
Feature has shape: torch.Size([277, 1024])------------| 0.20% [2/1008 01:56<16:17:57 train-clean-100/voise/pre/voise-pre-12097.flac]
|████████████-----------------------------------------| 24.11% [243/1008 11:22<35:49

There must be something wrong.

fangg2000 commented 1 year ago

Or some other problem.

fangg2000 commented 1 year ago

(knn) root@fangg-MS-7B78:/home/fangg/tts/knn-vc-master# python prematch_dataset.py --librispeech_path /home/fangg/tts/save_voice/tmp --out_path /home/fangg/tts/save_voice/tmp --topk 4 --matching_layer 6 --synthesis_layer 6
Matching weightings: tensor([0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], device='cuda:0')
Synthesis weightings: tensor([0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], device='cuda:0')
[LIBRISPEECH] Computing folders ['train-clean-100', 'dev-clean']
Loading wavlm.
WavLM-Large loaded with 315,453,120 parameters.
Feature has shape: torch.Size([22, 1024])---------------------------| 0.00% [0/2583 00:00<00:00]
Done 0/2,583
Feature has shape: torch.Size([55, 1024])------------| 0.04% [1/2583 00:05<3:40:41 train-clean-100/018/pre/018-pre-11006.flac]
Feature has shape: torch.Size([32, 1024])------------| 0.08% [2/2583 00:05<2:00:22 train-clean-100/018/pre/018-pre-11009.flac]
Done 1,000/2,583█████-------------------------------| 38.71% [1000/2583 02:52<04:33 train-clean-100/074/pre/074-pre-12003.flac]
Done 2,000/2,583████████████████████████------------| 77.43% [2000/2583 03:37<01:03 train-clean-100/076/pre/076-pre-12999.flac]
All done!███████████████████████████████████████████| 100.00% [2583/2583 04:01<00:00 dev-clean/015/pre/015-pre-13138.flac]
(knn) root@fangg-MS-7B78:/home/fangg/tts/knn-vc-master#

But this time it ran without any problem. Why?

RF5 commented 1 year ago

Hi @fangg2000, there was a minor typo in the prematching script which caused some unexpected behaviour (see #16). I have now committed a fix for it, so it should operate as expected.

Note that prematching does compute WavLM features, so it is expected to use GPU memory. Loading the files can also take quite a long time if your disk is slow and the dataset is massive, but it shouldn't take extremely long.
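If it is unclear whether the time goes into loading files or into feature extraction, one rough way to check is to time the two stages separately. The snippet below is only a minimal sketch of that idea and is not taken from `prematch_dataset.py`: `StageTimer`, `profile_files`, `load_audio` and `extract_features` are hypothetical names, and the last two are dummy stand-ins that just sleep, to be swapped for the real loading and WavLM calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        # Share of the total measured time spent in each stage.
        grand_total = sum(self.totals.values()) or 1.0
        return {
            name: {
                "seconds": secs,
                "calls": self.counts[name],
                "share": secs / grand_total,
            }
            for name, secs in self.totals.items()
        }


def load_audio(path):
    """Dummy stand-in for reading a file from disk (assumption)."""
    time.sleep(0.02)
    return [0.0] * 16000


def extract_features(waveform):
    """Dummy stand-in for the WavLM forward pass (assumption)."""
    time.sleep(0.005)
    return [sum(waveform)]


def profile_files(paths, timer):
    for path in paths:
        with timer.stage("load"):
            waveform = load_audio(path)
        with timer.stage("features"):
            # With real GPU work, call torch.cuda.synchronize() before
            # leaving this block so asynchronous kernels are counted.
            extract_features(waveform)
    return timer.report()


if __name__ == "__main__":
    report = profile_files([f"utt-{i}.flac" for i in range(5)], StageTimer())
    for name, stats in report.items():
        print(f"{name}: {stats['seconds']:.3f}s over "
              f"{stats['calls']} calls ({stats['share']:.0%})")
```

If the "load" share dominates, the disk is the likely bottleneck. If "features" dominates while `nvidia-smi` still shows low GPU utilization, the slowdown is more likely somewhere in the script itself.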

Hope that helps!