CyberLykan commented 2 years ago

I attempted to improve DeepAFx-ST. Here's what I did.

Download the zip from https://github.com/adobe-research/DeepAFx-ST and extract it.

Open Notepad++, press Ctrl+Shift+F (Find in Files), search for 24000, set the replacement to 44100, point the directory at the extracted folder, and click Replace in Files.
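If you don't have Notepad++, the same bulk replace can be sketched in Python. This is only a rough equivalent, not part of the repo. The function name `replace_in_files` and the list of file suffixes are my own assumptions. Like the Notepad++ step, it blindly rewrites every literal 24000, so check the changed files afterwards.

```python
from pathlib import Path


def replace_in_files(root, old="24000", new="44100",
                     suffixes=(".py", ".yaml", ".yml", ".sh", ".md")):
    """Replace every occurrence of `old` with `new` in text files under `root`.

    Returns the list of files that were changed. The suffix filter is an
    assumption meant to keep audio files and checkpoints untouched.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip files that are not valid UTF-8 text
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(path)
    return changed
```

Call it with the folder you extracted the zip into, e.g. `replace_in_files("path/to/extracted/folder")`, and look through the returned list before running anything.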

At this point you can safely add the checkpoints and examples.

Edit scripts/process.py:

Replace
    x_44100 = torch.tensor(resampy.resample(x.view(-1).numpy(), x_sr, 44100))
with
    x_44100 = torch.tensor(resampy.resample(x.reshape(-1).numpy(), x_sr, 44100))

Under
    x_44100 = x_44100.view(1, -1)
insert
    x_44100 = x_44100[0:1, : x_44100.shape[-1] // 2]

Under
    x_44100 = x
insert
    x_44100 = x_44100[0:1, : x_44100.shape[-1]]

Replace
    r_44100 = torch.tensor(resampy.resample(r.view(-1).numpy(), r_sr, 44100))
with
    r_44100 = torch.tensor(resampy.resample(r.reshape(-1).numpy(), r_sr, 44100))

Under
    r_44100 = r_44100.view(1, -1)
insert
    r_44100 = r_44100[0:1, : r_44100.shape[-1] // 2]

Under
    r_44100 = r
insert
    r_44100 = r_44100[0:1, : r_44100.shape[-1]]

Remove
    x_44100 = x_44100[0:1, : 44100 * 5]

Remove
    r_44100 = r_44100[0:1, : 44100 * 5]

Replace
    filename = os.path.basename(args.input).replace(".wav", "")
with
    filename = os.path.splitext(os.path.basename(args.input))[0]

Remove
    reference = os.path.basename(args.reference).replace(".wav", "")

Replace
    out_filepath = os.path.join(dirname, f"{filename}_out_ref={reference}.wav")
with
    out_filepath = os.path.join(dirname, f"{filename}_DeepAFx-ST.wav")

Remove
    in_filepath = os.path.join(dirname, f"{filename}_in.wav")

Remove
    torchaudio.save(in_filepath, x_44100.cpu().view(1, -1), 44100)
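To show why the filename change matters, here is a small standalone sketch. The helper `output_path` is hypothetical, and so is taking `dirname` from the input path (I'm not quoting how the script actually sets it). The point is that `.replace(".wav", "")` only strips a .wav extension, while `os.path.splitext` strips whatever extension the input has.

```python
import os


def output_path(input_path, suffix="_DeepAFx-ST"):
    """Build the output .wav path next to the input file.

    Assumption: the output goes in the same directory as the input.
    """
    dirname = os.path.dirname(input_path)
    # splitext drops the last extension regardless of what it is
    filename = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(dirname, f"{filename}{suffix}.wav")


# The old approach leaves non-.wav extensions in the name:
old_style = os.path.basename("song.flac").replace(".wav", "")  # still "song.flac"
new_style = os.path.splitext(os.path.basename("song.flac"))[0]  # "song"
```

With the old line, a .flac or .mp3 input would end up with its extension embedded in the output name.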

You should be good to go!

This approach may have broken some things unrelated to processing.

CyberLykan commented 2 years ago

24 #22

selyu504 commented 2 years ago

How can I get the parameters of the EQ and compressor?

CyberLykan commented 2 years ago

How can I get the parameters of the EQ and compressor?

Please use the other issue you made for discussion. Your comment does not fit here.

CyberLykan commented 2 years ago

If your results are getting cut in half or doubled, try removing or adding the // 2 in both inserted slicing lines (the x_44100 one and the r_44100 one).
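My best guess at why the // 2 matters, as plain arithmetic. It assumes the audio is loaded as a (channels, samples) tensor, which is what torchaudio.load returns by default, and that flattening it puts the channels one after another. The function `kept_samples` is just an illustration, not code from the repo, and it ignores the change in length from resampling because that scales both cases equally.

```python
def kept_samples(num_channels, samples_per_channel, halve):
    """Samples kept by x[0:1, : n // 2] (halve=True) or x[0:1, : n] (halve=False),
    where n is the length after flattening all channels into one row."""
    flattened = num_channels * samples_per_channel  # channels end up back to back
    return flattened // 2 if halve else flattened


one_second = 44100
stereo_halved = kept_samples(2, one_second, halve=True)   # 44100: one full channel
mono_halved = kept_samples(1, one_second, halve=True)     # 22050: cut in half
stereo_full = kept_samples(2, one_second, halve=False)    # 88200: doubled
```

So under these assumptions the // 2 suits stereo files and should be removed for mono files.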

CyberLykan commented 1 year ago

Seems like there are still a lot of issues with this approach. :/

kelseyjd commented 1 year ago

The LibriTTS dataset is only available at 24 kHz, so you would need to find a different dataset to retrain at 44.1 kHz.


[Improvement] Increased sample rate to 44100 and added the ability to process entire files. #25
