[Help]: Requesting some guidance / documentation on choosing appropriate parameters for mssbcqt

Hello! I'd be very glad to get some more information on how to adapt the mssbcqt discriminator for 48 kHz audio.

Lately I've been trying to improve the current architecture of RVC (retrieval-based-voice-conversion) by adopting the ms-sb-cqt and ms-stft discriminators. However, from what I can see, they were tested on 24 kHz audio, and the provided config is presumably meant for that sample rate as well. Essentially, I would like some guidance on how to properly choose the CQT parameters:

        filters=32,
        max_filters=1024,
        filters_scale=1,
        dilations=[1, 2, 4],
        in_channels=1,
        out_channels=1,
        hop_lengths=[512, 256, 256],
        n_octaves=[9, 9, 9],
        bins_per_octaves=[24, 36, 48],
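
For reference, a rough sanity check of what these values imply at 24 kHz versus 48 kHz. This is only a sketch under stated assumptions: it assumes the CQT starts at `fmin = 32.70` Hz (C1, a common CQT default) and is applied directly at the given sample rate. An implementation that resamples the audio before the transform would change the Nyquist limit used here. The helper names `cqt_coverage`, `max_octaves`, and `hop_duration_ms` are hypothetical and are not part of any library.

```python
import math

def cqt_coverage(sample_rate, n_octaves, bins_per_octave, fmin=32.70):
    """Frequency span of a CQT with this layout (fmin is an assumption)."""
    n_bins = n_octaves * bins_per_octave
    # Centre frequency of the highest bin in a geometrically spaced CQT.
    f_top = fmin * 2 ** ((n_bins - 1) / bins_per_octave)
    nyquist = sample_rate / 2
    return {
        "n_bins": n_bins,
        "f_top_hz": f_top,
        "nyquist_hz": nyquist,
        "fits": f_top < nyquist,
    }

def max_octaves(sample_rate, fmin=32.70):
    """Largest whole number of octaves above fmin that stays below Nyquist."""
    return math.floor(math.log2((sample_rate / 2) / fmin))

def hop_duration_ms(hop_length, sample_rate):
    """Time step between CQT frames in milliseconds."""
    return 1000.0 * hop_length / sample_rate

# Compare the three sub-discriminator layouts at both sample rates.
for sr in (24000, 48000):
    for hop, bpo in zip([512, 256, 256], [24, 36, 48]):
        info = cqt_coverage(sr, 9, bpo)
        print(sr, bpo, info["n_bins"], round(info["f_top_hz"]),
              info["fits"], round(hop_duration_ms(hop, sr), 2))
```

Under these assumptions, the same hop length covers half the time span at 48 kHz that it does at 24 kHz, so part of the question is whether `hop_lengths` should be scaled with the sample rate.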

For more context, this is the config I currently use to train pretrained models for RVC:

   },
  "data": {
    "max_wav_value": 32768.0,
    "sampling_rate": 48000,
    "filter_length": 2048,
    "hop_length": 480,
    "win_length": 2048,
    "n_mel_channels": 128,
    "mel_fmin": 0.0,
    "mel_fmax": null
  },
  "model": {
    "inter_channels": 192,
    "hidden_channels": 192,
    "filter_channels": 768,
    "n_heads": 2,
    "n_layers": 6,
    "kernel_size": 3,
    "p_dropout": 0,
    "resblock": "1",
    "resblock_kernel_sizes": [3,7,11],
    "resblock_dilation_sizes": [[1,3,5], [1,3,5], [1,3,5]],
    "upsample_rates": [12,10,2,2],
    "upsample_initial_channel": 512,
    "upsample_kernel_sizes": [24,20,4,4],
    "use_spectral_norm": false,
    "gin_channels": 256,
    "spk_embed_dim": 109
  }
}
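
As a quick consistency check on the config above, here is a minimal sketch that verifies the generator side, assuming the usual HiFi-GAN-style constraint that the product of `upsample_rates` equals `hop_length`. The helper name `check_upsampling` is hypothetical, and the dictionary only copies the relevant values from the config.

```python
import math

# Values copied from the config above.
config = {
    "sampling_rate": 48000,
    "hop_length": 480,
    "upsample_rates": [12, 10, 2, 2],
    "upsample_kernel_sizes": [24, 20, 4, 4],
}

def check_upsampling(cfg):
    """Check that the upsampling stack turns one frame into hop_length samples."""
    total = math.prod(cfg["upsample_rates"])
    return {
        "total_upsampling": total,
        "matches_hop_length": total == cfg["hop_length"],
        # Number of feature frames per second of audio.
        "frames_per_second": cfg["sampling_rate"] / cfg["hop_length"],
    }

print(check_upsampling(config))
```

The discriminators operate on the generated waveform, so their hop lengths do not have to match this value, but it gives a reference frame rate to compare the CQT hop lengths against.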

As an important note: I intend to pair the mssbcqt / msstft combo with the existing MultiPeriodDiscriminator used in RVC. Thank you in advance!

open-mmlab / Amphion