Rename `synthesize`'s subcommands to `X-to-audio` for a more intuitive name

The subcommands of `everyvoice synthesize` should be renamed:

text-to-wav => text-to-audio
spec-to-wav => spec-to-audio

This would make the subcommand names clearer and more general, and would expose less of the underlying implementation detail.
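As a rough illustration of the rename, here is a minimal standalone sketch using Python's standard `argparse`. It is not EveryVoice's actual CLI code. The program name `everyvoice-sketch`, the `RENAMES` table, and the `canonical` helper are hypothetical. Keeping the old names as aliases is also an assumption, since this issue does not ask for backward compatibility.

```python
import argparse

# Old subcommand names mapped to the new names proposed in this issue.
RENAMES = {"text-to-wav": "text-to-audio", "spec-to-wav": "spec-to-audio"}


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the `synthesize` command group.
    parser = argparse.ArgumentParser(prog="everyvoice-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    for old, new in RENAMES.items():
        # Register the new name and keep the old one as an alias
        # (an assumption made for this sketch only).
        sub.add_parser(new, aliases=[old])
    return parser


def canonical(argv: list[str]) -> str:
    """Return the new-style subcommand name for either spelling."""
    command = build_parser().parse_args(argv).command
    # argparse stores the name as typed, so map old spellings to new ones.
    return RENAMES.get(command, command)
```

With this sketch, `canonical(["text-to-wav"])` and `canonical(["text-to-audio"])` both resolve to the new `text-to-audio` name.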

everyvoice synthesize  --help

 Usage: everyvoice synthesize [OPTIONS] COMMAND [ARGS]...

 ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
 ┃                                   Synthesize Help                                   ┃
 ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

  • text-to-spec --- this is the most common model to run for performing normal speech
    synthesis.
  • spec-to-wav --- this is the model that turns your spectral features into audio. this
    type of synthesis is also known as copy synthesis and unless you know what you are
    doing, you probably don't want to do this.

╭─ Options ─────────────────────────────────────────────────────────────────────────────╮
│ --help  -h        Show this message and exit.                                         │
╰───────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ────────────────────────────────────────────────────────────────────────────╮
│ text-to-wav  Given some text and a trained model, generate some audio. i.e. perform   │
│              typical speech synthesis                                                 │
│ spec-to-wav  Given some Mel spectrograms and a trained model, generate some audio.    │
│              i.e. perform copy synthesis                                              │
╰───────────────────────────────────────────────────────────────────────────────────────╯

EveryVoiceTTS / EveryVoice #181