Open ggbangyes opened 1 year ago
I read the README in the assets folder. It's complicated, but can we use EnCodec and BERT to create our own semantic_prompt, coarse_prompt, and fine_prompt? Can you give more information? As I understand it, I need to run speech-to-text on my MP3 to get the text, pass that text through BERT to generate my semantic prompt, and then use EnCodec to get the coarse and fine prompts. Am I right?
How do I convert a WAV voice file into an .npz prompt? I recorded my voice and want the text to be read in my own voice.
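
To make the question concrete, here is a rough sketch of what I imagine the final packaging step looks like, using only NumPy. Everything specific in it is my assumption rather than something I found confirmed in the README: the helper name `build_prompt_npz` is made up, I am guessing that the fine codes have shape (codebooks, frames), and I am guessing that the coarse prompt is just the first two codebooks of the fine codes. The random arrays are placeholders standing in for real semantic tokens and real EnCodec codes, which is exactly the part I don't know how to produce.

```python
import io

import numpy as np


def build_prompt_npz(semantic_tokens, fine_codes, out):
    """Pack token arrays into an .npz holding the three prompt keys.

    Assumptions (not confirmed by the README):
      * semantic_tokens is a 1-D integer sequence
      * fine_codes has shape (n_codebooks, n_frames)
      * the coarse prompt is the first two codebooks of fine_codes
    """
    semantic = np.asarray(semantic_tokens, dtype=np.int64)
    fine = np.asarray(fine_codes, dtype=np.int64)

    if semantic.ndim != 1:
        raise ValueError("semantic_tokens must be 1-D")
    if fine.ndim != 2 or fine.shape[0] < 2:
        raise ValueError("fine_codes must be 2-D with at least two codebooks")

    # Assumed relationship: coarse = first two rows of the fine codes.
    coarse = fine[:2, :]

    # `out` may be a file path or a binary file-like object.
    np.savez(
        out,
        semantic_prompt=semantic,
        coarse_prompt=coarse,
        fine_prompt=fine,
    )
    return out


# Placeholder data only; real values would come from the tokenizer/codec.
rng = np.random.default_rng(0)
semantic = rng.integers(0, 10_000, size=50)
fine = rng.integers(0, 1024, size=(8, 75))

# Write to an in-memory buffer, then read it back to inspect the keys.
buf = io.BytesIO()
build_prompt_npz(semantic, fine, buf)
buf.seek(0)
loaded = np.load(buf)
print(sorted(loaded.files))
```

If this layout is right, then what I still need is the correct way to get the real semantic tokens and the real codes from my recording, so that they can replace the placeholder arrays.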