Open baadshah2020 opened 1 year ago
Hi @baadshah2020,
I think the issue is with the sentence tokenizer. It works for English, for example, but not with Hindi text (from Wikipedia).
So instead of NLTK, you probably want to use something like https://github.com/goru001/inltk.
Or just split the text yourself:
# Split the script on spaces and regroup the words into chunks of at most n_tokens words.
script_chunks = script.split(" ")
n_tokens = 30
sentences = [
    " ".join(script_chunks[i:i + n_tokens])
    for i in range(0, len(script_chunks), n_tokens)
]
sentences
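The fixed-size split above can cut a chunk in the middle of a sentence. As a rough alternative, here is a minimal sketch that first splits on Hindi sentence punctuation and then regroups whole sentences. It assumes the text ends sentences with the danda (।), double danda (॥), `?` or `!`, followed by whitespace. The function names `split_hindi_sentences` and `group_sentences` are hypothetical helpers, not part of NLTK, iNLTK, or Bark, and the `max_words=30` default simply mirrors `n_tokens` above.

```python
import re


def split_hindi_sentences(text):
    """Split text after a danda, double danda, '?' or '!' that is followed by whitespace.

    The punctuation mark stays attached to the sentence it ends.
    """
    parts = re.split(r"(?<=[।॥?!])\s+", text.strip())
    return [p for p in parts if p]


def group_sentences(sentences, max_words=30):
    """Greedily merge consecutive sentences into chunks of at most max_words words.

    A single sentence longer than max_words is kept whole as its own chunk.
    """
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk if adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

With these helpers, `sentences = group_sentences(split_hindi_sentences(script))` could stand in for the list built above. Whether sentence-aligned chunks actually improve the generated audio is untested here.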
Hi, I tried this notebook [https://github.com/suno-ai/bark/blob/main/notebooks/long_form_generation.ipynb] to generate audio from a long text. Unfortunately, I couldn't generate audio longer than 14 seconds for Hindi text. Can anyone suggest what I could do to generate audio longer than 14 seconds for a long Hindi text? Thanks.