intelligentnode / IntelliJava

Integrate with the latest language models, image generation, speech, and deep learning frameworks like ChatGPT, DALL·E, and Cohere using a few lines of Java.
https://show.intellinode.ai/
Apache License 2.0

Great library, hope it will add more features #8

Closed eix128 closed 1 year ago

eix128 commented 1 year ago

Hi, thanks for sharing such a library. I want TTS support for Chinese and Turkish in code: how do I use TTS for Turkish, Chinese, or other languages? model.generateGoogleText seems to be a private method.

    // 1- initiate the remote speech model
    RemoteSpeechModel model = new RemoteSpeechModel("xxxxxxxxxxxxxxxx", SpeechModels.google);
    List<String> supportedModels = model.getSupportedModels();

    // 2- call generateEnglishText with any text
    Text2SpeechInput input = new Text2SpeechInput.Builder("Hi, I am Intelligent Java.").build();

    byte[] decodedAudio = model.generateEnglishText(input); 

Also, it would be good to add STT (speech-to-text) support to this library.

I have added a patch to select playerCode:

    // Patched version: accepts an optional voice name (playerCode) and a language code.
    public byte[] generateGoogleText(String text, Gender gender, String playerCode, String language) throws IOException {
        Map<String, Object> params = new HashMap<>();
        params.put("text", text);
        params.put("languageCode", language);

        // Default voices are English (en-GB); they are overridden below when playerCode is set.
        if (gender == Gender.FEMALE) {
            params.put("name", "en-GB-Standard-A");
            params.put("ssmlGender", "FEMALE");
        } else {
            params.put("name", "en-GB-Standard-B");
            params.put("ssmlGender", "MALE");
        }

        // An explicit voice name, e.g. "tr-TR-Wavenet-E", replaces the default.
        if (playerCode != null) {
            params.put("name", playerCode);
        }

        AudioResponse resModel = (AudioResponse) wrapper.generateSpeech(params);
        return AudioHelper.decode(resModel.getAudioContent());
    }

Barqawiz commented 1 year ago

Thanks for the feedback. The library has been updated to support Turkish and Chinese in version 0.8.2.

Let me know if the speech model works as expected.

Can you elaborate on the intended use of playerCode?

eix128 commented 1 year ago

It's used for selecting the voice name. For example, for Turkish: "tr-TR-Wavenet-E"

See the full list: https://cloud.google.com/text-to-speech/docs/voices
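
To illustrate the idea, here is a minimal sketch in plain Java of how a voice name could drive the request parameters. It assumes voice names follow the "<language>-<REGION>-..." pattern seen in "tr-TR-Wavenet-E". The class VoiceParams and its methods are hypothetical helpers, not part of IntelliJava. The map keys mirror the ones used in the patch above.

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical helper, not part of IntelliJava.
    class VoiceParams {

        // Derive the language code from a voice name,
        // e.g. "tr-TR-Wavenet-E" -> "tr-TR".
        // Assumption: the name starts with "<language>-<REGION>-".
        static String languageCodeOf(String voiceName) {
            String[] parts = voiceName.split("-");
            if (parts.length < 2) {
                throw new IllegalArgumentException("Unexpected voice name: " + voiceName);
            }
            return parts[0] + "-" + parts[1];
        }

        // Build the parameter map so the language code always matches the voice.
        static Map<String, Object> build(String text, String voiceName) {
            Map<String, Object> params = new HashMap<>();
            params.put("text", text);
            params.put("name", voiceName);
            params.put("languageCode", languageCodeOf(voiceName));
            return params;
        }

        public static void main(String[] args) {
            Map<String, Object> params = build("Merhaba", "tr-TR-Wavenet-E");
            System.out.println(params.get("languageCode")); // prints tr-TR
        }
    }

Deriving the language code from the voice name avoids passing two values that can disagree, at the cost of relying on the naming pattern.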

Barqawiz commented 1 year ago

OK, thanks. You mean the voice name.

For now, you can send the language code for Turkish and Chinese through the following functions:

The next versions will provide a more flexible approach.