ropensci / googleLanguageR

R client for the Google Translation API, Google Cloud Natural Language API and Google Cloud Speech API
https://code.markedmondson.me/googleLanguageR/

gl_talk: language code detection #86

Open retowyss opened 4 months ago

retowyss commented 4 months ago

For text-to-speech, the Neural2 voices require the full language code (en-GB, en-US, ...), not just the first two letters (en, ...).

Error in `abort_http()`:
! http_400 Requested language code 'en' doesn't match the voice 'en-GB-Neural2-F''s language code 'en-gb'. Either pick a different voice, or change the requested language code to en-gb.

The gl_talk implementation always derives languageCode from the first two letters of the name parameter, so a caller has no way to work around this.

if (!is.null(name)) {
  assert_that(is.string(name))
  languageCode <- substr(name, 1, 2)
  gender <- NULL
}
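To make the mismatch concrete, here is a minimal base-R sketch of what that substr() call yields for the voice in the error message. The variable names are only for illustration; nothing here calls the API.

```r
# Voice name taken from the error message above
name <- "en-GB-Neural2-F"

# Same derivation as in the gl_talk excerpt: keep only the first two characters
languageCode <- substr(name, 1, 2)

# The derived code loses the region part that the Neural2 voice expects
print(languageCode)
```

The derived value is the bare "en" that the API rejects in the error above.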

I can submit a fix but there are multiple ways this could be resolved.

a) Check whether name contains "Neural"; if so, take the first 5 characters as the languageCode.
b) Add a parameter, e.g. force_languageCode = FALSE (?), that forces the supplied languageCode to be used.

The drawback of a) is that the code has to be updated for every new model that requires the full languageCode.
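A rough sketch of what option a) could look like, assuming a hypothetical helper named derive_language_code (not part of googleLanguageR). It also shows the drawback: the "Neural" pattern is hard-coded, so any other voice family with the same requirement would need another code change.

```r
# Hypothetical helper illustrating option a); not the package's actual code.
derive_language_code <- function(name) {
  stopifnot(is.character(name), length(name) == 1)
  # Voices whose name contains "Neural" get the first 5 characters (e.g. "en-GB"),
  # everything else keeps the existing 2-character behaviour.
  n_chars <- if (grepl("Neural", name, fixed = TRUE)) 5 else 2
  substr(name, 1, n_chars)
}

neural_code   <- derive_language_code("en-GB-Neural2-F")
standard_code <- derive_language_code("en-GB-Standard-A")
```

Note that this returns the mixed-case "en-GB", while the error message quotes "en-gb"; whether the API treats the two as equivalent is not verified here.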

b) doesn't look elegant, but it will never break existing code. In particular, the new parameter can be passed through the ... argument of text2speech::tts_google.

I'd prefer that the function never overwrote the languageCode, but that would surely be a breaking change.
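For comparison, option b) might be sketched as below. resolve_language_code is a hypothetical stand-alone helper, and the forceLanguageCode argument and its default are assumptions for illustration rather than the package's confirmed signature. With the default FALSE, the existing two-letter behaviour is untouched.

```r
# Hypothetical helper illustrating option b); not the package's actual code.
resolve_language_code <- function(name = NULL,
                                  languageCode = "en",
                                  forceLanguageCode = FALSE) {
  # No voice name given: nothing to derive, use the supplied code
  if (is.null(name)) {
    return(languageCode)
  }
  stopifnot(is.character(name), length(name) == 1)
  # Opt-in: trust the caller-supplied languageCode as-is
  if (isTRUE(forceLanguageCode)) {
    return(languageCode)
  }
  # Default: existing behaviour, derive from the first two letters of name
  substr(name, 1, 2)
}

default_code <- resolve_language_code("en-GB-Neural2-F", "en-gb")
forced_code  <- resolve_language_code("en-GB-Neural2-F", "en-gb",
                                      forceLanguageCode = TRUE)
```

Because the flag is opt-in, existing callers see no change, and only callers who set it are responsible for supplying a code that matches the voice.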

retowyss commented 4 months ago

https://github.com/ropensci/googleLanguageR/pull/87

Tested this through text2speech:

text2speech::tts_google(
  text = "This works now, yay!", 
  voice = "en-GB-Neural2-F", 
  languageCode = "en-gb",
  forceLanguageCode = TRUE
)

MarkEdmondson1234 commented 4 months ago

Thanks, b) looks good since it avoids breaking changes.