gl_speech output is incomplete #37

Closed kgarnick closed 6 years ago

kgarnick commented 6 years ago

Hi Mark,

Having a bit of a strange issue -- I can get gl_speech to run with the code below, but it seems to cut the transcript short:

library(googleLanguageR)
library(tuneR)

a <- readWave("OSR_us_000_0018_8k.wav", from = 0, to = 1, units = "minutes")
b <- mono(a)
writeWave(b, "OSR_us_000_0018_8k_new.wav", extensible = FALSE)
text <- gl_speech("OSR_us_000_0018_8k_new.wav", sampleRateHertz = b@samp.rate)

The resulting transcript is correct, but only represents the first ~6 seconds of the new wav file. I've listened to the new file, and it contains speech for at least 30 seconds. I can replicate this issue with a different wav file. Any insight? Thanks!

MarkEdmondson1234 commented 6 years ago

Could you try with the GitHub version if you haven't already? There was this issue, but it is perhaps not related.

kgarnick commented 6 years ago

Yes, I'm running the version installed via devtools::install_github("MarkEdmondson1234/googleLanguageR")

MarkEdmondson1234 commented 6 years ago

OK, my second reaction is that perhaps the audio preprocessing is doing unexpected things, so you can play the file but that's not what the API is seeing. Would you have a copy of the file I could use to debug?

I think in v0.2.0 I should include some audio preprocessing functions to help with this, as it seems like a tricky thing.
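One way to check what the preprocessing actually wrote is to look at the WAV header rather than listen to the file. The following is only a sketch: `summarise_wav_header` is a hypothetical helper, not part of googleLanguageR or tuneR, and it assumes the header list returned by `tuneR::readWave(path, header = TRUE)` has `sample.rate`, `channels`, `bits` and `samples` elements, with `samples` counted per channel.

```r
# Hypothetical helper (not part of any package): summarise a WAV header
# as returned by tuneR::readWave(path, header = TRUE).
# Assumption: header$samples is the number of samples per channel.
summarise_wav_header <- function(header) {
  list(
    sample_rate = header$sample.rate,
    channels    = header$channels,
    bits        = header$bits,
    # duration in seconds = samples per channel / samples per second
    duration_s  = header$samples / header$sample.rate
  )
}

# Usage (requires tuneR and the file on disk):
# hdr <- tuneR::readWave("OSR_us_000_0018_8k_new.wav", header = TRUE)
# str(summarise_wav_header(hdr))
```

If the reported duration, channel count and sample rate match what was intended, the preprocessing step is probably not what is truncating the transcript.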

kgarnick commented 6 years ago

Yes, it seems likely the preprocessing is the issue. I downloaded the wav from here. Thanks for the quick replies!

kgarnick commented 6 years ago

Hey Mark,

I tested this with a longer wav. I broke it into 30 second chunks and combined the transcript output -- it cuts every 30 second chunk short. I'll do as much as I can to help solve this, but any help is much appreciated!
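For reference, the chunk-and-combine approach described above could look roughly like this. It is a sketch, not the exact code that was run: `chunk_bounds` and `transcribe_in_chunks` are hypothetical names, and it assumes `gl_speech()` returns a list whose `transcript` element is a data frame with a `transcript` column, which may differ between package versions.

```r
# Hypothetical helper: start/end times (in seconds) of consecutive chunks
# covering a recording of total_s seconds.
chunk_bounds <- function(total_s, chunk_s = 30) {
  starts <- seq(0, total_s, by = chunk_s)
  starts <- starts[starts < total_s]  # drop a start that sits at the very end
  data.frame(from = starts, to = pmin(starts + chunk_s, total_s))
}

# Hypothetical wrapper: transcribe each chunk and paste the results together.
# Needs tuneR, googleLanguageR and prior authentication via gl_auth().
transcribe_in_chunks <- function(path, chunk_s = 30) {
  hdr <- tuneR::readWave(path, header = TRUE)
  bounds <- chunk_bounds(hdr$samples / hdr$sample.rate, chunk_s)
  pieces <- vapply(seq_len(nrow(bounds)), function(i) {
    chunk <- tuneR::readWave(path, from = bounds$from[i], to = bounds$to[i],
                             units = "seconds")
    # Only downmix when the audio is actually stereo
    if (chunk@stereo) chunk <- tuneR::mono(chunk, which = "both")
    tmp <- tempfile(fileext = ".wav")
    on.exit(unlink(tmp), add = TRUE)
    tuneR::writeWave(chunk, tmp, extensible = FALSE)
    res <- googleLanguageR::gl_speech(tmp, sampleRateHertz = hdr$sample.rate)
    # Assumption: res$transcript is a data frame with a transcript column
    paste(res$transcript$transcript, collapse = " ")
  }, character(1))
  paste(pieces, collapse = " ")
}

# Usage:
# chunk_bounds(75)   # three chunks: 0-30, 30-60, 60-75
# transcribe_in_chunks("OSR_us_000_0018_8k.wav")
```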

MarkEdmondson1234 commented 6 years ago

Hmm, can I see some code and the exact audio file you are using? This file (first on the list) transcribes OK.

My code:

gl_speech("OSR_us_000_0010_8k.wav", sampleRateHertz = 8000L)

Which produces:

kgarnick commented 6 years ago

Sure. Seems to stop after the first line:

library(googleLanguageR)
library(tuneR)
gl_auth("speech2text-3f89d34ff4a9.json")
wav_header <- readWave("OSR_us_000_0010_8k.wav", header = TRUE)
transcript <- gl_speech("OSR_us_000_0010_8k.wav", sampleRateHertz = wav_header$sample.rate)
transcript$transcript

[1] "the Birch canoes lid on the smooth planks"


[[1]]
  startTime endTime   word
1    0.200s  0.700s    the
2    0.700s  0.900s  Birch
3    0.900s  1.500s canoes
4    1.500s  1.900s    lid
5    1.900s      2s     on
6        2s  2.200s    the
7    2.200s  2.500s smooth
8    2.500s      3s planks


MarkEdmondson1234 commented 6 years ago

I think you are installing from the wrong GitHub repo :)


This version is


Run remotes::install_github("ropensci/googleLanguageR") and not remotes::install_github("MarkEdmondson1234/googleLanguageR") - I'll remove that one ASAP....
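To confirm which repository an installed copy came from, the package's DESCRIPTION can be inspected. This is a sketch under one assumption: that the package was installed with remotes or devtools, which record `RemoteUsername` and `RemoteRepo` fields for GitHub installs. `install_source` is a hypothetical helper name.

```r
# Hypothetical helper: report "owner/repo" for a package installed from
# GitHub via remotes/devtools, or NA if no remote metadata was recorded
# (for example, a CRAN or base package).
install_source <- function(pkg) {
  desc <- suppressWarnings(utils::packageDescription(pkg))
  if (!is.list(desc)) {
    stop("Package '", pkg, "' does not appear to be installed")
  }
  if (is.null(desc$RemoteUsername) || is.null(desc$RemoteRepo)) {
    return(NA_character_)
  }
  paste(desc$RemoteUsername, desc$RemoteRepo, sep = "/")
}

# Usage:
# install_source("googleLanguageR")
# utils::packageVersion("googleLanguageR")
```

After reinstalling from ropensci/googleLanguageR, restarting the R session before re-running the transcription makes sure the new version is the one loaded.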

kgarnick commented 6 years ago

Perfect! I can't thank you enough for your help and your work on this awesome package.