ContextLab / quail

A python toolbox for analyzing and plotting free recall data
http://cdl-quail.readthedocs.io/en/latest/
MIT License

output text file is blank when decoding file longer than 1 minute and `save=True` #117

Open paxtonfitzpatrick opened 5 years ago

paxtonfitzpatrick commented 5 years ago

In `decode_speech.py`, line 242, `pd.DataFrame(parsed_results).to_csv(f + '.txt', header=False, index=False)`, saves out a blank text file.

It looks like the results from Google Cloud Speech-to-Text are being parsed properly. Lines 154-157

for result in chunk.results:
    alternative = result.alternatives[0]
    print('Transcript: {}'.format(alternative.transcript))
    print('Confidence: {}'.format(alternative.confidence))

print the transcripts and confidences properly, and line 239, `pickle.dump(results, open(f + ".p", "wb" ) )`, saves out the pickled results object.

This could be an issue with `pd.DataFrame.to_csv()` inferring headers (`header=False` in line 242), or it could be a suppressed error in line 229, `parsed_results = parse_response(results)`, when `enable_word_time_offsets=False`. Will look into this later.
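
One quick way to tell the two possibilities apart is to check what `to_csv` does with empty versus non-empty input. This is only a sketch: it assumes `parsed_results` is a list of row tuples (the real return value of `parse_response` may be shaped differently), and the sample rows are made up.

```python
import pandas as pd

# Hypothetical stand-ins for parsed_results; the real structure may differ.
non_empty_rows = [("HELLO", 0.0, 0.4), ("WORLD", 0.4, 0.9)]
empty_rows = []

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_non_empty = pd.DataFrame(non_empty_rows).to_csv(header=False, index=False)
csv_empty = pd.DataFrame(empty_rows).to_csv(header=False, index=False)

print(repr(csv_non_empty))
print(repr(csv_empty))
```

If the first string has one line per row and the second is empty, `header=False` is probably not the cause, and a blank file would point at `parsed_results` being empty by the time line 242 runs.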

Code:

import os

import quail

audiodir = os.path.abspath('../../data/audio/')
keypath = os.path.abspath('../../../google-credentials/cloud-speech-credentials.json')

for sid, data in rand5_turkids.items():
    print(sid + ':')
    for ses, turkid in data.items():

        if ses == 'session 1':
            audiofiles = [turkid+'-recall.wav', turkid+'-prediction.wav']
            ep_context = ep1_speech_context
        elif ses == 'session 2':
            audiofiles = [turkid+'-delayed.wav', turkid+'-recall.wav']
            if 'A' in sid:
                ep_context = ep2_speech_context
            elif 'B' in sid:
                ep_context = ep3_speech_context
            else:
                print('bad SID: ' + sid)
                continue
        else:
            print('session ID unrecognized for ' + sid + ', ' + ses)
            continue

        for audiofile in audiofiles:

            path = os.path.join(audiodir, 'room1', turkid, audiofile)
            if not os.path.isfile(path):
                path = os.path.join(audiodir, 'room2', turkid, audiofile)
                if not os.path.isfile(path):
                    print('AUDIO FILE NOT FOUND: ' + path)
                    continue

            print('\tdecoding ' + ses + ': ' + audiofile + ' ...')
            quail.decode_speech(path=path, keypath=keypath, save=True, speech_context=ep_context, enable_word_time_offsets=False)
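
On the `enable_word_time_offsets=False` possibility: as far as I understand the Speech-to-Text API, per-word entries in `alternative.words` are only returned when word time offsets are enabled. The sketch below uses `SimpleNamespace` stand-ins rather than real API objects, and `parse_words` is a hypothetical word-level parser, not quail's actual `parse_response`. It shows how a parser that loops over `alternative.words` could return nothing even though the transcript and confidence print fine.

```python
from types import SimpleNamespace


def make_chunk(transcript, confidence, words):
    """Build a stand-in shaped like one response chunk (not a real API object)."""
    alternative = SimpleNamespace(
        transcript=transcript, confidence=confidence, words=words
    )
    result = SimpleNamespace(alternatives=[alternative])
    return SimpleNamespace(results=[result])


def parse_words(chunks):
    """Hypothetical parser: one row per entry in alternative.words."""
    rows = []
    for chunk in chunks:
        for result in chunk.results:
            alternative = result.alternatives[0]
            for word_info in alternative.words:
                rows.append(
                    (word_info.word.upper(), word_info.start, word_info.end)
                )
    return rows


# Same transcript, with and without word-level entries (made-up values).
with_offsets = [make_chunk("hello world", 0.93, [
    SimpleNamespace(word="hello", start=0.0, end=0.4),
    SimpleNamespace(word="world", start=0.4, end=0.9),
])]
without_offsets = [make_chunk("hello world", 0.93, [])]

print(parse_words(with_offsets))
print(parse_words(without_offsets))
```

If `parse_response` works along these lines, falling back to `alternative.transcript` when `words` is empty would be one way to avoid writing a blank file.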