ContextLab / quail

A Python toolbox for analyzing and plotting free recall data
http://cdl-quail.readthedocs.io/en/latest/
MIT License

Decode long #115

Closed paxtonfitzpatrick closed 5 years ago

paxtonfitzpatrick commented 5 years ago

Fix for issue #113.

paxtonfitzpatrick commented 5 years ago

verified working:

import quail
transcription = quail.decode_speech('/Users/paxtonfitzpatrick/Desktop/example/jin-1542408163.706.wav', save=True, sample_rate=48000, return_raw=True)
Decoding file 1 of 1
Audio clip is longer than 1 minute.  Splitting into 2 one minute segments...
Transcript: Sistine Chapel Sydney Opera House
Confidence: 0.9402961730957031
Word: Sistine, start_time: 1.4, end_time: 2.2
Word: Chapel, start_time: 2.2, end_time: 2.3
Word: Sydney, start_time: 2.3, end_time: 3.9
Word: Opera, start_time: 3.9, end_time: 4.3
Word: House, start_time: 4.3, end_time: 4.7
Transcript:  Big Ben Morgan Freeman
Confidence: 0.9603950381278992
Word: Big, start_time: 6.3, end_time: 6.8
Word: Ben, start_time: 6.8, end_time: 7.3
Word: Morgan, start_time: 7.3, end_time: 8.8
Word: Freeman, start_time: 8.8, end_time: 9.5
Transcript:  Usher Alicia Keys
Confidence: 0.9427340030670166
Word: Usher, start_time: 10.4, end_time: 11.0
Word: Alicia, start_time: 11.0, end_time: 12.2
Word: Keys, start_time: 12.2, end_time: 12.6
Transcript:  stamp highlighter
Confidence: 0.9753211140632629
Word: stamp, start_time: 14.6, end_time: 15.4
Word: highlighter, start_time: 15.4, end_time: 17.1
Transcript:  Eminem Lindsay Lohan
Confidence: 0.9725711345672607
Word: Eminem, start_time: 21.4, end_time: 22.4
Word: Lindsay, start_time: 22.4, end_time: 24.0
Word: Lohan, start_time: 24.0, end_time: 24.5
Transcript:  Keira Knightley
Confidence: 0.9709164500236511
Word: Keira, start_time: 25.6, end_time: 26.3
Word: Knightley, start_time: 26.3, end_time: 27.0
Transcript:  Ashton club culture culture
Confidence: 0.8209373354911804
Word: Ashton, start_time: 28.0, end_time: 28.8
Word: club, start_time: 28.8, end_time: 29.0
Word: culture, start_time: 29.0, end_time: 29.4
Word: culture, start_time: 29.4, end_time: 30.3
Transcript:  Mall of America
Confidence: 0.9636169672012329
Word: Mall, start_time: 32.3, end_time: 32.8
Word: of, start_time: 32.8, end_time: 32.9
Word: America, start_time: 32.9, end_time: 33.5
Transcript:  Timbuktu
Confidence: 0.8676044344902039
Word: Timbuktu, start_time: 34.6, end_time: 35.6
Transcript:  Carnegie Hall
Confidence: 0.9876291155815125
Word: Carnegie, start_time: 37.5, end_time: 38.3
Word: Hall, start_time: 38.3, end_time: 38.4
Transcript:  White House
Confidence: 0.8950126767158508
Word: White, start_time: 39.6, end_time: 40.1
Word: House, start_time: 40.1, end_time: 40.6
Transcript:  flat salts
Confidence: 0.918372631072998
Word: flat, start_time: 45.2, end_time: 45.8
Word: salts, start_time: 45.8, end_time: 46.5
Transcript:  ketchup
Confidence: 0.9183732271194458
Word: ketchup, start_time: 50.4, end_time: 51.2
Transcript: whistle
Confidence: 0.9530355930328369
Word: whistle, start_time: 3.2, end_time: 4.0
Transcript:  Snoop Dogg
Confidence: 0.9644667506217957
Word: Snoop, start_time: 14.5, end_time: 15.1
Word: Dogg, start_time: 15.1, end_time: 15.4
Finished file 1 of 1 in 19.94 seconds.
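
In the log above, the word timings restart near zero once the second one-minute segment begins (whistle at 3.2, Snoop at 14.5). Below is a minimal sketch of how those timings could be mapped back onto the full recording. It assumes each segment is exactly 60 seconds long and that a drop in start time marks a new segment. `absolute_word_times` is a hypothetical helper, not part of quail, and this thread does not show whether quail applies such an offset to the values it returns. The heuristic would miss a boundary if the first word of a segment starts later than the last word of the previous one.

```python
import re

# Assumption: every segment produced by the split is exactly one minute long.
SEGMENT_SECONDS = 60.0

# Matches the "Word: ..." lines printed in the log above.
WORD_LINE = re.compile(
    r"Word: (?P<word>.+), start_time: (?P<start>[\d.]+), end_time: (?P<end>[\d.]+)"
)


def absolute_word_times(log_lines, segment_seconds=SEGMENT_SECONDS):
    """Hypothetical helper: shift per-segment word times onto the full clip.

    A new segment is inferred whenever a word starts earlier than the
    previous word did, which is how the second segment appears in the log.
    """
    words = []
    segment = 0
    previous_start = None
    for line in log_lines:
        match = WORD_LINE.match(line.strip())
        if match is None:
            continue  # skip "Transcript:" and "Confidence:" lines
        start = float(match["start"])
        end = float(match["end"])
        if previous_start is not None and start < previous_start:
            segment += 1
        previous_start = start
        offset = segment * segment_seconds
        words.append(
            (match["word"], round(start + offset, 3), round(end + offset, 3))
        )
    return words


# Lines copied from the log above, around the segment boundary.
log = [
    "Word: ketchup, start_time: 50.4, end_time: 51.2",
    "Transcript: whistle",
    "Word: whistle, start_time: 3.2, end_time: 4.0",
    "Word: Snoop, start_time: 14.5, end_time: 15.1",
]
for word, start, end in absolute_word_times(log):
    print(word, start, end)
```

Under these assumptions, "whistle" and "Snoop" would be shifted by 60 seconds, while "ketchup" in the first segment stays unchanged.
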
andrewheusser commented 5 years ago

I think this is good to merge into master for now so people can install from GitHub. Agree that it's worth holding off on pushing to pip until the latest changes can be documented and tested.

jeremymanning commented 5 years ago

merging...