bwagner / AMT-Transcripts

The Transcription project for the Art + Music + Technology podcast

How to cope with shifts in audio #13

Open bwagner opened 4 years ago

bwagner commented 4 years ago

When using an offset of 27.664 for episode 303 (Scott Morgan (Loscil)):

./tr-parse.js audio transcript-0303.json -s 'Darwin' 'Scott Morgan (Loscil)' -r 'November 24, 2019' -o 27.664

At the beginning, text and audio are perfectly in sync, but the audio gradually slips ahead. For example, the line

Darwin: I mean, is there something that you personally identify as the thing that makes your music you?

should start at 410.48 according to the JSON, but it actually starts at around 409.46.
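To put a number on the slip, here is a minimal sketch that estimates the drift from the two figures above and rescales a timestamp accordingly. It assumes that both times are on the same playback timeline, that sync is exact at the 27.664 offset, and that the drift grows linearly in between; a single data point cannot confirm any of that. `makeDriftCorrector` is a hypothetical helper and not part of tr-parse.js.

```javascript
// Hypothetical helper, not part of tr-parse.js.
// Assumption: text and audio are exactly in sync at `syncPoint`, and the
// drift grows linearly from there (one measurement cannot confirm this).
function makeDriftCorrector(syncPoint, expected, observed) {
  if (expected === syncPoint) {
    throw new RangeError('expected must differ from syncPoint');
  }
  // Ratio of observed elapsed time to expected elapsed time since sync.
  const scale = (observed - syncPoint) / (expected - syncPoint);
  // Map a transcript timestamp to the estimated audio position.
  return (t) => syncPoint + (t - syncPoint) * scale;
}

// Figures from this issue: offset 27.664, expected 410.48, heard at 409.46.
const correct = makeDriftCorrector(27.664, 410.48, 409.46);

const driftSeconds = 410.48 - 409.46;
const driftPercent = (driftSeconds / (410.48 - 27.664)) * 100;

console.log(driftSeconds.toFixed(2));    // 1.02
console.log(driftPercent.toFixed(2));    // 0.27
console.log(correct(410.48).toFixed(2)); // 409.46
```

Measuring a second line much later in the episode would show whether the slip is actually linear or just jitter in playback position.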

I don't understand why this is happening, since the text stays perfectly in sync for, e.g.:

./tr-parse.js audio transcript-0005.json -s 'Darwin' 'Barry Moon' -r 'November 10, 2013' -o 6.1

darwingrosse commented 4 years ago

I think this is just some variability in playback positioning, especially when using MP3 files. For the next transcription, I'm going to try using WAV files for both generation and playback to see if that makes a difference.