Tfkalk / MultispeakerTranscription

Script to do multispeaker transcription with immediate transcript deletion

Batch Processing Attempts to Transcribe Non-Audio Files #7

Open Tfkalk opened 1 month ago

Tfkalk commented 1 month ago

Currently, batch processing attempts to process every file in the given directory, including hidden ones (only macOS's .DS_Store is deny-listed at the moment). Attempting to process a non-audio file causes the transcription to error out. We can make this more robust by checking the file type beforehand.

Accepted file types, per FAQ:
.3ga, .8svx, .aac, .ac3, .aif, .aiff, .alac, .amr, .ape, .au, .dss,
.flac, .flv, .m4a, .m4b, .m4p, .m4r, .mp2, .mp3, .mpga,
.ogg, .oga, .mogg, .opus, .qcp, .tta, .voc, .wav, .wma, .wv,
.webm, .mts, .m2ts, .ts, .mov, .mp4, .m4p (with DRM), .m4v, .mxf

anirudh1117 commented 1 month ago

Hi @Tfkalk, can I start working on this? I have read the scripts and found:

if file != ".DS_Store": transcribe_file(path, file)

A check for the accepted file types should be added here.
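For illustration, a minimal sketch of what such a check might look like. It assumes `transcribe_file(path, file)` keeps the signature shown in the snippet above; `ACCEPTED_EXTENSIONS`, `is_accepted_file`, and `transcribe_directory` are hypothetical names, not taken from the repository, and the extension set is copied from the list in this issue rather than from the FAQ itself.

```python
import os

# Extensions copied from the accepted-file-type list in this issue.
ACCEPTED_EXTENSIONS = frozenset({
    ".3ga", ".8svx", ".aac", ".ac3", ".aif", ".aiff", ".alac", ".amr",
    ".ape", ".au", ".dss", ".flac", ".flv", ".m2ts", ".m4a", ".m4b",
    ".m4p", ".m4r", ".m4v", ".mogg", ".mov", ".mp2", ".mp3", ".mp4",
    ".mpga", ".mts", ".mxf", ".oga", ".ogg", ".opus", ".qcp", ".ts",
    ".tta", ".voc", ".wav", ".webm", ".wma", ".wv",
})


def is_accepted_file(path, file):
    """Return True for a regular, non-hidden file with an accepted extension."""
    if file.startswith("."):
        # Covers .DS_Store and any other hidden file.
        return False
    if not os.path.isfile(os.path.join(path, file)):
        # Skip sub-directories, even if their name ends in e.g. ".mp3".
        return False
    _, ext = os.path.splitext(file)
    # Compare case-insensitively so "TALK.MP3" is accepted too.
    return ext.lower() in ACCEPTED_EXTENSIONS


def transcribe_directory(path, transcribe_file):
    """Call transcribe_file(path, file) on accepted files; return skipped names."""
    skipped = []
    for file in sorted(os.listdir(path)):
        if is_accepted_file(path, file):
            transcribe_file(path, file)
        else:
            skipped.append(file)
    return skipped
```

Returning the skipped names lets the batch run report what it ignored instead of failing partway through. An extension check only looks at the file name, so a mislabeled file could still cause the transcription to error out.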