OpenPecha / stt-split-audio

MIT License

STT0049: stats of data acquired and trashed from audio split for each department. #19

Open gangagyatso4364 opened 3 months ago

gangagyatso4364 commented 3 months ago

Description

We need to find out how much data we are losing in the split audio function for each department. This will help us understand how many hours of data we are actually transcribing compared with the original audio duration that we have. If the loss is too big, we might need to update the split audio function.
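A minimal sketch of how the per-department loss could be computed, assuming we can export one row per original audio file with its department, its original duration, and the total duration of the segments kept after splitting. The function name, the record layout, and the department names below are placeholders, not something that exists in the repo.

```python
from collections import defaultdict


def split_loss_by_department(records):
    """Summarise split-audio loss per department.

    records: iterable of (department, original_seconds, kept_seconds),
    one tuple per original audio file (assumed layout).
    """
    # department -> [total original seconds, total kept seconds]
    totals = defaultdict(lambda: [0.0, 0.0])
    for department, original_seconds, kept_seconds in records:
        totals[department][0] += original_seconds
        totals[department][1] += kept_seconds

    stats = {}
    for department, (original, kept) in totals.items():
        lost = original - kept
        stats[department] = {
            "original_hours": original / 3600,
            "kept_hours": kept / 3600,
            "lost_hours": lost / 3600,
            # Guard against departments with no audio at all.
            "lost_percent": 100 * lost / original if original else 0.0,
        }
    return stats


# Made-up durations, only to show the shape of the input and output.
records = [
    ("dept_a", 3600.0, 2700.0),
    ("dept_a", 1800.0, 1800.0),
    ("dept_b", 7200.0, 3600.0),
]
stats = split_loss_by_department(records)
for department, row in sorted(stats.items()):
    print(department, row)
```

The `lost_percent` column is what would tell us whether the loss is big enough to justify changing the split audio function.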

Here is the Google Sheet of daily uploads to stt pecha tools: stt data upload stat. The split audio stats are being updated in sheet 3.

Completion Criteria

Stats showing the data loss during the split audio function for each department.

Implementation

Image

Subtask

Output

Image

gangagyatso4364 commented 3 months ago

There are inconsistencies in the catalog audio IDs because the ML engineer and the team lead of the audio annotators are using different catalogs. PC and AB
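One way to see how far the catalogs have drifted apart is to compare their audio IDs directly. This is only a sketch: it assumes each catalog can be exported as a plain list of audio ID strings, and the function name and sample IDs are made up for illustration.

```python
def compare_catalog_ids(catalog_a_ids, catalog_b_ids):
    """Return the audio IDs that appear in only one catalog, and those in both."""
    a, b = set(catalog_a_ids), set(catalog_b_ids)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "in_both": sorted(a & b),
    }


# Hypothetical IDs, not taken from the real catalogs.
ml_engineer_ids = ["audio_0001", "audio_0002", "audio_0003"]
team_lead_ids = ["audio_0002", "audio_0003", "audio_0004"]

diff = compare_catalog_ids(ml_engineer_ids, team_lead_ids)
print(diff)
```

IDs that show up under `only_in_a` or `only_in_b` are the ones to reconcile before the per-department stats can be trusted.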