ngtban / wavenet_de_data_prep

GNU Affero General Public License v3.0

Label audio clips: speaker, transcription, etc. #4

Closed ngtban closed 2 years ago

ngtban commented 3 years ago

See #2

Checklist for reviewing the extracted and labelled data:

I need items 2, 3, and 4 to hold, because I plan to simply ignore audio clips without a transcription in the task that generates data for ESP. If any one of these criteria is not met, ignoring audio clips without a transcription means missing valid data.

I realized that writing a script for items 2 and 3 is not necessary, as ensuring item 4 is enough for the current implementation. I will need to add a script for item 3 if I stop ignoring audio clips without a corresponding dialogue entry, which is what I do right now. For item 2, I would need to actually implement graph navigation based on the code fragments in dialogue entries. Moreover, none of those dialogue entries currently has any text.
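
For illustration, the "ignore audio clips without a transcription" step could look roughly like the sketch below. This is not the repository's code: the `Clip` record, its `path` and `transcription` fields, and the `split_by_transcription` helper are hypothetical names, and it assumes each clip has already been matched (or not) to a transcription string. It returns the skipped clips as well, so they can be reviewed against the checklist instead of being dropped silently.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Clip(NamedTuple):
    """Hypothetical record for one extracted audio clip."""
    path: str
    transcription: Optional[str]  # None when no dialogue entry was matched


def split_by_transcription(clips: Iterable[Clip]) -> Tuple[List[Clip], List[Clip]]:
    """Separate clips that have usable text from clips that do not.

    A transcription that is None, empty, or whitespace-only counts as missing.
    """
    kept: List[Clip] = []
    skipped: List[Clip] = []
    for clip in clips:
        if clip.transcription and clip.transcription.strip():
            kept.append(clip)
        else:
            skipped.append(clip)
    return kept, skipped


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    sample = [
        Clip("a.wav", "Hallo"),
        Clip("b.wav", None),
        Clip("c.wav", "   "),
    ]
    kept, skipped = split_by_transcription(sample)
    print("kept:", [c.path for c in kept])
    print("skipped:", [c.path for c in skipped])
```

Keeping the skipped list around makes it cheap to spot-check whether any of those clips are in fact valid data.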