OpenPecha / stt-catalog-merger

MIT License

STT0003: Parse metadata from Catalog (5) #6

Open spsither opened 6 months ago

spsither commented 6 months ago

Description

The metadata for all the departments exists in various formats in their catalog Google Sheets. We need scripts to parse those sheets and extract all the metadata associated with the audio files.

Completion Criteria

The metadata is collected in a TSV/CSV file, ready to be consumed when evaluating the model.


Implementation Plan

Image

Subtasks

gangagyatso4364 commented 6 months ago

Additional subtask: create a metadata dictionary listing, for each department's Google Sheet, the columns to be added to the main CSV file.
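A minimal sketch of what such a metadata dictionary could look like, assuming each department sheet is exported as CSV. The department keys (`dept_a`, `dept_b`) and all column names are hypothetical placeholders, not the actual catalog columns.

```python
import csv
import io

# Hypothetical mapping: department -> {sheet column name: main CSV column name}.
# Department keys and column names are invented for illustration only.
METADATA_COLUMNS = {
    "dept_a": {"File Name": "file_name", "Speaker": "speaker_name", "Gender": "speaker_gender"},
    "dept_b": {"ID": "file_name", "Speaker Name": "speaker_name", "Sex": "speaker_gender"},
}


def normalize_rows(department, rows):
    """Rename each row's columns to the main CSV names; unmapped columns are left out."""
    mapping = METADATA_COLUMNS[department]
    return [
        {target: row.get(source, "") for source, target in mapping.items()}
        for row in rows
    ]


# Stand-in for a department sheet exported as CSV.
sample_sheet = io.StringIO("ID,Speaker Name,Sex,Notes\nB0001,Tenzin,M,checked\n")
normalized = normalize_rows("dept_b", list(csv.DictReader(sample_sheet)))
print(normalized)
```

Keeping the mapping in one dictionary means a new department only needs a new entry, not a new parsing script.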

gangagyatso4364 commented 6 months ago

I have to reformat the Google Sheets by deleting unnecessary columns that do not add any useful information.
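One way this cleanup could be scripted instead of done by hand is sketched below. It assumes "unnecessary" means columns that are blank in every row, plus any columns named explicitly. That criterion, the helper name `drop_uninformative_columns`, and the sample column names are all assumptions for illustration.

```python
def drop_uninformative_columns(rows, extra_drop=()):
    """Remove columns that are blank in every row, plus any named in extra_drop."""
    if not rows:
        return []
    keep = [
        col
        for col in rows[0]
        if col not in extra_drop
        and any((row.get(col) or "").strip() for row in rows)
    ]
    return [{col: row.get(col, "") for col in keep} for row in rows]


# Hypothetical rows as they might come out of csv.DictReader.
rows = [
    {"file_name": "A_0001", "speaker": "Dolma", "remarks": "", "checked_by": "x"},
    {"file_name": "A_0002", "speaker": "Pema", "remarks": " ", "checked_by": "y"},
]
cleaned = drop_uninformative_columns(rows, extra_drop=("checked_by",))
print(cleaned)
```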

gangagyatso4364 commented 6 months ago

Split the department CSV file into separate parts if the file name ID format differs.
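A rough sketch of the split, assuming the ID formats can be told apart with regular expressions. The two patterns, the `file_name` column, and the `unmatched` bucket are hypothetical. The real patterns depend on each department's catalog.

```python
import re
from collections import defaultdict

# Hypothetical ID formats; replace with the patterns actually found in the catalogs.
ID_FORMATS = {
    "prefixed": re.compile(r"^[A-Z]{2}\d{4}$"),       # e.g. AB0001
    "underscored": re.compile(r"^[A-Z]+_\d+_\d+$"),  # e.g. STT_12_0003
}


def split_by_id_format(rows, id_column="file_name"):
    """Group rows by the first ID pattern that matches; others go to 'unmatched'."""
    groups = defaultdict(list)
    for row in rows:
        file_id = row.get(id_column, "")
        label = next(
            (name for name, pattern in ID_FORMATS.items() if pattern.match(file_id)),
            "unmatched",
        )
        groups[label].append(row)
    return dict(groups)


rows = [
    {"file_name": "AB0001"},
    {"file_name": "STT_12_0003"},
    {"file_name": "weird id"},
]
parts = split_by_id_format(rows)
print({label: len(group) for label, group in parts.items()})
```

Each group can then be written to its own CSV file, and the `unmatched` group shows which IDs still need a pattern.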