IndoNLP / nusa-crowd

A collaborative project to collect datasets in Indonesian languages.
Apache License 2.0
261 stars, 61 forks

CoVoST2: Fix error in DatasetInfo task_template #288

Closed. jensan-1 closed this 1 year ago

jensan-1 commented 1 year ago

In response to https://github.com/IndoNLP/nusa-crowd/pull/233#issuecomment-1253813035 on CoVoST2 (merged to master), I removed the line that caused the error, which was triggered by changes in the datasets library version, and test-ran the fix on my local branch.

@SamuelCahyawijaya @bryanwilie @holylovenia, please check the PR.

jensan-1 commented 1 year ago

@holylovenia I am trying to replace audio_file_path_column with audio_column, as mentioned in the comment above. However, one schema (t2t) cannot use this task_templates entry because it has no audio column. Would it be better to remove the nusantara_t2t schema (which is derivable from the dataset but is not the main task), or to leave task_templates out?

holylovenia commented 1 year ago

@jen-santoso Oh, I see now. In that case, could the DatasetInfo be returned from inside the if-else branches?
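A minimal sketch of that suggestion, for illustration only: each schema branch returns its own info object, and only the branches with an audio column attach a task template. To keep the example runnable without the datasets library, AsrTemplate and Info are hypothetical stand-ins for the library's speech recognition task template and DatasetInfo. The function name build_info, the schema name nusantara_sptext, and all column names are assumptions; only nusantara_t2t and audio_column come from the discussion above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AsrTemplate:
    # Hypothetical stand-in for the library's speech recognition task template.
    audio_column: str
    transcription_column: str


@dataclass
class Info:
    # Hypothetical stand-in for DatasetInfo, reduced to the fields used here.
    features: Dict[str, str]
    task_templates: Optional[List[AsrTemplate]] = None


def build_info(schema: str) -> Info:
    """Return the info object from inside the branch that matches the schema."""
    if schema in ("source", "nusantara_sptext"):  # assumed speech schema names
        # Speech schemas have an audio column, so the template can be attached.
        return Info(
            features={"id": "string", "audio": "audio", "text": "string"},
            task_templates=[
                AsrTemplate(audio_column="audio", transcription_column="text")
            ],
        )
    elif schema == "nusantara_t2t":
        # Text-to-text schema: no audio column, so no task template is set.
        return Info(
            features={"id": "string", "text_1": "string", "text_2": "string"}
        )
    raise ValueError(f"Unknown schema: {schema}")
```

With this layout the t2t schema can stay in the loader, since the template is simply omitted in the branch where it does not apply.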

jensan-1 commented 1 year ago

/test dataset=covost2 subset_id=covost2_ind_eng

github-actions[bot] commented 1 year ago

Run result

Check test log here: https://github.com/IndoNLP/nusa-crowd/actions/runs/3107658091