clamsproject / aapb-annotations

Repository to store manual annotation dataset developed for CLAMS-AAPB collaboration

reformat chyron gold files #20

Closed keighrim closed 11 months ago

keighrim commented 1 year ago

Because

The newshour-chyron project's process.py file from https://github.com/clamsproject/aapb-annotations/commit/190987973246f8b043576b7449624720b24f7e85 generates the gold data as a single CSV file. Since that commit we have changed how this repository is structured, so the script needs to be completely redone. Specifically, as stated in the repository README file, we want one gold file per media item. That is, process.py needs to read all the tabular files in the YYMMDD-batchname directories (currently there is only one, annotations/220701-batch2) and generate one file per GUID.

The data format of the future gold files is still open for discussion.
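
As a starting point for that discussion, here is a minimal sketch of the one-file-per-GUID split. It is not the existing process.py. It assumes the tabular files are CSVs with a header row and a column holding the GUID; the column name `guid`, the `golds` output directory, the function name `split_by_guid`, and the choice of CSV as the output format are all placeholders.

```python
import csv
import re
from collections import defaultdict
from pathlib import Path

# Directory names like "220701-batch2" (YYMMDD-batchname).
BATCH_DIR_PATTERN = re.compile(r"^\d{6}-.+$")


def split_by_guid(annotations_dir, golds_dir, guid_column="guid"):
    """Read every CSV in the batch directories and write one CSV per GUID.

    `guid_column` and the CSV output format are assumptions, not settled
    conventions of this repository.
    """
    annotations_dir = Path(annotations_dir)
    golds_dir = Path(golds_dir)

    rows_by_guid = defaultdict(list)
    fieldnames = []  # union of all column names, in first-seen order

    for batch_dir in sorted(annotations_dir.iterdir()):
        # Skip anything that is not a YYMMDD-batchname directory.
        if not (batch_dir.is_dir() and BATCH_DIR_PATTERN.match(batch_dir.name)):
            continue
        for table in sorted(batch_dir.glob("*.csv")):
            with table.open(newline="", encoding="utf-8") as f:
                reader = csv.DictReader(f)
                for name in reader.fieldnames or []:
                    if name not in fieldnames:
                        fieldnames.append(name)
                for row in reader:
                    rows_by_guid[row[guid_column]].append(row)

    golds_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for guid, rows in sorted(rows_by_guid.items()):
        out_path = golds_dir / f"{guid}.csv"
        with out_path.open("w", newline="", encoding="utf-8") as f:
            # restval fills columns that a given batch file did not have.
            writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
            writer.writeheader()
            writer.writerows(rows)
        written.append(out_path)
    return written


if __name__ == "__main__":
    for path in split_by_guid("annotations", "golds"):
        print(path)
```

Grouping rows in memory before writing keeps the sketch simple and lets annotations for the same GUID from several batches end up in one file. Whether later batches should instead override earlier ones is a separate decision.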

Done when

Additional context

No response