Because
A new template and set of conventions now applies to the whole dataset repository, so the existing gold data no longer matches the conventions requested to make the data easier to understand. Thus, all 5 currently existing projects may need to be redone.
When redoing them, these are the two major conventions to conform to:
A. Time format - times should be displayed/stored in ISO time format. This can be achieved in a few ways: a time can be saved as `hh:mm:ss.mmm`, or saved as two integers of seconds and milliseconds to be reconverted into the more readable `hh:mm:ss.mmm`.
B. Column headers/fields - the golds data should use conventional names, such as `start` and `end` instead of `start_time` and `end_time`. The chyrons gold data is an example where this is needed. Other column fields should be investigated for similarities.
Done when
[x] Investigate all existing golds datasets for similar fields to rename into one convention.
[x] Slates - update process.py for time & field_names, regenerate golds data
[x] Chyrons - as above
[x] NamedEntity - as above
[x] NamedEntityWiki - as above
[x] Transcript - as above
[ ] Inform downstream tasks/teams where needed.
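As a rough illustration of what the `process.py` updates listed above might involve, here is a minimal Python sketch of the two conventions. It is not taken from any of the existing scripts: the function names (`to_iso_time`, `from_iso_time`, `rename_fields`) and the `FIELD_RENAMES` mapping are hypothetical, and the only renames it assumes are the `start_time`/`end_time` ones named in convention B.

```python
# Hypothetical mapping from old column names to the conventional ones.
# Only the two renames named in convention B are assumed here.
FIELD_RENAMES = {"start_time": "start", "end_time": "end"}


def to_iso_time(seconds, milliseconds=0):
    """Format a (seconds, milliseconds) pair as hh:mm:ss.mmm."""
    total_ms = seconds * 1000 + milliseconds
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def from_iso_time(stamp):
    """Parse hh:mm:ss.mmm back into a (seconds, milliseconds) pair."""
    hms, _, frac = stamp.partition(".")
    hours, minutes, secs = (int(part) for part in hms.split(":"))
    # Pad or truncate the fractional part to exactly three digits.
    ms = int(frac.ljust(3, "0")[:3]) if frac else 0
    return hours * 3600 + minutes * 60 + secs, ms


def rename_fields(row):
    """Return a copy of a row with old field names replaced by conventional ones."""
    return {FIELD_RENAMES.get(key, key): value for key, value in row.items()}


# Example with made-up values: one row using the old field names.
old_row = {"start_time": to_iso_time(3723, 45), "end_time": to_iso_time(3725), "text": "example"}
print(rename_fields(old_row))
print(from_iso_time("01:02:03.045"))
```

Storing the formatted string keeps the gold files readable as they are, while the two-integer form needs a conversion step like `to_iso_time` before display. Whichever is chosen, using it consistently across all five projects is what matters.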