OpenPecha / stt-combine-datasets

MIT License

STT0008: Recreate the benchmark with more metadata from catalogs and more QC data (4) #1

Open spsither opened 4 months ago

spsither commented 4 months ago

Description

Since the benchmark dataset was last created, we have collected more QC data and added a few more departments to STT. Using the additional metadata from the catalogs, we should create a new benchmark dataset.

Completion Criteria

Post the new benchmark dataset on Hugging Face (HF) and make sure it has an even distribution across departments and across categories within departments.
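
One way to aim for an even distribution across departments, and across categories inside each department, is to give every department the same quota and fill it round-robin over that department's categories. The sketch below is only an illustration under assumptions: the field names `department` and `category`, the function names `balanced_sample` and `distribution`, and the `per_department` quota are hypothetical and are not taken from the existing pipeline or the catalogs.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(rows, per_department, seed=0):
    """Pick up to `per_department` rows from each department, spread as
    evenly as possible over that department's categories.

    `rows` is a list of dicts with hypothetical keys "department" and
    "category"; adjust these to the real catalog column names.
    """
    rng = random.Random(seed)

    # Group rows by department, then by category.
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        groups[row["department"]][row["category"]].append(row)

    selected = []
    for dept in sorted(groups):
        # Shuffle each category pool once so picks are random but reproducible.
        pools = {}
        for cat in sorted(groups[dept]):
            pool = list(groups[dept][cat])
            rng.shuffle(pool)
            pools[cat] = pool

        # Round-robin over categories until the quota is met or data runs out.
        taken = 0
        while taken < per_department and any(pools.values()):
            for cat in sorted(pools):
                if taken >= per_department:
                    break
                if pools[cat]:
                    selected.append(pools[cat].pop())
                    taken += 1
    return selected


def distribution(rows):
    """Count selected rows per (department, category) pair for a quick check."""
    return Counter((r["department"], r["category"]) for r in rows)
```

Calling `distribution(balanced_sample(rows, per_department=500))` gives a quick view of how balanced the result is. Small categories are used up first and the remainder of the quota goes to the larger ones, so a department with too little QC data will still end up under its quota; that case would need to be reported rather than silently accepted.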


Implementation Plan

Subtasks