Open kasnerz opened 3 days ago
Plan for the new structure:
- `/campaigns`: subdirectories for all the campaigns (unifying `annotations` and `generations`)
- `/data/inputs`: datasets, i.e. what is currently under `/data`
- `/data/outputs`: model outputs, i.e. what is currently under `outputs`
- `/datasets`: what is now called `loaders` (if we don't run into clashes with the Hugging Face `datasets` package)

Moreover, we will simplify the directory structure for model outputs. By default, the directory will most likely be divided into `<dataset>/<split>` subdirectories, but what counts is the `(data, split, setup_id, example_idx)` tuple in the JSONL record.
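
To illustrate why the directory layout becomes secondary, here is a minimal sketch of a loader that ignores where a JSONL file sits and indexes records purely by their identifying tuple. The function name `index_outputs` and the field names in `KEY_FIELDS` are assumptions for illustration (in particular, `dataset` is a guess at the name of the first tuple element); the actual record schema may differ.

```python
import json
from pathlib import Path

# Assumed field names for the identifying tuple; adjust to the real schema.
KEY_FIELDS = ("dataset", "split", "setup_id", "example_idx")


def index_outputs(root):
    """Index every JSONL record found under `root` by its identifying tuple.

    The position of a file in the directory tree is never consulted:
    only the fields stored in each record matter.
    """
    index = {}
    for path in sorted(Path(root).rglob("*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines
                record = json.loads(line)
                key = tuple(record[field] for field in KEY_FIELDS)
                index[key] = record  # a later duplicate overwrites an earlier one
    return index
```

With this approach, files could be placed under `<dataset>/<split>` or in a single flat directory and the resulting index would be the same.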
Current ideas:
- move `datasets.yml` from `/config` to `/data` so that it's closely tied to the local datasets
- move `/outputs` to `/data/outputs`
- move `/data` to `/data/inputs`
- (… `/<dataset_id>/<split>/<setup_id>/files`)
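
The relocations above could be scripted roughly as follows. This is only a sketch under assumptions: the function name `migrate` is hypothetical, paths are taken relative to a project root passed in by the caller, and it is meant to be run once on a copy of the tree. Note that `/data` cannot be moved into its own subdirectory in one step, so its current children are moved one by one.

```python
import shutil
from pathlib import Path


def migrate(root):
    """Apply the proposed moves below `root` (hypothetical one-off helper)."""
    root = Path(root)
    data = root / "data"
    inputs = data / "inputs"

    # 1. /data -> /data/inputs: a directory cannot be moved into itself,
    #    so collect its current children first and move them individually.
    children = list(data.iterdir()) if data.is_dir() else []
    inputs.mkdir(parents=True, exist_ok=True)
    for child in children:
        if child.name not in ("inputs", "outputs"):
            shutil.move(str(child), str(inputs / child.name))

    # 2. /outputs -> /data/outputs
    outputs = root / "outputs"
    if outputs.is_dir():
        shutil.move(str(outputs), str(data / "outputs"))

    # 3. /config/datasets.yml -> /data/datasets.yml
    cfg = root / "config" / "datasets.yml"
    if cfg.is_file():
        shutil.move(str(cfg), str(data / "datasets.yml"))
```

The order matters: the existing datasets are moved into `data/inputs` before `outputs` and `datasets.yml` arrive in `data`, so the newly moved items are not mistaken for datasets.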