aaprasad opened this issue 4 months ago
Using sleap-io (docs):
```python
import sleap_io as sio

# Load source labels.
labels = sio.load_file("labels.v001.slp")

# Make splits and export with embedded images.
labels.make_training_splits(n_train=0.8, n_val=0.1, n_test=0.1, save_dir="split1", seed=42)

# Splits will be saved as self-contained SLP package files with images and labels.
labels_train = sio.load_file("split1/train.pkg.slp")
labels_val = sio.load_file("split1/val.pkg.slp")
labels_test = sio.load_file("split1/test.pkg.slp")
```
Caveats:
One implementation of a higher-order data loader would create a set of contiguous sub-clips/segments (possibly with a tolerance for short gaps). Basically, we want to loop over all labeled frames within `Labels` and find connected components of frames that are consecutive in time (optionally tolerating gaps of a few frames), belong to the same video, and have instances (see the sketch below). The data loader could then break long clips into sub-samples, randomize across them, and natively handle both multi-video training (#70) and train/val/test splitting.
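A rough sketch of the connected-components idea, assuming sleap-io's `Labels`/`LabeledFrame` attributes (`videos`, `labeled_frames`, `video`, `frame_idx`, `instances`); `find_segments` and `gap_tolerance` are hypothetical names, not part of sleap-io or dreem:

```python
import sleap_io as sio


def find_segments(labels: sio.Labels, gap_tolerance: int = 0):
    """Group labeled frames that have instances into contiguous per-video segments.

    Returns a list of (video, start_frame, end_frame) tuples (inclusive), where
    consecutive labeled frames separated by at most `gap_tolerance` unlabeled
    frames are merged into one segment.
    """
    segments = []
    for video in labels.videos:
        # Frame indices in this video that have at least one instance.
        frame_idxs = sorted(
            {
                lf.frame_idx
                for lf in labels.labeled_frames
                if lf.video is video and len(lf.instances) > 0
            }
        )
        if not frame_idxs:
            continue

        # Walk the sorted indices and close a segment whenever the gap is too large.
        start = prev = frame_idxs[0]
        for idx in frame_idxs[1:]:
            if idx - prev > gap_tolerance + 1:
                segments.append((video, start, prev))
                start = idx
            prev = idx
        segments.append((video, start, prev))

    return segments
```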
Right now we require users to specify the training and validation videos. It would be nice to just specify a pool of videos and have `dreem-train` automatically divide the chunks into training and validation, as sketched below.
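A minimal sketch of how such segments could be chunked and randomly assigned to splits; `split_segments`, `chunk_size`, and `val_fraction` are hypothetical and not existing dreem-train options:

```python
import random


def split_segments(segments, chunk_size: int = 64, val_fraction: float = 0.1, seed: int = 42):
    """Break (video, start, end) segments into fixed-length chunks and shuffle into train/val."""
    # Split each contiguous segment into chunks of at most `chunk_size` frames.
    chunks = []
    for video, start, end in segments:
        for chunk_start in range(start, end + 1, chunk_size):
            chunk_end = min(chunk_start + chunk_size - 1, end)
            chunks.append((video, chunk_start, chunk_end))

    # Shuffle deterministically and carve off a validation fraction.
    rng = random.Random(seed)
    rng.shuffle(chunks)
    n_val = max(1, int(len(chunks) * val_fraction))
    return chunks[n_val:], chunks[:n_val]  # (train_chunks, val_chunks)


# Example usage (assuming the hypothetical find_segments from above):
# train_chunks, val_chunks = split_segments(find_segments(labels, gap_tolerance=2))
```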