Support different crop sizes based on dataset provided in config

Currently, we can only specify a single crop size, which sets the size of the bounding box cropped around each instance. For mixed-dataset training, however, each dataset can have a different optimal crop size depending on its imaging resolution. This issue proposes allowing a different crop size for each dataset specified in the train_dir argument in the config.

Current (params.yaml):

dataset:
  train_dataset:
    dir:
      path: [
        "/mustafa/microscopy-data/source-1/train",
        "/mustafa/microscopy-data/source-2/train",
        "/mustafa/microscopy-data/source-3/train"
      ]
      labels_suffix: ".slp"
      vid_suffix: ".mp4"
    clip_length: 32
    crop_size: 32

Proposed (params.yaml):

Requires changes to:

datasets/sleap_dataset.py: since each image that is processed can be traced back to its training path, the crop size associated with that path can be looked up for it.
Type hints wherever crop_size is passed in (sleap_dataset.py, base_dataset.py), so that both a single value and one value per dataset are accepted.
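
As a rough illustration of the lookup described above, here is a minimal sketch. It assumes crop_size may be given either as a single int (current behaviour) or as a list with one entry per training directory; that list form is an assumption, not the final config format. The names CropSize, resolve_crop_sizes and crop_size_for_video are hypothetical helpers and do not exist in dreem.

```python
from pathlib import Path
from typing import List, Sequence, Union

# Hypothetical type alias: a single crop size, or one crop size per dataset directory.
CropSize = Union[int, Sequence[int]]


def resolve_crop_sizes(paths: Sequence[str], crop_size: CropSize) -> List[int]:
    """Return one crop size per training directory.

    A single int is broadcast to every directory, which keeps existing
    single-value configs working unchanged.
    """
    if isinstance(crop_size, int):
        return [crop_size] * len(paths)
    sizes = list(crop_size)
    if len(sizes) != len(paths):
        raise ValueError(
            f"Got {len(sizes)} crop sizes for {len(paths)} dataset paths"
        )
    return sizes


def crop_size_for_video(
    video_path: str, paths: Sequence[str], sizes: Sequence[int]
) -> int:
    """Look up the crop size of the training directory that contains video_path."""
    parents = Path(video_path).parents
    for directory, size in zip(paths, sizes):
        if Path(directory) in parents:
            return size
    raise KeyError(f"{video_path} is not inside any configured dataset path")
```

For example, with three training directories and sizes [32, 64, 128], a video stored under the second directory would be cropped at 64, while a config that still passes crop_size: 32 would give 32 for every directory.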

talmolab / dreem

Support different crop sizes based on dataset provided in config #98