Open hermancollin opened 2 months ago
We should hardcode the data split in a JSON file inside this repo. It should look like this:
{
  "train": [
    "sub-rat1_sample-[...].png",
    "sub-rat2_sample-[...].png",
    [...]
  ],
  "val": [
    [...]
  ],
  "test": [
    [...]
  ]
}
In the preprocessing file, the data is first converted to YOLO format, then to COCO. The problem is that the data is shuffled at both steps, so the two resulting splits are random and do not match. The splits should be identical; otherwise, how can we compare the two methods?
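One way to get a fixed split is to shuffle once with a fixed seed and write the result to disk, so nothing is reshuffled later. This is only a rough sketch: make_split(), write_split(), the 10%/10% fractions, the seed and the data_split.json filename are all placeholders, not names or values from the repo.

```python
import json
import random
from pathlib import Path


def make_split(filenames, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle once with a fixed seed and cut the list into train/val/test.

    The fractions and the seed are placeholder values, not the project's.
    """
    # Sort first so the result does not depend on the order of the input.
    names = sorted(filenames)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_frac)
    n_test = int(len(names) * test_frac)
    return {
        "train": names[n_val + n_test:],
        "val": names[:n_val],
        "test": names[n_val:n_val + n_test],
    }


def write_split(split, path="data_split.json"):
    """Write the split to a JSON file that can be committed to the repo."""
    Path(path).write_text(json.dumps(split, indent=2) + "\n")
```

This would be run a single time; after that, the committed file is the source of truth and the seed no longer matters.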
There should be a JSON or YML file with a hardcoded data split. Both preprocess_data_coco() and preprocess_data_yolo() should split the data identically.
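A minimal sketch of how both functions could read the same split instead of shuffling on their own. The helper names (load_split, build_lookup) and the data_split.json path are hypothetical; the signatures of preprocess_data_coco() and preprocess_data_yolo() are not given here, so the sketch stops at the shared lookup.

```python
import json
from pathlib import Path

SUBSETS = ("train", "val", "test")


def load_split(path="data_split.json"):
    """Read the hardcoded split from disk (the path is a placeholder)."""
    return json.loads(Path(path).read_text())


def build_lookup(split):
    """Map each filename to its subset and reject files listed twice."""
    lookup = {}
    for subset in SUBSETS:
        for name in split[subset]:
            if name in lookup:
                raise ValueError(
                    f"{name} is in both {lookup[name]} and {subset}"
                )
            lookup[name] = subset
    return lookup
```

Each preprocessing function would then look up lookup[filename] to decide where an image goes, with no call to a shuffle anywhere, so the YOLO and COCO outputs end up with the same train/val/test membership.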