kreshuklab / plant-seg

A tool for cell instance aware segmentation in densely packed 3D volumetric images
https://kreshuklab.github.io/plant-seg/
MIT License

Question regarding one repeated volume across splits #345

Closed anwai98 closed 1 month ago

anwai98 commented 1 month ago

Hi team,

I would like to report something I recently spotted in the PlantSeg (Root) dataset. There seems to be a volume (Movie1_t00045_crop_gt.h5) that exists in both the train and test splits. I double-checked, and it looks like the two volumes are identical.
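For anyone who wants to reproduce this kind of check, here is a minimal sketch. It assumes the splits have already been downloaded into local directories, one per split; the directory names in the usage note and the function names are hypothetical, not part of PlantSeg. It flags files whose bytes are identical across splits by comparing SHA-256 digests. Two HDF5 files can hold the same arrays and still differ byte-wise, so this only catches exact file copies.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_cross_split_duplicates(split_dirs, pattern="*.h5"):
    """Return {digest: {split: [file names]}} for content present in more than one split.

    split_dirs maps a split name (e.g. "train") to a local directory (an assumption
    about how the data is laid out on disk).
    """
    seen = defaultdict(lambda: defaultdict(list))
    for split, directory in split_dirs.items():
        for path in sorted(Path(directory).glob(pattern)):
            seen[file_digest(path)][split].append(path.name)
    # Keep only digests that occur in at least two different splits.
    return {d: dict(s) for d, s in seen.items() if len(s) > 1}
```

A call such as `find_cross_split_duplicates({"train": "root/train", "test": "root/test"})` (hypothetical paths) returns an empty dict when no file is shared between splits.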

cc: @constantinpape

lorenzocerrone commented 1 month ago

Hi @anwai98,

Thanks for spotting it!

Luckily, the issue affects only the OSF dataset and not the trained models (you can see the actual train/val/test split used here: https://zenodo.org/records/7765026).

@wolny, I think you are the only one with the right to edit the OSF. Could you check if you can update the project?

anwai98 commented 1 month ago

Hi @lorenzocerrone,

Thanks for the quick response.

Re: OSF dataset: Yes, that's right. I do see the volume repeated across splits when fetching the data from OSF.

In case it helps, I would recommend removing that volume from the train split, since the train split still has a good number of volumes without it.

lorenzocerrone commented 1 month ago

Yes, I agree the dataset should match the one used in the publication.

Thanks again for spotting it!

wolny commented 1 month ago

Hi @anwai98, thanks for reporting. As already mentioned by Lorenzo, Movie1_t00045_crop_gt.h5 was not used for training (see: https://zenodo.org/records/7765026/files/config_train.yml). I've removed the file from the train split.