Problems with diabetic_retinopathy dataset in VTAB-1k

BenediktAlkin / vtab1k-pytorch

fine-tune models on the VTAB-1K benchmark in pytorch

MIT License

Problems with diabetic_retinopathy dataset in VTAB-1k #2

Closed WeiQijie closed 2 months ago

WeiQijie commented 2 months ago

As far as I know, the test set of diabetic_retinopathy contains 53576 images. Why do you have only 42670 images here? Additionally, the processed images in diabetic_retinopathy look very different from the original ones (much more like grayscale images). What is the reason for this?

BenediktAlkin commented 2 months ago

We use a fully preprocessed dataset that is standard in parameter-efficient fine-tuning benchmarks. Therefore, we don't conduct any preprocessing ourselves.

The number 42670 refers to the number of test samples as each dataset in the VTAB-1K benchmark has exactly 800 train samples and 200 validation samples.

WeiQijie commented 2 months ago

Sorry, but I'm a little confused. The raw diabetic_retinopathy dataset (https://www.kaggle.com/c/diabetic-retinopathy-detection/data) has 35126 train images and 53576 test images. From my point of view, the number of test samples should stay exactly the same as in the raw dataset. Why do we get only 42670 test samples here?
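
To make the discrepancy concrete, here is a minimal Python sketch that only does arithmetic on the counts quoted in this thread. The constant and function names are hypothetical, the counts are not re-verified against the data files, and where the remaining test images end up is not confirmed here.

```python
# Counts as quoted in this thread (assumed correct, not re-verified).
RAW_TRAIN = 35126    # Kaggle train images
RAW_TEST = 53576     # Kaggle test images
VTAB_TRAIN = 800     # VTAB-1K train samples per dataset
VTAB_VAL = 200       # VTAB-1K validation samples per dataset
VTAB_TEST = 42670    # test samples in the preprocessed dataset


def split_summary():
    """Return simple totals and the gap between raw and preprocessed test sets."""
    return {
        "raw_total": RAW_TRAIN + RAW_TEST,
        "vtab_total": VTAB_TRAIN + VTAB_VAL + VTAB_TEST,
        "missing_from_test": RAW_TEST - VTAB_TEST,
    }


summary = split_summary()
for name, value in summary.items():
    print(f"{name}: {value}")
```

The `missing_from_test` value is the number of raw test images that are not part of the 42670 test samples; the sketch does not say why they are absent.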

BenediktAlkin commented 2 months ago

As I said, we use a fully preprocessed dataset and do not make any changes w.r.t. dataset splits or anything like that, so I really can't tell you the specifics of how they preprocessed it. If you want to know the specifics, https://github.com/ZhangYuanhan-AI/NOAH/#data-preparation should be the original source for how it was preprocessed.

WeiQijie commented 2 months ago

Thanks a lot. I will take a look.