mahmoodlab / UNI

Towards a general-purpose foundation model for computational pathology - Nature Medicine

What is the reasoning behind using ImageNet normalization constants? #2

Closed by GeorgeBatch 8 months ago

GeorgeBatch commented 8 months ago

Dear authors,

Thank you for releasing this work! I think it will bring great value to the community.

In the Nature Medicine paper you say "All pretrained encoders use ImageNet mean and standard deviation parameters for image normalization (including UNI)". The code examples are consistent with this.

Can you please clarify the reason for sticking to the ImageNet normalization constants? As far as I understand, they were computed on the original ImageNet dataset. Since you had a different dataset for pre-training the UNI model, why not calculate the constants on your dataset?

Many thanks, George

Richarizardd commented 8 months ago

Hi @GeorgeBatch - when developing UNI, we also tried pathology-specific mean and standard deviation parameters. We did not find a meaningful difference in performance in linear probing and evaluation when using the mean/standard deviation normalization for the Mass-1K dataset. As long as all patch features are extracted using the same normalization, the impact on downstream evaluation should be minimal.

As the other baselines in this study, such as CTransPath, also used ImageNet normalization, we opted for ImageNet for ease of use and simplicity.
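
For readers who want to see what the two options look like, here is a minimal, dependency-free sketch. It is not taken from the UNI codebase. The ImageNet constants are the standard published values; the helper names (`channel_stats`, `normalize`, `convert`) and the toy pixel values are hypothetical and chosen only for illustration. The sketch shows that ImageNet and dataset-specific normalization differ by a fixed per-channel affine map on the input. That does not by itself prove the encoder's features are unaffected, which is the empirical observation reported above.

```python
import math

# Standard ImageNet per-channel (R, G, B) statistics for pixels scaled to [0, 1].
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def channel_stats(pixels):
    """Return per-channel (mean, std) over (r, g, b) pixels in [0, 1]."""
    pixels = list(pixels)
    n = len(pixels)
    mean = tuple(sum(p[c] for p in pixels) / n for c in range(3))
    std = tuple(
        math.sqrt(sum((p[c] - mean[c]) ** 2 for p in pixels) / n)
        for c in range(3)
    )
    return mean, std


def normalize(pixel, mean, std):
    """Apply (x - mean) / std independently to each channel."""
    return tuple((pixel[c] - mean[c]) / std[c] for c in range(3))


def convert(normalized, src, dst):
    """Map a pixel normalized with src=(mean, std) to dst=(mean, std)."""
    (m1, s1), (m2, s2) = src, dst
    # Undo the source normalization, then apply the destination one.
    return tuple(
        (normalized[c] * s1[c] + m1[c] - m2[c]) / s2[c] for c in range(3)
    )


# Hypothetical pinkish pixels, made up for illustration only.
toy_pixels = [
    (0.90, 0.70, 0.85),
    (0.80, 0.55, 0.75),
    (0.95, 0.85, 0.92),
    (0.60, 0.35, 0.65),
]
toy_mean, toy_std = channel_stats(toy_pixels)

pixel = toy_pixels[0]
with_imagenet = normalize(pixel, IMAGENET_MEAN, IMAGENET_STD)
with_dataset = normalize(pixel, toy_mean, toy_std)

# The two normalizations are related by a fixed affine map per channel.
converted = convert(
    with_imagenet, (IMAGENET_MEAN, IMAGENET_STD), (toy_mean, toy_std)
)
```

In practice the statistics would be computed over many image patches with an array library, and the same constants would then be applied to every patch before feature extraction, so that all features share one normalization.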

GeorgeBatch commented 8 months ago

Thank you for your explanation!