Hi @rgeirhos,
Thank you for using VISSL!
Indeed, you are right: the file has been renamed to cluster_features_resnet_8gpu_imagenet.
To run ClusterFit efficiently (and avoid extracting the features several times), I advise you to follow these steps. First, extract the features once:
python tools/run_distributed_engines.py \
  config=feature_extraction/extract_resnet_in1k_8gpu \
  +config/feature_extraction/trunk_only=rn50_layers \
  config.EXTRACT_FEATURES.CHUNK_THRESHOLD=100000 \
  config.MODEL.WEIGHTS_INIT.PARAMS_FILE=/path/to/model_weights.torch
Then cluster the extracted features for each layer (res5 and then res4):
python tools/cluster_features_and_label.py \
  config=pretrain/clusterfit/cluster_features_resnet_8gpu_imagenet \
  config.CLUSTERFIT.FEATURES.LAYER_NAME=res5 \
  config.CLUSTERFIT.FEATURES.PATH='/path/to/extracted/features/' \
  config.CLUSTERFIT.FEATURES.DATASET_NAME=imagenet \
  config.CLUSTERFIT.NUM_CLUSTERS=16000 \
  config.CLUSTERFIT.FEATURES.DATA_PARTITION=TRAIN
python tools/cluster_features_and_label.py \
  config=pretrain/clusterfit/cluster_features_resnet_8gpu_imagenet \
  config.CLUSTERFIT.FEATURES.LAYER_NAME=res4 \
  config.CLUSTERFIT.FEATURES.PATH='/path/to/extracted/features/' \
  config.CLUSTERFIT.FEATURES.DATASET_NAME=imagenet \
  config.CLUSTERFIT.NUM_CLUSTERS=16000 \
  config.CLUSTERFIT.FEATURES.DATA_PARTITION=TRAIN
The main advantage of doing so is that you can vary the clustering parameters (and create multiple cluster sets, for instance) from a single feature extraction, which is much faster than extracting the features again for every clustering run.
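Since the two clustering commands differ only in the layer name, they can be generated from a small loop. The following is a minimal sketch, not part of VISSL: the function name `print_cluster_cmds` and the `FEATURES_PATH` and `NUM_CLUSTERS` shell variables are made up for illustration, and the config overrides are copied unchanged from the commands above. It only prints the commands, so they can be inspected before anything is run.

```shell
#!/bin/sh
# Hypothetical helper (not a VISSL tool): print one clustering command per layer.
# The overrides are the ones from the commands in this thread; the path is a placeholder.
FEATURES_PATH="${FEATURES_PATH:-/path/to/extracted/features/}"
NUM_CLUSTERS="${NUM_CLUSTERS:-16000}"

print_cluster_cmds() {
  # Each argument is a layer name whose features were already extracted.
  for layer in "$@"; do
    printf '%s ' \
      python tools/cluster_features_and_label.py \
      config=pretrain/clusterfit/cluster_features_resnet_8gpu_imagenet \
      "config.CLUSTERFIT.FEATURES.LAYER_NAME=${layer}" \
      "config.CLUSTERFIT.FEATURES.PATH='${FEATURES_PATH}'" \
      config.CLUSTERFIT.FEATURES.DATASET_NAME=imagenet \
      "config.CLUSTERFIT.NUM_CLUSTERS=${NUM_CLUSTERS}"
    printf '%s\n' config.CLUSTERFIT.FEATURES.DATA_PARTITION=TRAIN
  done
}

# Dry run: show the commands for res5 and then res4.
print_cluster_cmds res5 res4
```

To actually run the clustering, the printed lines can be piped to a shell from the VISSL repository root (`print_cluster_cmds res5 res4 | sh`). Setting `NUM_CLUSTERS` in the environment changes the cluster count; where each run writes its output is not covered in this thread, so check that before generating several cluster sets.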
FYI @iseessel: for the next release, we need to update the docs to reflect the renaming and this recommended workflow.
Thank you, Quentin
Hi @QuentinDuval, thanks for your fast & helpful reply!
From my side, please feel free to either close this issue or leave it open until the docs are updated :)
What's CHUNK_THRESHOLD?
📚 VISSL Documentation
In the documentation for training ClusterFit (https://vissl.readthedocs.io/en/v0.1.5/ssl_approaches/clusterfit.html?highlight=clustering#how-to-train-clusterfit-model), step 1 ("Extract features") requires the following config file: config=pretrain/clusterfit/cluster_features_resnet_8gpu_rotation_in1k. However, there is no such file at https://github.com/facebookresearch/vissl/tree/main/configs/config/pretrain/clusterfit. This might require either adding the file, or updating the documentation in case the file got renamed.