Closed by @bryanwong17 2 months ago
Hi @bryanwong17,
Throughout development, I checked the quality of the prototypes, and they seemed to make sense to me.
Right now, I can think of two possible reasons:
Best,
Hi @andrewsong90,
Thank you for your insights. I was wondering whether you have uploaded the code for inference or cluster assignment. As far as I know, only the code for training K-means (using both sklearn and FAISS K-means) and saving the weights has been uploaded, but not the code for inference or cluster assignments.
If I may, I have a few follow-up questions:
Regarding `num_proto_patches`: should it be 100,000 or 1,000,000? I saw in the README it was 1,000,000, while in `clustering.sh` it was 100,000.

For sklearn:
```python
import numpy as np
from scipy.spatial.distance import cdist

cluster_labels = kmeans.predict(patch_features)  # patch features in one test WSI
centroids = kmeans.cluster_centers_
distances = cdist(patch_features, centroids, metric=args.distance_metric)  # distance_metric: 'cosine' or 'euclidean'
distances = np.min(distances, axis=1)  # distance of each patch to its nearest centroid
```
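To make the sklearn path above concrete, here is a minimal, self-contained sketch end to end, using random stand-in features; the shapes, `num_proto = 16`, and the Euclidean metric are assumptions for illustration only:

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train_features = rng.standard_normal((1000, 64)).astype(np.float32)  # stand-in for extracted training patch features
patch_features = rng.standard_normal((200, 64)).astype(np.float32)   # stand-in for one test WSI

num_proto = 16  # assumed number of prototypes
kmeans = KMeans(n_clusters=num_proto, n_init=10, random_state=0).fit(train_features)

# assign each test patch to its nearest prototype and record the distance
cluster_labels = kmeans.predict(patch_features)
centroids = kmeans.cluster_centers_
distances = cdist(patch_features, centroids, metric="euclidean").min(axis=1)
```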
For FAISS:
```python
distances, assignments = index.search(patch_features.numpy(), 1)  # patch features in one test WSI
distances, cluster_labels = distances.ravel(), assignments.ravel()
```
Then, in the inference loop:
```python
for clusternum in range(args.num_proto):
    cluster_indices = np.where(cluster_labels == clusternum)[0]
    cluster_distances = distances[cluster_indices]
    # map back to the WSI-level index of the patch closest to this cluster's centroid
    representative_idx = cluster_indices[np.argmin(cluster_distances)]
```
Thank you very much for your support!
Best,
Hi @andrewsong90,
I also have one question about k-means.
In your settings, L2 distance is used for k-means, while most foundation models (e.g., UNI) are DINOv2-pretrained. Intuitively, using cosine distance is a more reasonable choice. I have tried using Gigapath as the encoder to run PANTHER. Here is one example of the prototypical assignment map. Better performance (both qualitatively and quantitatively) is achieved using cosine distance.
BTW, one open question remains, since L2 distance is still used in the GMM.
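One way to reconcile the two: on L2-normalized features, Euclidean k-means is equivalent to cosine k-means, because for unit vectors ||a - b||^2 = 2(1 - cos(a, b)). The sketch below verifies this identity with random vectors (the array shapes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((5, 8))
b = rng.standard_normal((5, 8))

# L2-normalize rows so every vector lies on the unit hypersphere
a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
b_n = b / np.linalg.norm(b, axis=1, keepdims=True)

cos = (a_n * b_n).sum(axis=1)                 # cosine similarity per row pair
sq_euclid = ((a_n - b_n) ** 2).sum(axis=1)    # squared Euclidean distance per row pair

# For unit vectors: ||a - b||^2 = 2 * (1 - cos(a, b))
assert np.allclose(sq_euclid, 2 * (1 - cos))
```

So normalizing the features once up front lets the existing L2-based k-means (and FAISS index) behave like cosine clustering without any other changes.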
Hi @bryanwong17,
I have not uploaded the code for inference or cluster assignment, but it should be easy to code up (your code for sklearn/FAISS/inference looks correct to me).
As for answers to some of your questions
I think it is a design choice, and the answer is to try both, as @HHHedo kindly explains with an example above. While classical clustering approaches would advocate for normalization and centering, I opted not to normalize for two reasons.
I do not think PCA is necessary.
Sampling the same number of patches was just for convenience; if you can, then yes, probably use all patches!
Probably the larger the better, but as you might have observed already, beyond a certain point the impact on the downstream task was minimal.
Prototypes are initialized for each dataset. But I would love to see how pan-cancer prototype initialization helps. Make sure to increase the number of prototypes in this case, since different cancers would have non-overlapping prototypes.
Thank you for helping me improve PANTHER further!
Hi @HHHedo
Thank you very much for providing an illustrative example. As I explained above, cosine similarity did cross my mind at some point, in accordance with the "right way" of doing clustering. But it is interesting/surprising that feature normalization indeed helps boost the performance further.
Maybe I should update the code base to also include the L2 normalization step before everything - the GMM can still operate in the L2-normalized feature space!
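As a rough illustration of that idea, a single normalization step can precede both k-means and the GMM. The helper name, shapes, and GMM settings below are assumptions for the sketch, not the PANTHER implementation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def l2_normalize(x, eps=1e-8):
    # project each feature row onto the unit hypersphere before clustering
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

rng = np.random.default_rng(0)
features = rng.standard_normal((500, 32))
features_n = l2_normalize(features)

# the GMM is simply fit in the L2-normalized feature space
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0).fit(features_n)
resp = gmm.predict_proba(features_n)  # soft assignments, rows sum to 1
```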
Thank you so much
Hi @andrewsong90,
Thank you for the detailed explanations. They really help me understand more about your work, especially the clustering part!
Thank you for the great work! I was wondering whether you have checked the quality of the cluster assignments produced by FAISS K-Means. Following an approach similar to your code, I trained FAISS K-Means on the extracted features (train data) of the C16 dataset, clustering them into 16 prototypes (with the other hyperparameters matching yours). Then, I used the trained K-Means model to cluster the patches per WSI (test data). However, the cluster assignments were poor on the test data and even on the training data (e.g., prototype 1 in WSI A looks very different from prototype 1 in WSI B), even though I used a strong feature extractor such as CONCH.