djimenezsanchez / NaroNet

Trained only with subject-level labels, NaroNet discovers phenotypes, neighborhoods, and areas with the highest influence when classifying subject types.
GNU Affero General Public License v3.0

Empty cluster to phenotype arrays #7

Open nnnkaiser opened 7 months ago

nnnkaiser commented 7 months ago

Hi,

I am interested in applying NaroNet to a multiplexed immunofluorescence imaging dataset. I like the idea of your self-supervised embeddings and spatial neighborhood graphs. However, I cannot get the code to function completely. I am going to describe what I think works for me and where the problem starts:

The patch-contrastive pre-training seems to work:

image

The patch-level embeddings also seem to make sense: a simple k-means clustering of them reproduces some of the structures I expect in my images:

image

The NaroNet training seems to work as well and it performs quite well in distinguishing the two groups that I have in the cross-validation confusion matrix:

image

The BioInsights module also provides some possibly reasonable output for areas and neighborhoods:

image

However, the phenotype composition of neighborhoods is empty for every neighborhood:

image

I can confirm that the corresponding entries in the previously saved cluster assignment NumPy arrays seem to be empty: line 489 of Pheno_Neigh_Info.py yields an array that consists only of zeros:

patch_to_pheno_assignment = np.load(osp.join(dataset.processed_dir_cell_types,'cluster_assignmentPerPatch_Index_{}_0_ClustLvl_{}.npy'.format(idxclster[1], clusters[-3])))

So when I call patch_to_pheno_assignment.max(), I get 0.

And then the code crashes at

image

because the respective entries in CropConf are just an empty list.
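To see whether only this one array is affected, the same check can be run over every saved assignment file. The following is a minimal sketch, not part of NaroNet: the helper name `find_empty_assignments` is hypothetical, and the glob pattern is only inferred from the file name in the `np.load` call above, so it may need adjusting for other runs.

```python
import glob
import os.path as osp

import numpy as np


def find_empty_assignments(processed_dir):
    """Summarize every saved cluster-assignment array in `processed_dir`.

    Hypothetical helper. The file-name pattern is an assumption based on
    'cluster_assignmentPerPatch_Index_{}_0_ClustLvl_{}.npy' from the thread.
    """
    pattern = osp.join(
        processed_dir, "cluster_assignmentPerPatch_Index_*_ClustLvl_*.npy"
    )
    report = {}
    for path in sorted(glob.glob(pattern)):
        arr = np.load(path)
        report[osp.basename(path)] = {
            "shape": arr.shape,
            # True when the array has no elements or every element is zero.
            "all_zero": bool(arr.size == 0 or not np.any(arr)),
            # True when the array contains NaN or +/-inf.
            "non_finite": not bool(np.isfinite(arr).all()),
        }
    return report
```

Calling `find_empty_assignments(dataset.processed_dir_cell_types)` and printing the entries with `all_zero` set would show whether the problem is limited to one cluster level or affects all of them.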

Do you have any ideas on what to look into or what things or intermediate results I could check to see where things go wrong?

Any help is much appreciated. Thank you!

djimenezsanchez commented 7 months ago

Hello,

Thanks very much for your thorough analysis and explanation.

This is a problem that I've seen before when running NaroNet. The error you are seeing in BioInsights stems from NaroNet's inability to classify patches into neighborhoods. It arises when gradient explosions keep the neuron activations from classifying patches properly, which leaves the neighborhood vectors empty.

You still see fair prediction performance because NaroNet is using other phenotypes or areas to classify subjects into types.

You have two possible courses of action. First, you can work with the areas and phenotypes that NaroNet is classifying successfully. Alternatively, you can try running NaroNet with a different number of neighborhoods, which may prevent the gradient explosions. This is one of the reasons we implemented an architecture search strategy: to mitigate the occurrence of gradient explosions.
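To judge which cluster levels are usable, or whether a new number of neighborhoods has helped, one option is to count how many patches land in each cluster. This is only a sketch under stated assumptions: `cluster_occupancy` is a hypothetical helper, not a NaroNet function, and it assumes each saved assignment array has shape (n_patches, n_clusters) with one row of assignment weights per patch.

```python
import numpy as np


def cluster_occupancy(assignment):
    """Count patches per cluster by taking the largest weight in each row.

    Assumption: `assignment` has shape (n_patches, n_clusters).
    Rows that are all zero or contain NaN/inf are skipped and counted
    separately, since they carry no usable assignment.
    """
    assignment = np.asarray(assignment, dtype=float)
    if assignment.ndim != 2:
        raise ValueError("expected a 2-D (n_patches, n_clusters) array")
    # A row is usable only if it is fully finite and has a non-zero entry.
    valid = np.isfinite(assignment).all(axis=1) & (assignment != 0).any(axis=1)
    winners = assignment[valid].argmax(axis=1)
    counts = np.bincount(winners, minlength=assignment.shape[1])
    return counts, int((~valid).sum())


if __name__ == "__main__":
    demo = np.array([[0.9, 0.1, 0.0],
                     [0.2, 0.8, 0.0],
                     [0.0, 0.0, 0.0]])
    counts, skipped = cluster_occupancy(demo)
    print("patches per cluster:", counts.tolist(), "skipped rows:", skipped)
```

A level where every row is skipped matches the all-zero arrays described above, while a level with most clusters populated is probably safe to interpret.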

I hope this helps! It really looks like you are onto something!