05. Model for Stain4 - Githubissues

EchteRobert commented 2 years ago

To test the generalization of the model trained on Stain3 (and tested on Stain2), I will now evaluate it on Stain4. Based on the results, further advancements will be made by training on plates of Stain4 (and then evaluating on Stain2 and Stain3 in turn).

Stain 4 consists of 30 plates which were divided into 5 batches, each with different staining conditions.

Baseline staining conditions used in Stain3 (Stain2)
Baseline staining conditions used in Stain3 with 2-fold dilution of all dyes (Stain2_2)
Baseline staining conditions used in Stain3 with 2-fold dilution of Phalloidin (Stain2_Phalloidin_2)
Baseline staining conditions used in Stain3 with 2-fold dilution of Phalloidin and ConA (Stain2_Phalloidin_ConA_2)
Bray et al. staining conditions (Bray)

Apart from that, standard exposure vs. high exposure and Binning 1 vs. Binning 2 comparisons were also made.

To analyze the relations between the different plates in Stain4, I calculated the correlation between the PC1 loadings of the mean aggregated profiles of every plate. I only included the plates that were similar enough to form a large cluster.

Click here for clusters!

Click here for cells per well per plate!

EchteRobert commented 2 years ago

Benchmark Stain4

plate	Training mAP BM	Validation mAP BM	PR BM
BR00116630	0.31	0.31	53.30
BR00116625	0.31	0.29	58.90
BR00116631	0.30	0.28	57.80
BR00116627	0.30	0.29	56.70
BR00116630highexp	0.29	0.30	58.90
BR00116629highexp	0.29	0.29	52.20
BR00116627highexp	0.31	0.27	56.70
BR00116628highexp	0.32	0.31	57.80
BR00116625highexp	0.32	0.28	61.10
BR00116631highexp	0.28	0.30	53.30
BR00116628	0.32	0.29	58.90
BR00116629	0.30	0.29	52.20

EchteRobert commented 2 years ago

First model trained on Stain4

Using the same setup as found in Stain3 (https://github.com/broadinstitute/FeatureAggregation_single_cell/issues/6#issuecomment-1095241531) I trained on plates BR00116625highexp, BR00116628highexp_FS, and BR00116629highexp. I only have 2 validation plates now, but will have more next week.

Main takeaways

I am getting similar results to the best model I was able to train on Stain3: the model is discerning training compounds just fine, but is lacking on validation compounds.

One possible explanation that I think may also apply to the Stain3 model is that the training plates look too much alike. The Stain2 model was trained on slightly more dissimilar plates and generalized well to everything from Stain2. However, we have also seen that the model does not generalize to plates that are too different either. Although this might be a possible solution, I think managing this trade-off (i.e. trying different compositions of training plates) is not something I should be looking into because ideally the training plates should play a smaller role in generalization.

Next up

Possible solutions include:

Using a rank based loss function which would penalize the model more for scoring poorly on mAP
Using hard positive/negative mining during contrastive training (although this should already implicitly happen with SupConLoss?) --> this is indeed already happening as the cosine similarity normalizes the representations
Increasing batch size even further (?) --> did not work see https://github.com/broadinstitute/FeatureAggregation_single_cell/issues/6#issuecomment-1109028644

TableTime!

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM | |:------------------|---------------------:|------------------:|-----------------------:|--------------------:|-----------:|--------:| | _Training plates_ | | | | | | | | BR00116625highexp | **0.67** | 0.32 | **0.33** | 0.28 | 98.9 | 61.1 | | BR00116628highexp | **0.7** | 0.32 | 0.29 | **0.31** | 97.8 | 57.8 | | BR00116629highexp | **0.65** | 0.29 | **0.35** | 0.29 | 98.9 | 52.2 | | _Validation plates_ | | | | | | | | BR00116630highexp | **0.46** | 0.29 | 0.29 | **0.3** | 94.4 | 58.9 | | BR00116631highexp | **0.39** | 0.28 | 0.23 | **0.3** | 86.7 | 53.3 |

EchteRobert commented 2 years ago

Second model trained on Stain4

With slightly updated parameters according to https://github.com/broadinstitute/FeatureAggregation_single_cell/issues/6#issuecomment-1109028644 I now train on the same plates as before, but evaluate on all plates in the cluster.

Main takeaways

The model is able to generalize to most plates, although there are still 3 for which it does not beat the baseline validation mAP.
Although the 3 outliers are closely linked, they do not especially stand out when comparing them to the training plates using the PC1 loadings correlation plot. One possible improvement could now be to find a method that more accurately describes which plates are close to the training plates and which ones aren't.

plate	Training mAP model	Training mAP BM	Validation mAP model	Validation mAP BM	PR model	PR BM
Training plates
BR00116625highexp	0.75	0.32	0.36	0.28	98.9	61.1
BR00116628highexp	0.76	0.32	0.34	0.31	96.7	57.8
BR00116629highexp	0.75	0.29	0.32	0.29	98.9	52.2
Validation plates
BR00116625	0.55	0.31	0.31	0.29	98.9	58.9
BR00116630highexp	0.47	0.29	0.27	0.3	91.1	58.9
BR00116631highexp	0.42	0.28	0.22	0.3	88.9	53.3
BR00116631	0.42	0.3	0.21	0.28	94.4	57.8
BR00116627highexp	0.5	0.31	0.36	0.27	92.2	56.7
BR00116627	0.48	0.3	0.32	0.29	92.2	56.7
BR00116629	0.55	0.3	0.3	0.29	97.8	52.2
BR00116628	0.56	0.32	0.29	0.29	97.8	58.9

EchteRobert commented 2 years ago

Using a rank based loss function

As described in Deep Metric Learning to Rank , I use the FastAP loss function which optimizes the rank-based Average Precision measure, using an approximation derived from distance quantization. Hypothesis: by directly optimizing the mean Average Precision, instead of the Percent Replicating with the Supervised Contrastive Loss function, the model should generalize better to the ranking task (mAP).

Main takeaways

By optimizing for mean average precision we lose some performance on the Percent Replicating score, as expected. The distribution of the Percent Replicating histograms looks different as well.
I observe no significant increase in performance by training the model with the rank loss function.

Results

Table Results

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM | |:------------------|---------------------:|------------------:|-----------------------:|--------------------:|-----------:|--------:| | _Training plates_ | | | | | | | | BR00116625highexp | **0.66** | 0.32 | **0.36** | 0.28 | 95.6 | 61.1 | | BR00116628highexp | **0.67** | 0.32 | **0.33** | 0.31 | 92.2 | 57.8 | | BR00116629highexp | **0.68** | 0.29 | **0.3** | 0.29 | 86.7 | 52.2 | | _Validation plates_ | | | | | | | | BR00116631highexp | **0.39** | 0.28 | 0.26 | **0.3** | 71.1 | 53.3 | | BR00116625 | **0.51** | 0.31 | **0.3** | 0.29 | 86.7 | 58.9 | | BR00116630highexp | **0.44** | 0.29 | 0.29 | **0.3** | 76.7 | 58.9 | | BR00116631 | **0.41** | 0.3 | 0.25 | **0.28** | 77.8 | 57.8 | | BR00116627highexp | **0.48** | 0.31 | **0.35** | 0.27 | 81.1 | 56.7 | | BR00116627 | **0.46** | 0.3 | **0.3** | 0.29 | 73.3 | 56.7 | | BR00116629 | **0.51** | 0.3 | **0.31** | 0.29 | 87.8 | 52.2 | | BR00116628 | **0.48** | 0.32 | 0.24 | **0.29** | 83.3 | 58.9 |

Percent Replicating graphs

_Percent Replicating with rank based loss function_ ![Stain4_BR00116625_PR](https://user-images.githubusercontent.com/62173977/165623534-97f9ec29-48d4-4c3b-822e-22ab468189c4.png) _Percent Replicating with supervised contrastive loss function_ ![Stain4_BR00116625_PR](https://user-images.githubusercontent.com/62173977/165623518-89bf362f-70a1-4853-a485-00525896495b.png)

carpenter-singh-lab / 2024_vanDijk_PLoS_CytoSummaryNet

05. Model for Stain4 #8

Benchmark Stain4

First model trained on Stain4

Main takeaways

Next up

Second model trained on Stain4

Main takeaways

Using a rank based loss function

Main takeaways

Results