carpenter-singh-lab / 2024_vanDijk_PLoS_CytoSummaryNet


05. Model for Stain4 #8

Open EchteRobert opened 2 years ago

EchteRobert commented 2 years ago

To test the generalization of the model trained on Stain3 (and tested on Stain2), I will now evaluate it on Stain4. Based on the results, I will make further improvements by training on Stain4 plates (and then evaluating on Stain2 and Stain3 in turn).

Stain4 consists of 30 plates divided into 5 batches, each with different staining conditions.

In addition, standard exposure vs. high exposure and Binning 1 vs. Binning 2 comparisons were also made.

To analyze the relationships between the different plates in Stain4, I calculated the correlation between the PC1 loadings of the mean-aggregated profiles of every plate. I only included the plates that were similar enough to form one large cluster.

_Figure: plate clusters_
_Figure: cells per well per plate_
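
For reference, a minimal sketch of this PC1-loading correlation, assuming each plate's mean-aggregated profiles are stored as a CSV with CellProfiler-style feature columns (the paths, plate names, and column prefixes below are placeholders, not the actual pipeline code):

```python
# Sketch of the plate-similarity analysis: correlate each plate's PC1 feature
# loadings, computed on its mean-aggregated (well-level) profiles.
import pandas as pd
from sklearn.decomposition import PCA

def pc1_loadings(profile_csv):
    """Return the first principal component's feature loadings for one plate."""
    df = pd.read_csv(profile_csv)
    # Assumed CellProfiler-style feature column prefixes
    features = df.filter(regex="^(Cells|Cytoplasm|Nuclei)_")
    pca = PCA(n_components=1)
    pca.fit(features.values)
    return pca.components_[0]  # shape: (n_features,)

# Placeholder plate -> profile-file mapping
plates = {"BR00116625": "BR00116625.csv", "BR00116630": "BR00116630.csv"}
loadings = pd.DataFrame({name: pc1_loadings(path) for name, path in plates.items()})
plate_corr = loadings.corr()  # plate x plate correlation of PC1 loadings
print(plate_corr.round(2))
```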
EchteRobert commented 2 years ago

Benchmark Stain4

| plate | Training mAP BM | Validation mAP BM | PR BM |
|:------------------|----:|----:|-----:|
| BR00116630 | 0.31 | 0.31 | 53.30 |
| BR00116625 | 0.31 | 0.29 | 58.90 |
| BR00116631 | 0.30 | 0.28 | 57.80 |
| BR00116627 | 0.30 | 0.29 | 56.70 |
| BR00116630highexp | 0.29 | 0.30 | 58.90 |
| BR00116629highexp | 0.29 | 0.29 | 52.20 |
| BR00116627highexp | 0.31 | 0.27 | 56.70 |
| BR00116628highexp | 0.32 | 0.31 | 57.80 |
| BR00116625highexp | 0.32 | 0.28 | 61.10 |
| BR00116631highexp | 0.28 | 0.30 | 53.30 |
| BR00116628 | 0.32 | 0.29 | 58.90 |
| BR00116629 | 0.30 | 0.29 | 52.20 |
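
For context, the mAP columns score replicate retrieval: for each well profile, all other profiles are ranked by similarity, and Average Precision is computed with wells of the same compound as positives (BM denotes the benchmark, i.e. standard mean-aggregated profiles). A hedged sketch of that computation, using cosine similarity as the ranking score (my own re-implementation; the actual evaluation code may differ in similarity measure and positive/negative definitions):

```python
# Sketch of replicate-retrieval mean Average Precision over well-level profiles.
import numpy as np
from sklearn.metrics import average_precision_score
from sklearn.metrics.pairwise import cosine_similarity

def mean_average_precision(profiles, compound_labels):
    """profiles: (n_wells, n_features) array; compound_labels: length n_wells."""
    labels = np.asarray(compound_labels)
    sim = cosine_similarity(profiles)
    ap_scores = []
    for i in range(len(labels)):
        mask = np.arange(len(labels)) != i          # exclude the query well itself
        positives = labels[mask] == labels[i]       # replicate wells of the same compound
        if positives.any():
            ap_scores.append(average_precision_score(positives, sim[i, mask]))
    return float(np.mean(ap_scores))
```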
EchteRobert commented 2 years ago

First model trained on Stain4

Using the same setup as for Stain3 (https://github.com/broadinstitute/FeatureAggregation_single_cell/issues/6#issuecomment-1095241531), I trained on plates BR00116625highexp, BR00116628highexp_FS, and BR00116629highexp. I only have 2 validation plates for now, but will have more next week.

Main takeaways

One possible explanation, which I think may also apply to the Stain3 model, is that the training plates look too much alike. The Stain2 model was trained on slightly more dissimilar plates and generalized well to everything within Stain2. However, we have also seen that the model does not generalize to plates that are too different either. Although varying the composition of training plates might help, I don't think managing this trade-off is something I should be looking into, because ideally the choice of training plates should play a smaller role in generalization.

Next up

Possible solutions include:

TableTime!

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:------------------|----:|----:|----:|----:|----:|----:|
| _Training plates_ | | | | | | |
| BR00116625highexp | **0.67** | 0.32 | **0.33** | 0.28 | 98.9 | 61.1 |
| BR00116628highexp | **0.7** | 0.32 | 0.29 | **0.31** | 97.8 | 57.8 |
| BR00116629highexp | **0.65** | 0.29 | **0.35** | 0.29 | 98.9 | 52.2 |
| _Validation plates_ | | | | | | |
| BR00116630highexp | **0.46** | 0.29 | 0.29 | **0.3** | 94.4 | 58.9 |
| BR00116631highexp | **0.39** | 0.28 | 0.23 | **0.3** | 86.7 | 53.3 |
EchteRobert commented 2 years ago

Second model trained on Stain4

With slightly updated parameters, following https://github.com/broadinstitute/FeatureAggregation_single_cell/issues/6#issuecomment-1109028644, I now train on the same plates as before, but evaluate on all plates in the cluster.

Main takeaways

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:------------------|----:|----:|----:|----:|----:|----:|
| _Training plates_ | | | | | | |
| BR00116625highexp | 0.75 | 0.32 | 0.36 | 0.28 | 98.9 | 61.1 |
| BR00116628highexp | 0.76 | 0.32 | 0.34 | 0.31 | 96.7 | 57.8 |
| BR00116629highexp | 0.75 | 0.29 | 0.32 | 0.29 | 98.9 | 52.2 |
| _Validation plates_ | | | | | | |
| BR00116625 | 0.55 | 0.31 | 0.31 | 0.29 | 98.9 | 58.9 |
| BR00116630highexp | 0.47 | 0.29 | 0.27 | 0.3 | 91.1 | 58.9 |
| BR00116631highexp | 0.42 | 0.28 | 0.22 | 0.3 | 88.9 | 53.3 |
| BR00116631 | 0.42 | 0.3 | 0.21 | 0.28 | 94.4 | 57.8 |
| BR00116627highexp | 0.5 | 0.31 | 0.36 | 0.27 | 92.2 | 56.7 |
| BR00116627 | 0.48 | 0.3 | 0.32 | 0.29 | 92.2 | 56.7 |
| BR00116629 | 0.55 | 0.3 | 0.3 | 0.29 | 97.8 | 52.2 |
| BR00116628 | 0.56 | 0.32 | 0.29 | 0.29 | 97.8 | 58.9 |
EchteRobert commented 2 years ago

Using a rank-based loss function

As described in _Deep Metric Learning to Rank_, I use the FastAP loss function, which optimizes the rank-based Average Precision measure using an approximation derived from distance quantization. Hypothesis: by directly optimizing mean Average Precision, rather than indirectly optimizing Percent Replicating with the Supervised Contrastive loss function, the model should generalize better to the ranking task (mAP).
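
A minimal sketch of the loss swap, assuming the pytorch-metric-learning implementation of FastAP (the exact implementation and hyperparameters used here are not spelled out in this comment; num_bins=10 is just the library default):

```python
# Sketch: using FastAP loss on aggregated well-level embeddings with compound
# labels, via the pytorch-metric-learning library.
import torch
from pytorch_metric_learning import losses

fastap_loss = losses.FastAPLoss(num_bins=10)  # approximates AP via distance-histogram binning

# Dummy batch for illustration: embeddings would come from the aggregation model,
# labels are compound identifiers shared across replicate wells.
embeddings = torch.randn(32, 128, requires_grad=True)
labels = torch.randint(0, 8, (32,))
loss = fastap_loss(embeddings, labels)
loss.backward()
```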

Main takeaways

Results

Table Results

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:------------------|----:|----:|----:|----:|----:|----:|
| _Training plates_ | | | | | | |
| BR00116625highexp | **0.66** | 0.32 | **0.36** | 0.28 | 95.6 | 61.1 |
| BR00116628highexp | **0.67** | 0.32 | **0.33** | 0.31 | 92.2 | 57.8 |
| BR00116629highexp | **0.68** | 0.29 | **0.3** | 0.29 | 86.7 | 52.2 |
| _Validation plates_ | | | | | | |
| BR00116631highexp | **0.39** | 0.28 | 0.26 | **0.3** | 71.1 | 53.3 |
| BR00116625 | **0.51** | 0.31 | **0.3** | 0.29 | 86.7 | 58.9 |
| BR00116630highexp | **0.44** | 0.29 | 0.29 | **0.3** | 76.7 | 58.9 |
| BR00116631 | **0.41** | 0.3 | 0.25 | **0.28** | 77.8 | 57.8 |
| BR00116627highexp | **0.48** | 0.31 | **0.35** | 0.27 | 81.1 | 56.7 |
| BR00116627 | **0.46** | 0.3 | **0.3** | 0.29 | 73.3 | 56.7 |
| BR00116629 | **0.51** | 0.3 | **0.31** | 0.29 | 87.8 | 52.2 |
| BR00116628 | **0.48** | 0.32 | 0.24 | **0.29** | 83.3 | 58.9 |
Percent Replicating graphs

_Percent Replicating with rank-based loss function_
![Stain4_BR00116625_PR](https://user-images.githubusercontent.com/62173977/165623534-97f9ec29-48d4-4c3b-822e-22ab468189c4.png)

_Percent Replicating with supervised contrastive loss function_
![Stain4_BR00116625_PR](https://user-images.githubusercontent.com/62173977/165623518-89bf362f-70a1-4853-a485-00525896495b.png)