carpenter-singh-lab / 2024_vanDijk_PLoS_CytoSummaryNet

06. Final model iterations #9

Open EchteRobert opened 2 years ago

EchteRobert commented 2 years ago

Two cluster training data (T: S3+S4)

Some final tweaks to training the model will be made in this issue. All of these tweaks will be made with Stain2, Stain3, and Stain4 in mind at the same time, instead of one at a time. The first model is trained on 3 plates each from Stain3 and Stain4 at the same time and evaluated on Stain2, Stain3, and Stain4.

Main takeaways

Table Stain4

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00116625highexp | **0.74** | 0.32 | **0.36** | 0.28 | 98.9 | 61.1 |
| BR00116628highexp | **0.73** | 0.32 | **0.32** | 0.31 | 98.9 | 57.8 |
| BR00116629highexp | **0.78** | 0.29 | **0.35** | 0.29 | 100 | 52.2 |
| _Validation plates_ | | | | | | |
| BR00116631highexp | **0.47** | 0.28 | 0.27 | **0.3** | 93.3 | 53.3 |
| BR00116625 | **0.6** | 0.31 | **0.35** | 0.29 | 98.9 | 58.9 |
| BR00116630highexp | **0.52** | 0.29 | **0.3** | 0.3 | 97.8 | 58.9 |
| BR00116631 | **0.5** | 0.3 | 0.26 | **0.28** | 94.4 | 57.8 |
| BR00116627highexp | **0.55** | 0.31 | **0.38** | 0.27 | 98.9 | 56.7 |
| BR00116627 | **0.55** | 0.3 | **0.36** | 0.29 | 96.7 | 56.7 |
| BR00116629 | **0.61** | 0.3 | **0.32** | 0.29 | 98.9 | 52.2 |
| BR00116628 | **0.58** | 0.32 | 0.28 | **0.29** | 98.9 | 58.9 |

Table Stain3

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00115134 | **0.75** | 0.37 | **0.42** | 0.33 | 98.9 | 58.9 |
| BR00115125 | **0.75** | 0.36 | **0.44** | 0.29 | 98.9 | 54.4 |
| BR00115133highexp | **0.76** | 0.38 | **0.38** | 0.31 | 97.8 | 60 |
| _Validation plates_ | | | | | | |
| BR00115128highexp | **0.52** | 0.4 | **0.42** | 0.33 | 97.8 | 58.9 |
| BR00115125highexp | **0.58** | 0.37 | **0.41** | 0.31 | 98.9 | 55.6 |
| BR00115131 | **0.54** | 0.38 | **0.44** | 0.29 | 98.9 | 58.9 |
| BR00115126 | **0.34** | 0.32 | **0.33** | 0.28 | 57.8 | 53.3 |
| BR00115133 | **0.58** | 0.38 | **0.4** | 0.3 | 96.7 | 62.2 |
| BR00115127 | **0.56** | 0.38 | **0.47** | 0.31 | 98.9 | 58.9 |
| BR00115128 | **0.53** | 0.39 | **0.42** | 0.32 | 96.7 | 61.1 |
| BR00115129 | **0.57** | 0.38 | **0.45** | 0.32 | 98.9 | 52.2 |

Table Stain2

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| BR00112202 | **0.43** | 0.34 | **0.38** | 0.3 | 88.9 | 54.4 |
| BR00112197standard | **0.45** | 0.4 | **0.41** | 0.28 | 85.6 | 56.7 |
| BR00112198 | **0.43** | 0.35 | **0.4** | 0.3 | 91.1 | 56.7 |
| BR00112197repeat | **0.43** | 0.41 | **0.37** | 0.31 | 81.1 | 63.3 |
| BR00112204 | **0.4** | 0.35 | **0.46** | 0.29 | 82.2 | 58.9 |
| BR00112197binned | **0.43** | 0.41 | **0.39** | 0.3 | 86.7 | 58.9 |
| BR00112201 | **0.47** | 0.4 | **0.41** | 0.32 | 91.1 | 66.7 |

EchteRobert commented 2 years ago

Two cluster training data (T: S2+S4)

This model is trained on 3 plates from Stain2 and Stain4 at the same time and evaluated on Stain2, Stain3, and Stain4.

Main takeaways

Training on Stain2 and Stain4 yields similar results to the previous model: it still generalizes to Stain3. However, one of the plates outside of the Stain3 cluster (BR00115126) did not perform as well, showing that there are still some plate effects that are being learned.

Table Stain4

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00116625highexp | **0.84** | 0.32 | **0.38** | 0.28 | 98.9 | 61.1 |
| BR00116628highexp | **0.83** | 0.32 | **0.34** | 0.31 | 100 | 57.8 |
| BR00116629highexp | **0.83** | 0.29 | **0.32** | 0.29 | 98.9 | 52.2 |
| _Validation plates_ | | | | | | |
| BR00116631highexp | **0.49** | 0.28 | 0.28 | **0.3** | 92.2 | 53.3 |
| BR00116625 | **0.62** | 0.31 | **0.35** | 0.29 | 98.9 | 58.9 |
| BR00116630highexp | **0.54** | 0.29 | **0.33** | 0.3 | 92.2 | 58.9 |
| BR00116631 | **0.51** | 0.3 | 0.26 | **0.28** | 94.4 | 57.8 |
| BR00116627highexp | **0.54** | 0.31 | **0.37** | 0.27 | 97.8 | 56.7 |
| BR00116627 | **0.54** | 0.3 | **0.35** | 0.29 | 97.8 | 56.7 |
| BR00116629 | **0.61** | 0.3 | **0.35** | 0.29 | 98.9 | 52.2 |
| BR00116628 | **0.62** | 0.32 | **0.31** | 0.29 | 97.8 | 58.9 |

Table Stain3

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| BR00115128highexp | **0.5** | 0.4 | **0.46** | 0.33 | 98.9 | 58.9 |
| BR00115125highexp | **0.42** | 0.37 | **0.33** | 0.31 | 86.7 | 55.6 |
| BR00115134 | **0.47** | 0.37 | **0.36** | 0.33 | 87.8 | 58.9 |
| BR00115125 | **0.43** | 0.36 | **0.33** | 0.29 | 85.6 | 54.4 |
| BR00115131 | **0.48** | 0.38 | **0.46** | 0.29 | 93.3 | 58.9 |
| BR00115133 | **0.43** | 0.38 | **0.32** | 0.3 | 83.3 | 62.2 |
| BR00115127 | **0.5** | 0.38 | **0.43** | 0.31 | 94.4 | 58.9 |
| BR00115133highexp | **0.45** | 0.38 | **0.38** | 0.31 | 88.9 | 60 |
| BR00115128 | **0.5** | 0.39 | **0.47** | 0.32 | 94.4 | 61.1 |
| BR00115129 | **0.49** | 0.38 | **0.45** | 0.32 | 97.8 | 52.2 |
| BR00115126 | 0.3 | **0.32** | **0.29** | 0.28 | 48.9 | **53.3** |

Table Stain2

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00112201 | **0.8** | 0.4 | **0.58** | 0.32 | 100 | 66.7 |
| BR00112198 | **0.77** | 0.35 | **0.55** | 0.3 | 100 | 56.7 |
| BR00112204 | **0.8** | 0.35 | **0.53** | 0.29 | 100 | 58.9 |
| _Validation plates_ | | | | | | |
| BR00112202 | **0.59** | 0.34 | **0.49** | 0.3 | 100 | 54.4 |
| BR00112197standard | **0.58** | 0.4 | **0.49** | 0.28 | 97.8 | 56.7 |
| BR00112197binned | **0.49** | 0.41 | **0.43** | 0.3 | 87.8 | 58.9 |
| BR00112197repeat | **0.57** | 0.41 | **0.5** | 0.31 | 95.6 | 63.3 |

EchteRobert commented 2 years ago

Two cluster training data (T: S2+S3)

This model is trained on 3 plates from Stain2 and Stain3 at the same time and evaluated on Stain2, Stain3, and Stain4.

Main takeaways

Table Stain4

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| BR00116631highexp | **0.29** | 0.28 | 0.17 | **0.3** | 68.9 | 53.3 |
| BR00116625highexp | **0.37** | 0.32 | 0.26 | **0.28** | 76.7 | 61.1 |
| BR00116628highexp | **0.34** | 0.32 | 0.22 | **0.31** | 80 | 57.8 |
| BR00116625 | **0.36** | 0.31 | 0.27 | **0.29** | 76.7 | 58.9 |
| BR00116630highexp | **0.36** | 0.29 | 0.23 | **0.3** | 78.9 | 58.9 |
| BR00116631 | **0.32** | 0.3 | 0.17 | **0.28** | 65.6 | 57.8 |
| BR00116629highexp | **0.36** | 0.29 | 0.26 | **0.29** | 81.1 | 52.2 |
| BR00116627highexp | **0.36** | 0.31 | 0.25 | **0.27** | 78.9 | 56.7 |
| BR00116627 | **0.35** | 0.3 | 0.26 | **0.29** | 75.6 | 56.7 |
| BR00116629 | **0.36** | 0.3 | 0.21 | **0.29** | 74.4 | 52.2 |
| BR00116628 | **0.33** | 0.32 | 0.19 | **0.29** | 72.2 | 58.9 |

Table Stain3

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00115134 | **0.79** | 0.37 | **0.4** | 0.33 | 98.9 | 58.9 |
| BR00115125 | **0.73** | 0.36 | **0.42** | 0.29 | 98.9 | 54.4 |
| BR00115133highexp | **0.8** | 0.38 | **0.37** | 0.31 | 97.8 | 60 |
| _Validation plates_ | | | | | | |
| BR00115131 | **0.54** | 0.38 | **0.43** | 0.29 | 97.8 | 58.9 |
| BR00115126 | **0.34** | 0.32 | **0.3** | 0.28 | 57.8 | 53.3 |
| BR00115133 | **0.56** | 0.38 | **0.33** | 0.3 | 97.8 | 62.2 |
| BR00115127 | **0.58** | 0.38 | **0.45** | 0.31 | 96.7 | 58.9 |
| BR00115128 | **0.53** | 0.39 | **0.49** | 0.32 | 97.8 | 61.1 |
| BR00115129 | **0.55** | 0.38 | **0.45** | 0.32 | 98.9 | 52.2 |
| BR00115128highexp | **0.52** | 0.4 | **0.46** | 0.33 | 100 | 58.9 |
| BR00115125highexp | **0.54** | 0.37 | **0.33** | 0.31 | 95.6 | 55.6 |

Table Stain2

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00112198 | **0.74** | 0.35 | **0.54** | 0.3 | 100 | 56.7 |
| BR00112204 | **0.75** | 0.35 | **0.51** | 0.29 | 100 | 58.9 |
| BR00112201 | **0.75** | 0.4 | **0.51** | 0.32 | 100 | 66.7 |
| _Validation plates_ | | | | | | |
| BR00112202 | **0.55** | 0.34 | **0.42** | 0.3 | 97.8 | 54.4 |
| BR00112197standard | **0.59** | 0.4 | **0.5** | 0.28 | 95.6 | 56.7 |
| BR00112197repeat | **0.58** | 0.41 | **0.55** | 0.31 | 96.7 | 63.3 |
| BR00112197binned | **0.55** | 0.41 | **0.51** | 0.3 | 93.3 | 58.9 |

EchteRobert commented 2 years ago

Three cluster training data (6 plates)

This model is trained on 2 plates from Stain2, Stain3, and Stain4 and evaluated on all the remaining plates within their clusters.

Main takeaways

Table Stain4

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00116625highexp | **0.82** | 0.32 | **0.36** | 0.28 | 97.8 | 61.1 |
| BR00116628highexp | **0.85** | 0.32 | **0.3** | 0.31 | 98.9 | 57.8 |
| _Validation plates_ | | | | | | |
| BR00116625 | **0.59** | 0.31 | **0.31** | 0.29 | 96.7 | 58.9 |
| BR00116630highexp | **0.48** | 0.29 | 0.28 | **0.3** | 91.1 | 58.9 |
| BR00116629highexp | **0.5** | 0.29 | **0.31** | 0.29 | 95.6 | 52.2 |
| BR00116627highexp | **0.54** | 0.31 | **0.36** | 0.27 | 97.8 | 56.7 |
| BR00116627 | **0.51** | 0.3 | **0.34** | 0.29 | 95.6 | 56.7 |
| BR00116629 | **0.49** | 0.3 | **0.31** | 0.29 | 94.4 | 52.2 |
| BR00116628 | **0.55** | 0.32 | 0.24 | **0.29** | 96.7 | 58.9 |
| BR00116631highexp | **0.41** | 0.28 | 0.24 | **0.3** | 86.7 | 53.3 |
| BR00116631 | **0.45** | 0.3 | 0.24 | **0.28** | 93.3 | 57.8 |

Table Stain3

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00115134 | **0.88** | 0.37 | **0.42** | 0.33 | 98.9 | 58.9 |
| BR00115125 | **0.83** | 0.36 | **0.43** | 0.29 | 98.9 | 54.4 |
| _Validation plates_ | | | | | | |
| BR00115128highexp | **0.53** | 0.4 | **0.45** | 0.33 | 94.4 | 58.9 |
| BR00115125highexp | **0.57** | 0.37 | **0.36** | 0.31 | 98.9 | 55.6 |
| BR00115131 | **0.53** | 0.38 | **0.44** | 0.29 | 97.8 | 58.9 |
| BR00115126 | **0.35** | 0.32 | **0.34** | 0.28 | 64.4 | 53.3 |
| BR00115133 | **0.45** | 0.38 | **0.32** | 0.3 | 82.2 | 62.2 |
| BR00115127 | **0.57** | 0.38 | **0.47** | 0.31 | 96.7 | 58.9 |
| BR00115133highexp | **0.46** | 0.38 | 0.3 | **0.31** | 87.8 | 60 |
| BR00115128 | **0.54** | 0.39 | **0.42** | 0.32 | 95.6 | 61.1 |
| BR00115129 | **0.56** | 0.38 | **0.44** | 0.32 | 94.4 | 52.2 |

Table Stain2

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00112198 | **0.82** | 0.35 | **0.5** | 0.3 | 100 | 56.7 |
| BR00112201 | **0.82** | 0.4 | **0.53** | 0.32 | 100 | 66.7 |
| _Validation plates_ | | | | | | |
| BR00112202 | **0.55** | 0.34 | **0.46** | 0.3 | 97.8 | 54.4 |
| BR00112197standard | **0.57** | 0.4 | **0.45** | 0.28 | 95.6 | 56.7 |
| BR00112197repeat | **0.57** | 0.41 | **0.49** | 0.31 | 94.4 | 63.3 |
| BR00112204 | **0.55** | 0.35 | **0.48** | 0.29 | 98.9 | 58.9 |
| BR00112197binned | **0.52** | 0.41 | **0.45** | 0.3 | 93.3 | 58.9 |

EchteRobert commented 2 years ago

Three cluster training data (9 plates)

This model is trained on 3 plates from Stain2, Stain3, and Stain4 and evaluated on all the remaining plates within their clusters.

Main takeaways

For a complete discussion of all trained models, see the comment below.

Table Stain4

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00116625highexp | **0.71** | 0.32 | **0.37** | 0.28 | 98.9 | 61.1 |
| BR00116628highexp | **0.72** | 0.32 | **0.35** | 0.31 | 98.9 | 57.8 |
| BR00116629highexp | **0.7** | 0.29 | **0.34** | 0.29 | 98.9 | 52.2 |
| _Validation plates_ | | | | | | |
| BR00116625 | **0.58** | 0.31 | **0.37** | 0.29 | 96.7 | 58.9 |
| BR00116630highexp | **0.53** | 0.29 | **0.32** | 0.3 | 97.8 | 58.9 |
| BR00116627highexp | **0.54** | 0.31 | **0.37** | 0.27 | 97.8 | 56.7 |
| BR00116627 | **0.53** | 0.3 | **0.34** | 0.29 | 96.7 | 56.7 |
| BR00116629 | **0.57** | 0.3 | **0.33** | 0.29 | 97.8 | 52.2 |
| BR00116628 | **0.57** | 0.32 | **0.3** | 0.29 | 97.8 | 58.9 |
| BR00116631highexp | **0.45** | 0.28 | 0.26 | **0.3** | 92.2 | 53.3 |
| BR00116631 | **0.48** | 0.3 | 0.26 | **0.28** | 95.6 | 57.8 |

Table Stain3

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00115134 | **0.73** | 0.37 | **0.44** | 0.33 | 98.9 | 58.9 |
| BR00115125 | **0.69** | 0.36 | **0.44** | 0.29 | 98.9 | 54.4 |
| BR00115133highexp | **0.72** | 0.38 | **0.41** | 0.31 | 100 | 60 |
| _Validation plates_ | | | | | | |
| BR00115128highexp | **0.58** | 0.4 | **0.49** | 0.33 | 100 | 58.9 |
| BR00115125highexp | **0.58** | 0.37 | **0.38** | 0.31 | 98.9 | 55.6 |
| BR00115131 | **0.56** | 0.38 | **0.5** | 0.29 | 98.9 | 58.9 |
| BR00115126 | **0.33** | 0.32 | **0.32** | 0.28 | 57.8 | 53.3 |
| BR00115133 | **0.56** | 0.38 | **0.39** | 0.3 | 97.8 | 62.2 |
| BR00115127 | **0.59** | 0.38 | **0.49** | 0.31 | 98.9 | 58.9 |
| BR00115128 | **0.57** | 0.39 | **0.53** | 0.32 | 100 | 61.1 |
| BR00115129 | **0.58** | 0.38 | **0.5** | 0.32 | 100 | 52.2 |

Table Stain2

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00112204 | **0.69** | 0.35 | **0.54** | 0.29 | 100 | 58.9 |
| BR00112201 | **0.72** | 0.4 | **0.52** | 0.32 | 100 | 66.7 |
| BR00112198 | **0.68** | 0.35 | **0.52** | 0.3 | 100 | 56.7 |
| _Validation plates_ | | | | | | |
| BR00112197repeat | **0.58** | 0.41 | **0.52** | 0.31 | 97.8 | 63.3 |
| BR00112197binned | **0.53** | 0.41 | **0.48** | 0.3 | 93.3 | 58.9 |
| BR00112202 | **0.57** | 0.34 | **0.48** | 0.3 | 98.9 | 54.4 |
| BR00112197standard | **0.6** | 0.4 | **0.47** | 0.28 | 96.7 | 56.7 |

EchteRobert commented 2 years ago

Model cross analysis

Here I compare all trained models described in the previous comments.
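The comparison reported in this comment can be sketched in three steps: summarize each model's per-plate validation mAP, rank the models on each statistic, and average the ranks over the three Stain datasets. The snippet below is a minimal sketch of that procedure, not the code used for the issue. The function names are hypothetical, and the tie handling (tied scores share the better rank) is an assumption inferred from the reported ranks. The reported ranks appear to be computed on unrounded values, so ranking the rounded table entries can differ where rounded values tie.

```python
from statistics import mean, median


def summarize(per_plate_map):
    """Summary statistics of one model's per-plate validation mAP."""
    return {
        "mean": mean(per_plate_map),
        "median": median(per_plate_map),
        "min": min(per_plate_map),
        "max": max(per_plate_map),
    }


def competition_rank(scores):
    """Rank models from best (1) to worst; tied scores share the better rank."""
    return {
        name: 1 + sum(other > score for other in scores.values())
        for name, score in scores.items()
    }


# Per-plate Stain2 validation mAP of the S3+S4 model (from its results table).
s3s4_on_stain2 = [0.38, 0.41, 0.40, 0.37, 0.46, 0.39, 0.41]
stats = summarize(s3s4_on_stain2)

# Stain2 "Min" column for all six models, as reported in this comment.
stain2_min = {
    "S3+S4": 0.37,
    "S2+S4": 0.43,
    "S2+S3": 0.42,
    "S2+S3+S4 (6 plates)": 0.45,
    "S2+S3+S4 (9 plates)": 0.47,
    "S2": 0.37,
}
min_ranks = competition_rank(stain2_min)

# Average the S3+S4 "Mean rank" over Stain2, Stain3 and Stain4 (6, 2 and 1).
s3s4_average_mean_rank = mean([6, 2, 1])
```

With these inputs, `stats` rounds to the Stain2 row reported for S3+S4, and `min_ranks` matches the Stain2 "Min rank" column.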

Main takeaways

Average rank across metrics

| Model name | Average rank |
|---|---|
| S3+S4 | 2.92 |
| S2+S4 | 2.83 |
| S2+S3 | 3.67 |
| S2+S3+S4 (6 plates) | 3.58 |
| S2+S3+S4 (9 plates) | 1.92 |
| Individual cluster | 4.92 |

Stain2 validation mAP evaluation

| **Model name** | **Mean** | **Median** | **Min** | **Max** | **Mean rank** | **Median rank** | **Min rank** | **Max rank** |
|---|---|---|---|---|---|---|---|---|
| _S3+S4_ | 0.40 | 0.40 | 0.37 | 0.46 | 6.00 | 6.00 | 5.00 | 6.00 |
| **S2+S4** | 0.51 | 0.50 | 0.43 | 0.58 | 1.00 | 3.00 | 3.00 | 1.00 |
| S2+S3 | 0.51 | 0.51 | 0.42 | 0.55 | 2.00 | 2.00 | 4.00 | 2.00 |
| S2+S3+S4 (6 plates) | 0.48 | 0.48 | 0.45 | 0.53 | 4.00 | 4.00 | 2.00 | 4.00 |
| **S2+S3+S4 (9 plates)** | 0.50 | 0.52 | 0.47 | 0.54 | 3.00 | 1.00 | 1.00 | 3.00 |
| S2 | 0.44 | 0.45 | 0.37 | 0.49 | 5.00 | 5.00 | 5.00 | 5.00 |

Stain3 validation mAP evaluation

| **Model name** | **Mean** | **Median** | **Min** | **Max** | **Mean rank** | **Median rank** | **Min rank** | **Max rank** |
|---|---|---|---|---|---|---|---|---|
| S3+S4 | 0.42 | 0.42 | 0.33 | 0.47 | 2.00 | 2.00 | 1.00 | 3.00 |
| S2+S4 | 0.39 | 0.38 | 0.29 | 0.47 | 5.00 | 5.00 | 5.00 | 3.00 |
| S2+S3 | 0.40 | 0.42 | 0.30 | 0.49 | 3.00 | 2.00 | 3.00 | 2.00 |
| S2+S3+S4 (6 plates) | 0.40 | 0.42 | 0.30 | 0.47 | 4.00 | 2.00 | 3.00 | 3.00 |
| **S2+S3+S4 (9 plates)** | 0.45 | 0.44 | 0.32 | 0.53 | 1.00 | 1.00 | 2.00 | 1.00 |
| _S3_ | 0.37 | 0.38 | 0.29 | 0.44 | 6.00 | 5.00 | 5.00 | 6.00 |

Stain4 validation mAP evaluation

| Model name | Mean | Median | Min | Max | Mean rank | Median rank | Min rank | Max rank |
|---|---|---|---|---|---|---|---|---|
| **S3+S4** | 0.46 | 0.45 | 0.38 | 0.52 | 1.00 | 1.00 | 1.00 | 1.00 |
| S2+S4 | 0.33 | 0.34 | 0.26 | 0.38 | 2.00 | 2.00 | 2.00 | 2.00 |
| _S2+S3_ | 0.23 | 0.23 | 0.17 | 0.27 | 6.00 | 6.00 | 6.00 | 6.00 |
| S2+S3+S4 (6 plates) | 0.30 | 0.31 | 0.24 | 0.36 | 5.00 | 4.00 | 4.00 | 4.00 |
| S2+S3+S4 (9 plates) | 0.33 | 0.34 | 0.26 | 0.37 | 3.00 | 2.00 | 2.00 | 3.00 |
| S4 | 0.30 | 0.31 | 0.21 | 0.36 | 4.00 | 4.00 | 5.00 | 4.00 |

Rank order analysis

| Model name | Average mean rank | Average median rank | Average min rank | Average max rank |
|---|---|---|---|---|
| S3+S4 | 3.00 | 3.00 | 2.33 | 3.33 |
| S2+S4 | 2.67 | 3.33 | 3.33 | 2.00 |
| S2+S3 | 3.67 | 3.33 | 4.33 | 3.33 |
| S2+S3+S4 (6 plates) | 4.33 | 3.33 | 3.00 | 3.67 |
| **S2+S3+S4 (9 plates)** | 2.33 | 1.33 | 1.67 | 2.33 |
| Individual cluster | 5.00 | 4.67 | 5.00 | 5.00 |

Extra plate analysis (2 from Stain2 and 1 from Stain4)

_S2+S3+S4 (9 plates)_

| plate | Training mAP model | Validation mAP model | PR model |
|:---|---:|---:|---:|
| BR00116634bin1 | 0.34 | 0.18 | 71.1 |
| BR00113818 | 0.44 | 0.31 | 87.8 |
| BR00113820 | 0.39 | 0.34 | 78.9 |

_S2+S3+S4 (6 plates)_

| plate | Training mAP model | Validation mAP model | PR model |
|:---|---:|---:|---:|
| BR00116634bin1 | 0.3 | 0.2 | 71.1 |
| BR00113818 | 0.4 | 0.29 | 90 |
| BR00113820 | 0.36 | 0.33 | 77.8 |

_S2+S3_

| plate | Training mAP model | Validation mAP model | PR model |
|:---|---:|---:|---:|
| BR00116634bin1 | 0.26 | 0.13 | 66.7 |
| BR00113818 | 0.42 | 0.29 | 87.8 |
| BR00113820 | 0.35 | 0.31 | 74.4 |

_S2+S4_

| plate | Training mAP model | Validation mAP model | PR model |
|:---|---:|---:|---:|
| BR00116634bin1 | 0.35 | 0.19 | 70 |
| BR00113818 | 0.4 | 0.28 | 82.2 |
| BR00113820 | 0.35 | 0.29 | 67.8 |

_S3+S4_

| plate | Training mAP model | Validation mAP model | PR model |
|:---|---:|---:|---:|
| BR00116634bin1 | 0.32 | 0.17 | 67.8 |
| BR00113818 | 0.36 | 0.25 | 81.1 |
| BR00113820 | 0.33 | 0.31 | 81.1 |

EchteRobert commented 2 years ago

Three cluster training data (12 plates)

As a final test, to see if increasing the number of training plates increases performance on validation compounds and plates, I train a model with 4 plates from Stain2, Stain3, and Stain4.

Main takeaways

Adding this model to the rank analysis from the previous comment, we see that increasing the number of plates does indeed increase the average validation mAP. However, this comparison is biased: as more plates serve as training data, their validation mAP is still included in these calculations. The model is even starting to generalize to the outlier plates in Stain4.

| Model name | Average mean rank | Average median rank | Average min rank | Average max rank | Average |
|---|---|---|---|---|---|
| S3+S4 | 3.67 | 3.67 | 2.67 | 4.00 | 3.50 |
| S2+S4 | 3.67 | 4.33 | 4.33 | 2.67 | 3.75 |
| S2+S3 | 4.67 | 4.33 | 5.33 | 4.00 | 4.58 |
| S2+S3+S4 (6 plates) | 5.33 | 4.33 | 3.67 | 4.67 | 4.50 |
| S2+S3+S4 (9 plates) | 3.33 | 2.00 | 2.00 | 3.00 | 2.58 |
| Individual cluster | 6.00 | 5.67 | 6.00 | 6.00 | 5.92 |
| S2+S3+S4 (12 plates) | 1.33 | 1.33 | 2.00 | 2.00 | 1.67 |

Table Stain4

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00116628 | **0.81** | 0.32 | **0.33** | 0.29 | 98.9 | 58.9 |
| BR00116625highexp | **0.76** | 0.32 | **0.38** | 0.28 | 98.9 | 61.1 |
| BR00116628highexp | **0.8** | 0.32 | **0.38** | 0.31 | 98.9 | 57.8 |
| BR00116629highexp | **0.74** | 0.29 | **0.39** | 0.29 | 100 | 52.2 |
| _Validation plates_ | | | | | | |
| BR00116625 | **0.63** | 0.31 | **0.4** | 0.29 | 98.9 | 58.9 |
| BR00116630highexp | **0.56** | 0.29 | **0.34** | 0.3 | 96.7 | 58.9 |
| BR00116627highexp | **0.58** | 0.31 | **0.38** | 0.27 | 96.7 | 56.7 |
| BR00116627 | **0.56** | 0.3 | **0.38** | 0.29 | 97.8 | 56.7 |
| BR00116629 | **0.64** | 0.3 | **0.36** | 0.29 | 100 | 52.2 |
| BR00116631highexp | **0.5** | 0.28 | 0.27 | **0.3** | 91.1 | 53.3 |
| BR00116631 | **0.52** | 0.3 | **0.29** | 0.28 | 94.4 | 57.8 |

Table Stain3

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00115134 | **0.75** | 0.37 | **0.46** | 0.33 | 98.9 | 58.9 |
| BR00115125 | **0.68** | 0.36 | **0.48** | 0.29 | 98.9 | 54.4 |
| BR00115133highexp | **0.79** | 0.38 | **0.41** | 0.31 | 97.8 | 60 |
| BR00115133 | **0.79** | 0.38 | **0.43** | 0.3 | 97.8 | 62.2 |
| _Validation plates_ | | | | | | |
| BR00115131 | **0.58** | 0.38 | **0.48** | 0.29 | 97.8 | 58.9 |
| BR00115126 | **0.36** | 0.32 | **0.32** | 0.28 | 58.9 | 53.3 |
| BR00115127 | **0.6** | 0.38 | **0.51** | 0.31 | 98.9 | 58.9 |
| BR00115128 | **0.57** | 0.39 | **0.54** | 0.32 | 100 | 61.1 |
| BR00115129 | **0.59** | 0.38 | **0.5** | 0.32 | 98.9 | 52.2 |
| BR00115128highexp | **0.57** | 0.4 | **0.49** | 0.33 | 98.9 | 58.9 |
| BR00115125highexp | **0.55** | 0.37 | **0.41** | 0.31 | 98.9 | 55.6 |

Table Stain2

| Model name | Average mean rank | Average median rank | Average min rank | Average max rank | Average |
|---|---|---|---|---|---|
| S3+S4 | 4.67 | 4.67 | 3.00 | 4.33 | 4.17 |
| S2+S4 | 3.33 | 4.00 | 4.00 | 2.33 | 3.42 |
| S2+S3 | 4.67 | 4.33 | 5.33 | 4.00 | 4.58 |
| S2+S3+S4 (6 plates) | 5.33 | 4.33 | 3.67 | 4.67 | 4.50 |
| S2+S3+S4 (9 plates) | 3.00 | 1.67 | 1.67 | 3.00 | 2.33 |
| Individual cluster | 6.00 | 5.67 | 6.00 | 6.00 | 5.92 |
| **S2+S3+S4 (12 plates)** | 1.00 | 1.00 | 1.67 | 1.67 | **1.33** |

Number of training plates versus mean validation mAP

_We see some saturation in validation mAP for Stain2 and Stain3, which is consistent with the higher validation mAP I have been getting for these datasets. Stain4 can still improve, which is also in line with what I have been observing: Stain4 seems to be a more difficult dataset. What *difficult* means exactly remains to be answered._

_The error bars in the plot indicate the minimum and maximum validation mAP observed for a plate, so they are not robust to outliers._

![NrTrainingPlatesVSvalidationmAP](https://user-images.githubusercontent.com/62173977/166962095-3ce525e1-de7a-415c-b8ce-14c52e93e1a0.png)
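The bias mentioned in this comment can be checked by comparing the mean validation mAP over all plates with the mean over held-out plates only. The snippet below is a rough sketch using the rounded Stain4 values reported for the 12-plate model; the variable names are hypothetical, and with two-decimal inputs the difference is only indicative.

```python
from statistics import mean

# Stain4 validation mAP (model) for the 12-plate model, copied from its table.
training_plates = {
    "BR00116628": 0.33,
    "BR00116625highexp": 0.38,
    "BR00116628highexp": 0.38,
    "BR00116629highexp": 0.39,
}
validation_plates = {
    "BR00116625": 0.40,
    "BR00116630highexp": 0.34,
    "BR00116627highexp": 0.38,
    "BR00116627": 0.38,
    "BR00116629": 0.36,
    "BR00116631highexp": 0.27,
    "BR00116631": 0.29,
}

# Mean over every plate, which includes plates the model was trained on.
all_plates_mean = mean(
    list(training_plates.values()) + list(validation_plates.values())
)

# Mean over held-out plates only, which avoids that overlap.
held_out_mean = mean(list(validation_plates.values()))

print(f"all plates: {all_plates_mean:.3f}, held-out only: {held_out_mean:.3f}")
```

Reporting the held-out mean next to the all-plate mean makes models trained on different numbers of plates easier to compare.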
EchteRobert commented 2 years ago

Training plate influence

To test the influence of which training plates are used on model generalization, I switched up all the training plates and added 3 outlier plates (according to the PC1 loading correlations) as well. I then trained the model in the same way as previous models. Note that comparing the performance of the models is now even harder as the validation plates are completely different.

Main takeaway

It appears that, as long as enough training plates are used (i.e. at least 12 here), the model is able to learn a general method of aggregation for different types of analysis pipelines, no matter which training plates are used. That said, I do think that using plates from different Stains (which differ quite a lot in terms of feature importances) is beneficial to generalization.

Results

Stain2 table

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00112202 | **0.7** | 0.34 | **0.56** | 0.3 | 100 | 54.4 |
| BR00112197binned | **0.73** | 0.41 | **0.56** | 0.3 | 98.9 | 58.9 |
| _Validation plates_ | | | | | | |
| BR00112197standard | **0.65** | 0.4 | **0.53** | 0.28 | 97.8 | 56.7 |
| BR00112198 | **0.63** | 0.35 | **0.54** | 0.3 | 100 | 56.7 |
| BR00112197repeat | **0.61** | 0.41 | **0.53** | 0.31 | 98.9 | 63.3 |
| BR00112204 | **0.62** | 0.35 | **0.53** | 0.29 | 98.9 | 58.9 |
| BR00112201 | **0.68** | 0.4 | **0.54** | 0.32 | 100 | 66.7 |

Stain3 table

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00115128 | **0.69** | 0.39 | **0.56** | 0.32 | 100 | 61.1 |
| BR00115125highexp | **0.68** | 0.37 | **0.41** | 0.31 | 98.9 | 55.6 |
| BR00115133highexp | **0.75** | 0.38 | **0.47** | 0.31 | 98.9 | 60 |
| BR00115131 | **0.68** | 0.38 | **0.54** | 0.29 | 100 | 58.9 |
| _Validation plates_ | | | | | | |
| BR00115128highexp | **0.64** | 0.4 | **0.59** | 0.33 | 98.9 | 58.9 |
| BR00115134 | **0.62** | 0.37 | **0.48** | 0.33 | 97.8 | 58.9 |
| BR00115125 | **0.61** | 0.36 | **0.46** | 0.29 | 100 | 54.4 |
| BR00115126 | **0.38** | 0.32 | **0.36** | 0.28 | 68.9 | 53.3 |
| BR00115133 | **0.65** | 0.38 | **0.42** | 0.3 | 97.8 | 62.2 |
| BR00115127 | **0.63** | 0.38 | **0.52** | 0.31 | 100 | 58.9 |
| BR00115129 | **0.59** | 0.38 | **0.55** | 0.32 | 100 | 52.2 |

Stain4 table

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00116631 | **0.65** | 0.3 | **0.32** | 0.28 | 96.7 | 57.8 |
| BR00116627 | **0.69** | 0.3 | **0.39** | 0.29 | 97.8 | 56.7 |
| BR00116630highexp | **0.69** | 0.29 | **0.4** | 0.3 | 96.7 | 58.9 |
| _Validation plates_ | | | | | | |
| BR00116631highexp | **0.6** | 0.28 | **0.3** | 0.3 | 95.6 | 53.3 |
| BR00116625highexp | **0.61** | 0.32 | **0.42** | 0.28 | 98.9 | 61.1 |
| BR00116628highexp | **0.64** | 0.32 | **0.37** | 0.31 | 97.8 | 57.8 |
| BR00116625 | **0.58** | 0.31 | **0.39** | 0.29 | 97.8 | 58.9 |
| BR00116629highexp | **0.64** | 0.29 | **0.41** | 0.29 | 97.8 | 52.2 |
| BR00116627highexp | **0.62** | 0.31 | **0.48** | 0.27 | 98.9 | 56.7 |
| BR00116629 | **0.62** | 0.3 | **0.37** | 0.29 | 97.8 | 52.2 |
| BR00116628 | **0.62** | 0.32 | **0.31** | 0.29 | 96.7 | 58.9 |

Outlier plates table

| plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM |
|:---|---:|---:|---:|---:|---:|---:|
| _Training plates_ | | | | | | |
| BR00116634bin1 | **0.59** | 0.24 | **0.31** | 0.18 | 96.7 | 53.3 |
| BR00113818 | **0.69** | 0.28 | **0.45** | 0.29 | 96.7 | 52.2 |
| BR00113820 | **0.67** | 0.3 | **0.45** | 0.3 | 96.7 | 55.6 |

EchteRobert commented 2 years ago

Aggregated profile UMAP analysis

Now that the model is getting consistent results on Stain2, Stain3, and Stain4, I want to do some qualitative analyses to investigate what the model is learning and what it outputs. First up are UMAPs of the model-aggregated well profiles of the validation compounds for Stain2, Stain3, and Stain4.

Main takeaways

The model ignores batch effects for strong-signal compounds and clusters them nicely. Mean aggregation also clusters strong-signal compounds reasonably well while ignoring plate effects, but its clusters are much less separated than the model's clusters.

UMAPs Stain2 _Mean aggregation_ ![Screen Shot 2022-05-11 at 3 52 31 PM](https://user-images.githubusercontent.com/62173977/167934120-c90cbb4f-eda0-4692-b579-f1383c7dbf3a.png) ![Screen Shot 2022-05-11 at 3 52 43 PM](https://user-images.githubusercontent.com/62173977/167934151-ca65819a-939e-426b-b8c5-1494be2a4436.png) _Model aggregation_ ![Screen Shot 2022-05-11 at 3 52 59 PM](https://user-images.githubusercontent.com/62173977/167934192-f9e3f42e-78c4-4b99-b124-9b9d23065255.png) ![Screen Shot 2022-05-11 at 3 53 13 PM](https://user-images.githubusercontent.com/62173977/167934226-b5159a7b-6459-495d-b683-9788cc387f20.png)
UMAPs Stain3 _Mean aggregation_ ![Screen Shot 2022-05-11 at 3 50 39 PM](https://user-images.githubusercontent.com/62173977/167933818-1f3f467e-4aa4-41f7-8541-c14050a2cf1f.png) ![Screen Shot 2022-05-11 at 3 50 51 PM](https://user-images.githubusercontent.com/62173977/167933853-28f08017-86e8-4cff-9afd-b58e99e2377f.png) _Model aggregation_ ![Screen Shot 2022-05-11 at 3 51 06 PM](https://user-images.githubusercontent.com/62173977/167933886-fdb7afed-5e9b-4022-bd16-8809d2dc5c56.png) ![Screen Shot 2022-05-11 at 3 51 17 PM](https://user-images.githubusercontent.com/62173977/167933909-89624194-b9e9-493c-a54c-7ee2d1d32ab6.png)
UMAPs Stain4 _Mean aggregation_ ![Screen Shot 2022-05-11 at 3 48 25 PM](https://user-images.githubusercontent.com/62173977/167933473-0db1f10e-68ac-4eae-83e6-7afde91d42c8.png) ![Screen Shot 2022-05-11 at 3 48 48 PM](https://user-images.githubusercontent.com/62173977/167933522-85b23c26-2c14-445b-af25-5a7e6a84a895.png) _Model aggregation_ ![Screen Shot 2022-05-11 at 3 49 00 PM](https://user-images.githubusercontent.com/62173977/167933561-afc808b5-3a9c-445c-8e89-9bdb4c955a58.png) ![Screen Shot 2022-05-11 at 3 49 11 PM](https://user-images.githubusercontent.com/62173977/167933589-187d0097-9347-40c8-9eec-3de9423ec176.png)
EchteRobert commented 2 years ago

Cell saliency analysis

In continuation of the previous experiment, I visualized the saliency of cells (i.e. the summed gradients over all features with respect to the SupConLoss over all wells). With this visualization I attempt to show how the model selects certain cells over others. I visualize 3 compounds here that are poorly profiled by the mean (~0.3 mAP), while they are strongly profiled by the model (~0.9 mAP): sirolimus (red), skepinone-l (green), and purmorphamine (viridis). From each compound I take two wells to visualize.
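A per-cell saliency of this kind can be sketched as follows. This is a minimal, dependency-free illustration and not the code used here: gradients are approximated with central finite differences (in practice the framework's automatic differentiation would be used), `toy_loss` is a hypothetical stand-in for the model plus SupConLoss, and summing absolute gradient values over features is an assumption.

```python
def cell_saliency(well, loss_fn, eps=1e-4):
    """Per-cell saliency: sum over features of |d loss / d feature|.

    `well` is a list of cells, each cell a list of feature values.
    Gradients are approximated by central finite differences.
    """
    work = [list(cell) for cell in well]  # copy so the input is not modified
    saliency = []
    for i, cell in enumerate(work):
        total = 0.0
        for j, original in enumerate(cell):
            work[i][j] = original + eps
            loss_up = loss_fn(work)
            work[i][j] = original - eps
            loss_down = loss_fn(work)
            work[i][j] = original  # restore before the next feature
            total += abs((loss_up - loss_down) / (2 * eps))
        saliency.append(total)
    return saliency


def toy_loss(well):
    """Hypothetical stand-in for the real loss: mean squared projection."""
    weights = [1.0, -1.0]
    projections = [sum(w * x for w, x in zip(weights, cell)) for cell in well]
    return sum(p * p for p in projections) / len(well)


example_well = [[1.0, 0.0], [0.0, 0.0], [3.0, 0.0]]
example_saliency = cell_saliency(example_well, toy_loss)
```

In this toy example the third cell receives the highest saliency because it contributes most to the loss, and the all-zero cell receives none.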

Main takeaways

Next up

Perhaps tracing back these cells to the images will give us more insight into what the model is learning.

Cell saliency visualization _Brighter colours indicate more saliency according to the model; darker colours mean less saliency._ ![Screen Shot 2022-05-11 at 4 02 28 PM](https://user-images.githubusercontent.com/62173977/167936407-c81a71a9-592c-4d9f-96d4-516395b906d6.png) _sirolimus_ ![Screen Shot 2022-05-11 at 4 23 56 PM](https://user-images.githubusercontent.com/62173977/167941331-478d0d79-bd7b-4db5-8838-423931505ad8.png) _skepinone-l_ ![Screen Shot 2022-05-11 at 4 24 07 PM](https://user-images.githubusercontent.com/62173977/167941357-11d3efba-acfc-468e-9065-aae48b19f377.png) _purmorphamine_ ![Screen Shot 2022-05-11 at 4 24 52 PM](https://user-images.githubusercontent.com/62173977/167941468-c9fb2221-d834-4c3c-9f17-d9f1642a7b8e.png)
Saliency threshold _The saliency values are normalized per well and thresholded to be above 0.8_ ![Screen Shot 2022-05-11 at 4 33 42 PM](https://user-images.githubusercontent.com/62173977/167942790-37b17ec7-8e41-4f3b-9c5a-ad7a9e467525.png)
EchteRobert commented 2 years ago

Cell saliencies overlay over complete FOV

Here, I show the raw images of a purmorphamine well (M08) in plate BR00112197binned (Stain2). Stain2 only contains 4 images so the FOV is larger than for the other Stain datasets. I use green and red boxes to denote high (>0.8) and low (<0.2) saliency cells. Perhaps in the future I will find a better way of visualizing these cells as the overlay impedes the visual analysis of those cells. I am only showing one FOV here, but it's split into 4 sections for inspection purposes.
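As a rough illustration of the high/low split described above, the sketch below scales saliency per well and labels cells with the >0.8 and <0.2 cut-offs. The min-max scaling and the function names (`normalize_per_well`, `label_cells`) are assumptions made for this example; the thread only states that saliency values are normalized per well.

```python
import numpy as np

def normalize_per_well(saliency, well_ids):
    """Min-max scale saliency to [0, 1] separately within each well (assumed scheme)."""
    saliency = np.asarray(saliency, dtype=float)
    well_ids = np.asarray(well_ids)
    scaled = np.zeros_like(saliency)
    for well in np.unique(well_ids):
        mask = well_ids == well
        lo, hi = saliency[mask].min(), saliency[mask].max()
        # A well with constant saliency has no spread; leave it at 0.
        if hi > lo:
            scaled[mask] = (saliency[mask] - lo) / (hi - lo)
    return scaled

def label_cells(scaled, high=0.8, low=0.2):
    """Label each cell 'high', 'low', or 'mid' using the cut-offs from the text."""
    labels = np.full(scaled.shape, "mid", dtype=object)
    labels[scaled > high] = "high"
    labels[scaled < low] = "low"
    return labels

# Toy saliency values for two wells (made-up numbers).
saliency = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
wells = ["M08", "M08", "M08", "M09", "M09", "M09"]
scaled = normalize_per_well(saliency, wells)
labels = label_cells(scaled)
```

Scaling within each well keeps a well with globally large gradients from dominating the "high" group.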

The following was outlined by Mehrtash:

To gain more insight into what the model is doing, it would be very useful to "color" several complete FOVs based on the saliency scores and visually inspect them (to begin with). In the least interesting scenario, I suspect that the model might have learned to become a really good QC filter + mean aggregation over the passing cells -- which is still quite interesting, remarkable, and explains why it generalizes to new compounds. Another possibility is that the model might have further learned to pick divergent morphologies (in relevant directions) from the given bunch, come up with a consensus over those, and output the consensus features.

Main takeaways

It seems like the model is mostly looking at cells that are clearly separated, while giving less attention to cells in very crowded spaces. This can be seen in all four FOVs shown below. These images are taken from only one well and one compound, though, so I will need to check other wells and plates to see if this trend persists.

Images here! ![image](https://user-images.githubusercontent.com/62173977/170127521-6819810b-f82e-4bb6-9eee-7f6108275294.png) ![image](https://user-images.githubusercontent.com/62173977/170127765-b12fd163-3086-429a-b499-ba2a8806ddf5.png) ![image](https://user-images.githubusercontent.com/62173977/170127785-4ebfc81c-4a6b-4bd1-bdca-cdeac54a509a.png) ![image](https://user-images.githubusercontent.com/62173977/170127804-d22460d3-f362-426e-8988-226c65af6ad9.png)
EchteRobert commented 2 years ago

Admixing Experiment

The following experiment was outlined by Mehrtash:

Here's a useful experiment to gain more insight about what the network is doing: take a large number of cells from the same compound (and across several plates) and classify them according to saliency score into two groups -- high: top 20% in saliency, and low: bottom 20% in saliency; throw the middle away. Now, make synthetic inputs to your network with different admixtures of high and low saliency cells, e.g. 0 high + 500 low, 1 high + 499 low, 2 high + 498 low, ..., 499 high + 1 low, 500 high + 0 low, in a deterministic way (e.g. add one high, remove one low, rinse and repeat). Take a PCA of the network output over these inputs and plot the first few PCs vs. admixing fraction, with 0 meaning 0 high + 500 low, and 1 meaning 500 high + 0 low. If you see a "gating" behavior w.r.t. admixing fraction, i.e. the PCs jumping up sharply after a threshold of high saliency cells and quickly stabilizing, then the network has definitely learned to ignore low saliency cells. The noise of the output further sheds light on what the network is doing to the high saliency cells: if the network is simply averaging high saliency cells, you'd expect ~ 1/\sqrt(N) noise in the network output, where N is the number of high saliency cells in the input. If the network is doing feature learning and gating, you'd see a faster scaling, e.g. 1/N or faster.
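The 1/\sqrt(N) expectation for plain averaging follows from the variance of a sample mean. The short derivation below assumes, purely for illustration, that a given feature of the N high saliency cells is independent and identically distributed with variance \sigma^2:

```latex
\bar{x}_N = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\operatorname{Var}(\bar{x}_N) = \frac{1}{N^2}\sum_{i=1}^{N}\operatorname{Var}(x_i) = \frac{\sigma^2}{N},
\qquad
\operatorname{sd}(\bar{x}_N) = \frac{\sigma}{\sqrt{N}}
```

Output noise that shrinks faster than this with N would therefore suggest something other than simple averaging.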

I performed this experiment for multiple saliency cut-offs (5, 10, 20, and 40%) and tried different numbers of cells for the admixtures. I eventually settled on using 1000 cells (instead of the 500 mentioned above). Using more cells simply increases the 'resolution' of the figures by creating more datapoints. Note that for this experiment I am using 4 wells from a single plate (instead of multiple). I calculate the X% most salient cells per well and then merge them in one big pool to sample from during the experiment.
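The admixing procedure can be sketched as follows. This is a toy version under stated assumptions: `toy_model` is a placeholder mean aggregator standing in for the trained network, the cells are random numbers, and 50 cells are used instead of 1000 to keep the example small. All names are hypothetical.

```python
import numpy as np

def make_admixtures(high_cells, low_cells, n_cells):
    """Yield (fraction, cells) with k high + (n_cells - k) low cells, k = 0..n_cells.

    Deterministic: each step swaps exactly one low-saliency cell for a
    high-saliency one, so consecutive inputs differ by a single cell.
    """
    for k in range(n_cells + 1):
        cells = np.concatenate([high_cells[:k], low_cells[: n_cells - k]])
        yield k / n_cells, cells

def first_pcs(outputs, n_components=2):
    """Project the rows of `outputs` onto their first principal components."""
    centered = outputs - outputs.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def toy_model(cells):
    """Placeholder for the trained network: plain mean aggregation."""
    return cells.mean(axis=0)

rng = np.random.default_rng(0)
n_cells, n_features = 50, 8
high = rng.normal(loc=1.0, size=(n_cells, n_features))   # stand-in high-saliency pool
low = rng.normal(loc=-1.0, size=(n_cells, n_features))   # stand-in low-saliency pool

fractions, profiles = [], []
for fraction, cells in make_admixtures(high, low, n_cells):
    fractions.append(fraction)
    profiles.append(toy_model(cells))

fractions = np.array(fractions)
pcs = first_pcs(np.array(profiles))  # plot pcs[:, 0] against fractions
```

With mean aggregation the first PC changes roughly linearly with the admixing fraction; a gating network would instead show a sharp jump followed by a plateau.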

I use three types of saliency: gradient-based, distance-based (in loss space), and leave-one-cell-out, named V1, V2, and V3 respectively. V1 is considered noisier, and it does not necessarily point to cells that are the most or least representative of a certain profile. I think it rather points to cells whose features are most influential in creating an aggregated profile that is best positioned in the loss space. The exact definition remains hard to interpret and explain. V2 measures how far each single cell in a set is from the aggregated profile (computed using all cells in the set). Cells further away are considered less salient and cells close by are considered more salient. V3 computes the profile for a well while iteratively leaving one cell out of the set, until there are N profiles for a given well with N cells. Then the supervised contrastive loss is calculated for each of these profiles with respect to the aggregated profiles of all other wells in the plate. This means it has 3 positive pairs and 380 negative pairs. The profiles for which the loss is higher are given a higher saliency and vice versa.
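A minimal sketch of the V2 and V3 ideas, under several assumptions: `aggregate` stands in for the trained model (here a plain mean), each single cell is embedded by aggregating it as a one-cell set, and `toy_loss` is a squared distance to a target profile rather than the supervised contrastive loss. None of these names come from the actual code base.

```python
import numpy as np

def distance_saliency(cells, aggregate):
    """V2-style score: cells closer to the aggregated profile score higher."""
    profile = aggregate(cells)
    # Assumption: embed each cell as a one-cell set so it lives in profile space.
    singles = np.stack([aggregate(cell[None, :]) for cell in cells])
    return -np.linalg.norm(singles - profile, axis=1)

def leave_one_out_saliency(cells, aggregate, loss_fn):
    """V3-style score: loss of the well profile with each cell left out in turn."""
    scores = np.empty(len(cells))
    for i in range(len(cells)):
        held_out = np.delete(cells, i, axis=0)
        # Higher loss without the cell means higher saliency, as in the text.
        scores[i] = loss_fn(aggregate(held_out))
    return scores

def mean_aggregate(cells):
    """Placeholder for the model: mean over cells."""
    return cells.mean(axis=0)

target = np.array([1.0, 0.0])  # stand-in for a replicate well profile

def toy_loss(profile):
    """Stand-in loss: squared distance to the target profile (not SupConLoss)."""
    return float(np.sum((profile - target) ** 2))

cells = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [9.0, 0.0]])
v2 = distance_saliency(cells, mean_aggregate)
v3 = leave_one_out_saliency(cells, mean_aggregate, toy_loss)
```

In this toy well the outlying cell at 9.0 gets the lowest score under both measures: it is furthest from the mean profile, and removing it lowers the stand-in loss the most.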

As a sanity check I also performed this experiment using a cut-off of 100%, i.e. just randomly selecting cells. This last experiment should show no changes as a function of the admixing fraction, because there should be little variance captured in the first few PCs (as all profiles should be more similar).

Main takeaways

Experiment results here (activation layer L1 norm - V0)! ![SaliencyV0_thres05](https://user-images.githubusercontent.com/62173977/172893820-d3b6448c-c750-48c8-aedb-c3c1f306a95e.png) ![SaliencyV0_thres10](https://user-images.githubusercontent.com/62173977/172893824-69336fff-a8bb-4be2-ac56-eba91969ed9e.png) ![SaliencyV0_thres20](https://user-images.githubusercontent.com/62173977/172893825-8b828d91-c16e-44aa-b873-76e228f19d20.png) ![SaliencyV0_thres40](https://user-images.githubusercontent.com/62173977/172893826-7e8efb5a-fcc6-4143-a510-f3a0f4455e9c.png)
Experiment results here (gradient saliency - V1)! ![image](https://user-images.githubusercontent.com/62173977/170350552-dd73def1-5f13-4de4-89aa-7ab2b85ea590.png) ![image](https://user-images.githubusercontent.com/62173977/170350569-1cce4360-1298-4e06-916a-5e46afca1740.png) ![image](https://user-images.githubusercontent.com/62173977/170350577-3f335ec4-4764-4636-ba3a-b1911bb681a9.png) ![image](https://user-images.githubusercontent.com/62173977/170350589-e2919326-0726-4074-9a0e-97e37d91c969.png) ![image](https://user-images.githubusercontent.com/62173977/170350619-8d1d8394-9306-4747-ac72-69b0f43e8fd2.png)
Experiment results here (V0 + V1)! ![SaliencyV0_V1_thres05](https://user-images.githubusercontent.com/62173977/172914064-1bf3d456-4146-4d2a-a300-de125bf8c757.png) ![SaliencyV0_V1_thres10](https://user-images.githubusercontent.com/62173977/172914066-a9fdc143-3e62-4892-a898-eee09d106e57.png) ![SaliencyV0_V1_thres20](https://user-images.githubusercontent.com/62173977/172914069-3ba3cabc-a856-4b64-a227-1312278b22ed.png) ![SaliencyV0_V1_thres40](https://user-images.githubusercontent.com/62173977/172914070-f3cb149b-c537-4789-99e2-cd2b90f8ca92.png)
Experiment results here (distance saliency - V2)! ![image](https://user-images.githubusercontent.com/62173977/170350638-9b8451f6-3c7b-4f32-bd23-7a2adc999489.png) ![image](https://user-images.githubusercontent.com/62173977/170350657-71712c2d-5541-43ab-b338-d14fc452670f.png) ![image](https://user-images.githubusercontent.com/62173977/170350677-c3ab4786-c930-4ef5-bec8-de0c6c5b87d5.png) ![image](https://user-images.githubusercontent.com/62173977/170350702-28c507f0-64bd-4393-ae72-f7cb31a5f41d.png) ![image](https://user-images.githubusercontent.com/62173977/170350721-3eadaaa8-6a25-4248-8309-75152fee90fe.png)
Experiment results here (leave one out saliency - V3)! ![image](https://user-images.githubusercontent.com/62173977/170350748-eea18ed1-ca42-4143-931b-982beab78739.png) ![image](https://user-images.githubusercontent.com/62173977/170350768-d1bb787b-7a97-46f7-8b52-2c52883880ea.png) ![image](https://user-images.githubusercontent.com/62173977/170350787-6bc07b11-dbc9-4901-bc7a-4af8b3bc9551.png) ![image](https://user-images.githubusercontent.com/62173977/170350802-a5025d90-8d78-4d71-822f-d2ce2b327485.png) ![image](https://user-images.githubusercontent.com/62173977/170350821-7474e6e3-e921-4722-8f0e-3a32c60286c2.png)
Updated figure (no random sampling for each fraction) Saliency V0 + V1 ![Screen Shot 2022-06-09 at 2 43 06 PM](https://user-images.githubusercontent.com/62173977/172921121-231f2eef-da75-449b-8ed7-eca8557c011d.png)
EchteRobert commented 2 years ago

Inspecting correlation between saliency and CellProfiler features

_All of the results below are calculated with 'run-20220505221947-1m1zas58' aka the 'Stain234 12 plates outliers' model._

I have updated the saliency-based cell image outlines; they now use square boxes instead of coloring the entire cell. I use either V0 (L1 norm of the first activation layer) or V1 (L1 norm of the gradient backpropagated from the SupConLoss) saliency for the image boxes. I calculated the Pearson correlation between the various saliencies and the CellProfiler features of the input cells. The main idea is to figure out what the saliencies indicate. From visual inspection of the full FOVs with V1 saliency overlay, we can see that higher saliency cells tend to be isolated while lower saliency cells tend to lie on top of each other or are in a more crowded space. If this is what the model is generally doing, the features corresponding to isolation should be highly correlated with the V1 saliency.
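The correlation analysis can be sketched like this. It is an illustrative version only: the feature matrix and names are made up, and `pearson_per_feature` and `top_k` are hypothetical helpers, not functions from the repository.

```python
import numpy as np

def pearson_per_feature(features, saliency):
    """Pearson r between `saliency` and every column of `features`."""
    x = features - features.mean(axis=0)
    y = saliency - saliency.mean()
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    # Constant columns have zero variance; report r = 0 for them instead of NaN.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, (x * y[:, None]).sum(axis=0) / safe, 0.0)

def top_k(corr, names, k=20):
    """Return the k most positive and k most negative (name, r) pairs."""
    order = np.argsort(corr)
    positive = [(names[i], float(corr[i])) for i in order[::-1][:k]]
    negative = [(names[i], float(corr[i])) for i in order[:k]]
    return positive, negative

# Toy data: 4 cells, 3 made-up features.
saliency = np.array([1.0, 2.0, 3.0, 4.0])
features = np.column_stack([
    2 * saliency + 1,  # perfectly correlated with saliency
    -saliency,         # perfectly anti-correlated
    np.ones(4),        # constant feature
])
names = ["area_like", "dna_intensity_like", "constant"]
corr = pearson_per_feature(features, saliency)
positive, negative = top_k(corr, names, k=1)
```

Calling `top_k(corr, names, k=20)` on the real feature matrix would give lists in the same shape as the Top20 tables in this comment.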

Main takeaways

Conclusion

The model likely gives higher weight to cells which are more isolated, as indicated by AreaShape, IntegratedIntensity (sum over intensity pixels), and nearest neighbor distances. It also gives more weight to cells with low DNA, RNA, and Mito intensities. In general, these correlations indicate a quality control filter. Isolated cells can be resolved more clearly, while high DNA, RNA, and Mito intensities indicate cells that are in the process of cell division.

Full fov's with Saliency V0 overlay ![BR00112197binned_M08_f1c30sV0](https://user-images.githubusercontent.com/62173977/172668013-fea4d178-8f20-47fd-9b9f-716a84f84089.png) ![BR00112197binned_M08_f2c30sV0](https://user-images.githubusercontent.com/62173977/172668015-e3082a09-e324-4e2d-a40c-61a3902dfcde.png) ![BR00112197binned_M08_f3c30sV0](https://user-images.githubusercontent.com/62173977/172668017-a9df9e5e-8756-4ac4-87f1-4e8a5c3218d2.png) ![BR00112197binned_M08_f4c30sV0](https://user-images.githubusercontent.com/62173977/172668018-cf062ba2-36c1-4b0e-a99c-fa834ee31d60.png)
Full fov's with Saliency V1 overlay ![BR00112197binned_M08_f1c30sV1](https://user-images.githubusercontent.com/62173977/172668121-b0b1bc9a-0e65-4fc1-925c-a77bd19dd6e3.png) ![BR00112197binned_M08_f2c30sV1](https://user-images.githubusercontent.com/62173977/172668123-1156e48b-3978-47bf-8e7f-91e620e8ef69.png) ![BR00112197binned_M08_f3c30sV1](https://user-images.githubusercontent.com/62173977/172668124-31318e4a-5b30-49e8-8c9c-7be4c1e83cdc.png) ![BR00112197binned_M08_f4c30sV1](https://user-images.githubusercontent.com/62173977/172668126-55bac078-398d-42f3-9817-5df4f0f627ef.png)
Top20 positive Pearson correlations

_Saliency V0_

| | Features | Correlation |
|---|---|---|
| 535 | Cytoplasm.Cytoplasm_AreaShape_Area | 0.646 |
| 874 | Cells.Cells_AreaShape_Area | 0.640 |
| 515 | Cells.Cells_AreaShape_MeanRadius | 0.634 |
| 484 | Cells.Cells_AreaShape_MedianRadius | 0.632 |
| 651 | Cytoplasm.Cytoplasm_Intensity_IntegratedIntens... | 0.628 |
| 295 | Cells.Cells_Intensity_IntegratedIntensity_Brig... | 0.621 |
| 187 | Cells.Cells_AreaShape_MaximumRadius | 0.619 |
| 140 | Cytoplasm.Cytoplasm_Intensity_IntegratedIntens... | 0.603 |
| 898 | Cytoplasm.Cytoplasm_Intensity_IntegratedIntens... | 0.592 |
| 459 | Cells.Cells_AreaShape_MinorAxisLength | 0.591 |
| 311 | Cytoplasm.Cytoplasm_AreaShape_MedianRadius | 0.589 |
| 1061 | Cells.Cells_Intensity_IntegratedIntensity_Mito | 0.588 |
| 130 | Cytoplasm.Cytoplasm_AreaShape_MinFeretDiameter | 0.587 |
| 1048 | Cells.Cells_AreaShape_MinFeretDiameter | 0.587 |
| 1094 | Cytoplasm.Cytoplasm_AreaShape_MinorAxisLength | 0.579 |
| 1306 | Cells.Cells_AreaShape_Perimeter | 0.571 |
| 633 | Cells.Cells_Intensity_IntegratedIntensity_AGP | 0.566 |
| 565 | Cytoplasm.Cytoplasm_AreaShape_Perimeter | 0.562 |
| 41 | Cytoplasm.Cytoplasm_AreaShape_MeanRadius | 0.561 |
| 752 | Cytoplasm.Cytoplasm_Intensity_IntegratedIntens... | 0.558 |

_Saliency V1_

| | Features | Correlation |
|---|---|---|
| 338 | Cytoplasm.Cytoplasm_Correlation_K_DNA_Brightfield | 0.572 |
| 770 | Nuclei.Nuclei_AreaShape_MeanRadius | 0.556 |
| 515 | Cells.Cells_AreaShape_MeanRadius | 0.555 |
| 187 | Cells.Cells_AreaShape_MaximumRadius | 0.553 |
| 851 | Nuclei.Nuclei_AreaShape_MedianRadius | 0.542 |
| 484 | Cells.Cells_AreaShape_MedianRadius | 0.541 |
| 1227 | Nuclei.Nuclei_Correlation_Overlap_DNA_RNA | 0.532 |
| 849 | Nuclei.Nuclei_AreaShape_MaximumRadius | 0.528 |
| 459 | Cells.Cells_AreaShape_MinorAxisLength | 0.525 |
| 1094 | Cytoplasm.Cytoplasm_AreaShape_MinorAxisLength | 0.521 |
| 130 | Cytoplasm.Cytoplasm_AreaShape_MinFeretDiameter | 0.510 |
| 1048 | Cells.Cells_AreaShape_MinFeretDiameter | 0.510 |
| 208 | Cells.Cells_Neighbors_FirstClosestDistance_Adj... | 0.498 |
| 874 | Cells.Cells_AreaShape_Area | 0.490 |
| 985 | Cells.Cells_Neighbors_SecondClosestDistance_Ad... | 0.488 |
| 1134 | Cells.Cells_Correlation_RWC_Brightfield_RNA | 0.488 |
| 999 | Cells.Cells_Correlation_RWC_RNA_Brightfield | 0.477 |
| 1117 | Nuclei.Nuclei_Correlation_K_ER_Brightfield | 0.470 |
| 535 | Cytoplasm.Cytoplasm_AreaShape_Area | 0.468 |
| 145 | Nuclei.Nuclei_AreaShape_MinorAxisLength | 0.463 |

_Saliency V2_

| | Features | Correlation |
|---|---|---|
| 1117 | Nuclei.Nuclei_Correlation_K_ER_Brightfield | 0.458 |
| 652 | Nuclei.Nuclei_Correlation_K_RNA_Brightfield | 0.433 |
| 1227 | Nuclei.Nuclei_Correlation_Overlap_DNA_RNA | 0.411 |
| 432 | Cells.Cells_Correlation_K_ER_Brightfield | 0.406 |
| 485 | Cells.Cells_Correlation_K_RNA_Brightfield | 0.396 |
| 720 | Nuclei.Nuclei_Correlation_Overlap_DNA_ER | 0.380 |
| 1206 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | 0.375 |
| 1198 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | 0.374 |
| 76 | Nuclei.Nuclei_Correlation_K_AGP_Brightfield | 0.371 |
| 1233 | Nuclei.Nuclei_Correlation_K_ER_AGP | 0.369 |
| 1270 | Cytoplasm.Cytoplasm_Correlation_K_ER_Brightfield | 0.366 |
| 408 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | 0.366 |
| 913 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | 0.365 |
| 949 | Cytoplasm.Cytoplasm_Correlation_K_RNA_Brightfield | 0.364 |
| 726 | Nuclei.Nuclei_Correlation_K_Mito_Brightfield | 0.361 |
| 384 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | 0.341 |
| 255 | Nuclei.Nuclei_Correlation_K_RNA_DNA | 0.339 |
| 964 | Nuclei.Nuclei_Granularity_1_Mito | 0.338 |
| 1141 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | 0.335 |
| 134 | Cells.Cells_Granularity_1_ER | 0.325 |

_Saliency V3_

| | Features | Correlation |
|---|---|---|
| 356 | Cytoplasm.Cytoplasm_Intensity_StdIntensity_Bri... | 0.552 |
| 704 | Cells.Cells_Intensity_StdIntensity_Brightfield | 0.485 |
| 662 | Cells.Cells_Intensity_StdIntensityEdge_Brightf... | 0.448 |
| 30 | Cytoplasm.Cytoplasm_Intensity_StdIntensityEdge... | 0.447 |
| 1004 | Cytoplasm.Cytoplasm_RadialDistribution_RadialC... | 0.325 |
| 1079 | Cytoplasm.Cytoplasm_Intensity_MaxIntensity_Bri... | 0.304 |
| 68 | Cytoplasm.Cytoplasm_RadialDistribution_RadialC... | 0.299 |
| 74 | Cytoplasm.Cytoplasm_Intensity_MADIntensity_Bri... | 0.279 |
| 1169 | Cells.Cells_Intensity_MaxIntensity_Brightfield | 0.277 |
| 982 | Cells.Cells_Correlation_Correlation_AGP_Bright... | 0.272 |
| 617 | Nuclei.Nuclei_Intensity_StdIntensityEdge_Brigh... | 0.266 |
| 111 | Cytoplasm.Cytoplasm_Correlation_Correlation_AG... | 0.263 |
| 294 | Cells.Cells_Granularity_14_Brightfield | 0.263 |
| 817 | Cytoplasm.Cytoplasm_Granularity_14_Brightfield | 0.262 |
| 1093 | Nuclei.Nuclei_Granularity_14_Brightfield | 0.258 |
| 842 | Cytoplasm.Cytoplasm_Granularity_15_Brightfield | 0.252 |
| 540 | Cells.Cells_Granularity_15_Brightfield | 0.252 |
| 639 | Nuclei.Nuclei_Granularity_15_Brightfield | 0.246 |
| 415 | Nuclei.Nuclei_Correlation_K_Mito_ER | 0.241 |
| 892 | Cytoplasm.Cytoplasm_Intensity_MaxIntensityEdge... | 0.233 |

_Saliency V4_

| | Features | Correlation |
|---|---|---|
| 1227 | Nuclei.Nuclei_Correlation_Overlap_DNA_RNA | 0.545 |
| 1117 | Nuclei.Nuclei_Correlation_K_ER_Brightfield | 0.525 |
| 726 | Nuclei.Nuclei_Correlation_K_Mito_Brightfield | 0.511 |
| 652 | Nuclei.Nuclei_Correlation_K_RNA_Brightfield | 0.498 |
| 338 | Cytoplasm.Cytoplasm_Correlation_K_DNA_Brightfield | 0.489 |
| 720 | Nuclei.Nuclei_Correlation_Overlap_DNA_ER | 0.483 |
| 485 | Cells.Cells_Correlation_K_RNA_Brightfield | 0.471 |
| 1149 | Nuclei.Nuclei_Correlation_Overlap_DNA_Mito | 0.471 |
| 964 | Nuclei.Nuclei_Granularity_1_Mito | 0.442 |
| 1233 | Nuclei.Nuclei_Correlation_K_ER_AGP | 0.440 |
| 999 | Cells.Cells_Correlation_RWC_RNA_Brightfield | 0.437 |
| 80 | Cytoplasm.Cytoplasm_Correlation_RWC_Brightfiel... | 0.437 |
| 432 | Cells.Cells_Correlation_K_ER_Brightfield | 0.435 |
| 949 | Cytoplasm.Cytoplasm_Correlation_K_RNA_Brightfield | 0.433 |
| 1134 | Cells.Cells_Correlation_RWC_Brightfield_RNA | 0.432 |
| 76 | Nuclei.Nuclei_Correlation_K_AGP_Brightfield | 0.422 |
| 616 | Nuclei.Nuclei_AreaShape_Solidity | 0.415 |
| 770 | Nuclei.Nuclei_AreaShape_MeanRadius | 0.411 |
| 515 | Cells.Cells_AreaShape_MeanRadius | 0.406 |
| 991 | Cells.Cells_Correlation_K_AGP_Brightfield | 0.405 |
Top20 negative Pearson correlations

_Saliency V0_

| | Features | Correlation |
|---|---|---|
| 587 | Cytoplasm.Cytoplasm_Correlation_K_Mito_RNA | -0.620 |
| 1040 | Cells.Cells_Correlation_K_Mito_RNA | -0.606 |
| 930 | Cytoplasm.Cytoplasm_Correlation_K_Mito_DNA | -0.593 |
| 828 | Cytoplasm.Cytoplasm_Intensity_MeanIntensity_DNA | -0.586 |
| 72 | Cytoplasm.Cytoplasm_Intensity_UpperQuartileInt... | -0.562 |
| 1309 | Cytoplasm.Cytoplasm_Intensity_MeanIntensityEdg... | -0.552 |
| 439 | Cells.Cells_Intensity_MeanIntensityEdge_DNA | -0.539 |
| 231 | Cells.Cells_Intensity_MinIntensity_DNA | -0.539 |
| 918 | Cytoplasm.Cytoplasm_Intensity_MinIntensity_DNA | -0.539 |
| 451 | Cytoplasm.Cytoplasm_Intensity_MinIntensityEdge... | -0.534 |
| 499 | Cells.Cells_Intensity_MinIntensityEdge_DNA | -0.534 |
| 186 | Cells.Cells_Correlation_K_Mito_AGP | -0.531 |
| 813 | Cytoplasm.Cytoplasm_Correlation_K_Mito_AGP | -0.524 |
| 1243 | Cells.Cells_Intensity_MeanIntensityEdge_RNA | -0.506 |
| 438 | Cells.Cells_Intensity_MedianIntensity_DNA | -0.499 |
| 52 | Cytoplasm.Cytoplasm_Intensity_LowerQuartileInt... | -0.495 |
| 733 | Cytoplasm.Cytoplasm_Intensity_StdIntensity_DNA | -0.493 |
| 725 | Cytoplasm.Cytoplasm_Intensity_MedianIntensity_DNA | -0.489 |
| 829 | Cells.Cells_Correlation_Overlap_Mito_RNA | -0.488 |
| 716 | Cells.Cells_Intensity_MinIntensityEdge_RNA | -0.486 |

_Saliency V1_

| | Features | Correlation |
|---|---|---|
| 439 | Cells.Cells_Intensity_MeanIntensityEdge_DNA | -0.671 |
| 389 | Cells.Cells_Intensity_StdIntensityEdge_DNA | -0.653 |
| 1165 | Nuclei.Nuclei_RadialDistribution_RadialCV_DNA_... | -0.648 |
| 1309 | Cytoplasm.Cytoplasm_Intensity_MeanIntensityEdg... | -0.636 |
| 733 | Cytoplasm.Cytoplasm_Intensity_StdIntensity_DNA | -0.635 |
| 81 | Cells.Cells_Intensity_MaxIntensityEdge_RNA | -0.635 |
| 289 | Cytoplasm.Cytoplasm_Correlation_K_Brightfield_DNA | -0.629 |
| 986 | Cytoplasm.Cytoplasm_Intensity_MeanIntensityEdg... | -0.619 |
| 663 | Cells.Cells_Intensity_MaxIntensityEdge_DNA | -0.617 |
| 0 | Cells.Cells_Intensity_StdIntensityEdge_RNA | -0.610 |
| 87 | Cytoplasm.Cytoplasm_Intensity_MaxIntensityEdge... | -0.608 |
| 1070 | Nuclei.Nuclei_Intensity_MaxIntensityEdge_RNA | -0.608 |
| 343 | Cytoplasm.Cytoplasm_Intensity_MaxIntensity_RNA | -0.606 |
| 449 | Nuclei.Nuclei_Intensity_StdIntensityEdge_DNA | -0.606 |
| 1243 | Cells.Cells_Intensity_MeanIntensityEdge_RNA | -0.603 |
| 631 | Nuclei.Nuclei_Intensity_MaxIntensityEdge_DNA | -0.596 |
| 72 | Cytoplasm.Cytoplasm_Intensity_UpperQuartileInt... | -0.594 |
| 828 | Cytoplasm.Cytoplasm_Intensity_MeanIntensity_DNA | -0.588 |
| 86 | Nuclei.Nuclei_Intensity_MeanIntensityEdge_RNA | -0.587 |
| 103 | Cytoplasm.Cytoplasm_Intensity_MaxIntensityEdge... | -0.583 |

_Saliency V2_

| | Features | Correlation |
|---|---|---|
| 343 | Cytoplasm.Cytoplasm_Intensity_MaxIntensity_RNA | -0.460 |
| 1070 | Nuclei.Nuclei_Intensity_MaxIntensityEdge_RNA | -0.458 |
| 87 | Cytoplasm.Cytoplasm_Intensity_MaxIntensityEdge... | -0.458 |
| 809 | Cytoplasm.Cytoplasm_Intensity_StdIntensity_RNA | -0.455 |
| 463 | Nuclei.Nuclei_Intensity_StdIntensityEdge_RNA | -0.453 |
| 1009 | Nuclei.Nuclei_RadialDistribution_RadialCV_ER_4of4 | -0.450 |
| 811 | Nuclei.Nuclei_RadialDistribution_RadialCV_RNA_... | -0.446 |
| 402 | Nuclei.Nuclei_Intensity_MaxIntensityEdge_ER | -0.442 |
| 1135 | Cytoplasm.Cytoplasm_Intensity_MaxIntensityEdge_ER | -0.441 |
| 625 | Nuclei.Nuclei_Intensity_StdIntensityEdge_ER | -0.438 |
| 151 | Nuclei.Nuclei_RadialDistribution_RadialCV_ER_3of4 | -0.435 |
| 257 | Nuclei.Nuclei_Intensity_MaxIntensity_ER | -0.435 |
| 86 | Nuclei.Nuclei_Intensity_MeanIntensityEdge_RNA | -0.430 |
| 172 | Cytoplasm.Cytoplasm_Intensity_StdIntensityEdge... | -0.429 |
| 1283 | Nuclei.Nuclei_Intensity_StdIntensity_ER | -0.424 |
| 17 | Cytoplasm.Cytoplasm_Intensity_MADIntensity_RNA | -0.421 |
| 191 | Nuclei.Nuclei_Intensity_UpperQuartileIntensity... | -0.420 |
| 567 | Nuclei.Nuclei_Intensity_MeanIntensity_RNA | -0.418 |
| 1121 | Cells.Cells_Intensity_StdIntensity_ER | -0.417 |
| 803 | Cytoplasm.Cytoplasm_Intensity_StdIntensity_ER | -0.417 |

_Saliency V3_

| | Features | Correlation |
|---|---|---|
| 526 | Cytoplasm.Cytoplasm_Intensity_MinIntensity_Bri... | -0.493 |
| 1126 | Cells.Cells_Intensity_MinIntensity_Brightfield | -0.464 |
| 1296 | Cytoplasm.Cytoplasm_Granularity_1_Brightfield | -0.459 |
| 1140 | Cells.Cells_Granularity_1_Brightfield | -0.458 |
| 1002 | Nuclei.Nuclei_Granularity_1_Brightfield | -0.434 |
| 148 | Cytoplasm.Cytoplasm_Intensity_MinIntensityEdge... | -0.387 |
| 412 | Cells.Cells_Intensity_MinIntensityEdge_Brightf... | -0.377 |
| 913 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | -0.194 |
| 1198 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | -0.192 |
| 1219 | Cytoplasm.Cytoplasm_RadialDistribution_RadialC... | -0.181 |
| 1226 | Cytoplasm.Cytoplasm_RadialDistribution_MeanFra... | -0.176 |
| 1206 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | -0.175 |
| 950 | Cytoplasm.Cytoplasm_RadialDistribution_MeanFra... | -0.169 |
| 1071 | Nuclei.Nuclei_Correlation_K_ER_Mito | -0.168 |
| 348 | Nuclei.Nuclei_Intensity_MassDisplacement_Mito | -0.163 |
| 277 | Nuclei.Nuclei_RadialDistribution_RadialCV_Mito... | -0.162 |
| 384 | Cytoplasm.Cytoplasm_RadialDistribution_FracAtD... | -0.160 |
| 545 | Cytoplasm.Cytoplasm_Intensity_MassDisplacement... | -0.158 |
| 44 | Cytoplasm.Cytoplasm_Correlation_K_RNA_AGP | -0.154 |
| 1175 | Nuclei.Nuclei_Intensity_MinIntensityEdge_Brigh... | -0.152 |

_Saliency V4_

| | Features | Correlation |
|---|---|---|
| 343 | Cytoplasm.Cytoplasm_Intensity_MaxIntensity_RNA | -0.614 |
| 1070 | Nuclei.Nuclei_Intensity_MaxIntensityEdge_RNA | -0.612 |
| 87 | Cytoplasm.Cytoplasm_Intensity_MaxIntensityEdge... | -0.611 |
| 463 | Nuclei.Nuclei_Intensity_StdIntensityEdge_RNA | -0.607 |
| 811 | Nuclei.Nuclei_RadialDistribution_RadialCV_RNA_... | -0.583 |
| 81 | Cells.Cells_Intensity_MaxIntensityEdge_RNA | -0.579 |
| 809 | Cytoplasm.Cytoplasm_Intensity_StdIntensity_RNA | -0.575 |
| 1165 | Nuclei.Nuclei_RadialDistribution_RadialCV_DNA_... | -0.572 |
| 0 | Cells.Cells_Intensity_StdIntensityEdge_RNA | -0.571 |
| 389 | Cells.Cells_Intensity_StdIntensityEdge_DNA | -0.560 |
| 172 | Cytoplasm.Cytoplasm_Intensity_StdIntensityEdge... | -0.557 |
| 692 | Cells.Cells_Intensity_MaxIntensity_RNA | -0.552 |
| 663 | Cells.Cells_Intensity_MaxIntensityEdge_DNA | -0.552 |
| 793 | Nuclei.Nuclei_Intensity_MaxIntensity_RNA | -0.549 |
| 86 | Nuclei.Nuclei_Intensity_MeanIntensityEdge_RNA | -0.549 |
| 1135 | Cytoplasm.Cytoplasm_Intensity_MaxIntensityEdge_ER | -0.547 |
| 191 | Nuclei.Nuclei_Intensity_UpperQuartileIntensity... | -0.547 |
| 1283 | Nuclei.Nuclei_Intensity_StdIntensity_ER | -0.546 |
| 469 | Nuclei.Nuclei_Intensity_StdIntensity_RNA | -0.537 |
| 151 | Nuclei.Nuclei_RadialDistribution_RadialCV_ER_3of4 | -0.536 |
Summing the correlations of V0 and V1

_Saliency V0 + Saliency V1 (most positive)_

| | Features | Corr. sum |
|---|---|---|
| 515 | Cells.Cells_AreaShape_MeanRadius | 1.189 |
| 484 | Cells.Cells_AreaShape_MedianRadius | 1.173 |
| 187 | Cells.Cells_AreaShape_MaximumRadius | 1.172 |
| 874 | Cells.Cells_AreaShape_Area | 1.130 |
| 459 | Cells.Cells_AreaShape_MinorAxisLength | 1.116 |
| 535 | Cytoplasm.Cytoplasm_AreaShape_Area | 1.114 |
| 1094 | Cytoplasm.Cytoplasm_AreaShape_MinorAxisLength | 1.100 |
| 130 | Cytoplasm.Cytoplasm_AreaShape_MinFeretDiameter | 1.097 |
| 1048 | Cells.Cells_AreaShape_MinFeretDiameter | 1.097 |
| 295 | Cells.Cells_Intensity_IntegratedIntensity_Brightfield | 1.082 |
| 338 | Cytoplasm.Cytoplasm_Correlation_K_DNA_Brightfield | 1.072 |
| 651 | Cytoplasm.Cytoplasm_Intensity_IntegratedIntensity_Brightfield | 1.070 |
| 770 | Nuclei.Nuclei_AreaShape_MeanRadius | 1.047 |
| 985 | Cells.Cells_Neighbors_SecondClosestDistance_Adjacent | 1.042 |
| 208 | Cells.Cells_Neighbors_FirstClosestDistance_Adjacent | 1.038 |
| 851 | Nuclei.Nuclei_AreaShape_MedianRadius | 1.028 |
| 565 | Cytoplasm.Cytoplasm_AreaShape_Perimeter | 1.018 |
| 1306 | Cells.Cells_AreaShape_Perimeter | 1.017 |
| 311 | Cytoplasm.Cytoplasm_AreaShape_MedianRadius | 0.994 |
| 825 | Cytoplasm.Cytoplasm_Intensity_IntegratedIntensityEdge_Brightfield | 0.984 |

_Saliency V0 + Saliency V1 (most negative)_

| | Features | Corr. sum |
|---|---|---|
| 439 | Cells.Cells_Intensity_MeanIntensityEdge_DNA | -1.210 |
| 1309 | Cytoplasm.Cytoplasm_Intensity_MeanIntensityEdge_DNA | -1.188 |
| 828 | Cytoplasm.Cytoplasm_Intensity_MeanIntensity_DNA | -1.174 |
| 72 | Cytoplasm.Cytoplasm_Intensity_UpperQuartileIntensity_DNA | -1.156 |
| 733 | Cytoplasm.Cytoplasm_Intensity_StdIntensity_DNA | -1.128 |
| 1243 | Cells.Cells_Intensity_MeanIntensityEdge_RNA | -1.109 |
| 986 | Cytoplasm.Cytoplasm_Intensity_MeanIntensityEdge_RNA | -1.076 |
| 389 | Cells.Cells_Intensity_StdIntensityEdge_DNA | -1.064 |
| 930 | Cytoplasm.Cytoplasm_Correlation_K_Mito_DNA | -1.060 |
| 303 | Nuclei.Nuclei_Intensity_MeanIntensityEdge_DNA | -1.015 |
| 438 | Cells.Cells_Intensity_MedianIntensity_DNA | -1.006 |
| 289 | Cytoplasm.Cytoplasm_Correlation_K_Brightfield_DNA | -0.999 |
| 725 | Cytoplasm.Cytoplasm_Intensity_MedianIntensity_DNA | -0.998 |
| 663 | Cells.Cells_Intensity_MaxIntensityEdge_DNA | -0.993 |
| 81 | Cells.Cells_Intensity_MaxIntensityEdge_RNA | -0.992 |
| 327 | Cytoplasm.Cytoplasm_Intensity_MADIntensity_DNA | -0.970 |
| 158 | Cytoplasm.Cytoplasm_Intensity_MeanIntensity_RNA | -0.965 |
| 84 | Cells.Cells_Intensity_MeanIntensityEdge_AGP | -0.965 |
| 1298 | Cells.Cells_Intensity_MeanIntensity_DNA | -0.963 |
| 231 | Cells.Cells_Intensity_MinIntensity_DNA | -0.958 |
EchteRobert commented 2 years ago

MOA matching results (preliminary)

Below are the mean average precision values for matching sister compounds using the model, baseline, or random shuffling.
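For readers unfamiliar with the metric, here is a minimal sketch of a mean average precision computation for matching profiles that share a label. It assumes cosine similarity for ranking and uses made-up profiles; the function name and the exact ranking and averaging choices are illustrative and may differ from the evaluation actually used for these figures.

```python
import numpy as np

def mean_average_precision(profiles, labels):
    """Mean of per-query average precision using cosine-similarity ranking."""
    profiles = np.asarray(profiles, dtype=float)
    labels = np.asarray(labels)
    unit = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
    similarity = unit @ unit.T
    average_precisions = []
    for query in range(len(labels)):
        # Rank every other profile by decreasing similarity to the query.
        others = np.delete(np.arange(len(labels)), query)
        ranked = others[np.argsort(-similarity[query, others], kind="stable")]
        hits = labels[ranked] == labels[query]
        if not hits.any():
            continue  # no positive to retrieve for this query
        # Precision evaluated at the rank of each positive match.
        precision_at_hit = np.cumsum(hits)[hits] / (np.flatnonzero(hits) + 1)
        average_precisions.append(precision_at_hit.mean())
    return float(np.mean(average_precisions))

# Toy profiles: two pairs that point in similar directions.
profiles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matched = mean_average_precision(profiles, ["a", "a", "b", "b"])
shuffled = mean_average_precision(profiles, ["a", "b", "a", "b"])
```

Reassigning the labels, as in `shuffled`, plays the role of the random shuffling comparison: the score drops once labels no longer line up with profile similarity.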

Stain2 ![Screen Shot 2022-06-09 at 2 59 30 PM](https://user-images.githubusercontent.com/62173977/172923788-5e80fb99-bdfd-49ef-adfc-5eca3b8df76e.png)
Stain3 ![Screen Shot 2022-06-09 at 3 00 03 PM](https://user-images.githubusercontent.com/62173977/172923880-a99d42e5-7974-49f7-8256-a12eaae4a465.png)
Stain4 ![Screen Shot 2022-06-09 at 3 00 24 PM](https://user-images.githubusercontent.com/62173977/172923937-66e764e4-1c44-4522-9a74-dd18165f2fc4.png)
EchteRobert commented 2 years ago

One more experiment... (ellipsoid prediction)

Just as a last test, I evaluated the trained model (on 15 plates) on the generated ellipsoid data. If you need a refresher on the experimental setup, see https://github.com/broadinstitute/FeatureAggregation_single_cell/issues/3#issuecomment-1098357632. I am still using 2 dimensions to describe the ellipsoids, but I added 1322 empty dimensions to make the input fit into the model. This should be a trivial experiment as the model has already shown that it is able to beat the baseline, and thus is able to learn more than the mean. However, the theory is now that it is applying some form of quality control. If that means it is selecting cells which accurately describe the second moments of the cell set distribution, then this task should always be completed perfectly. However, if it is also selecting cells which have a profile close to the mean, it will not. It's also possible that the model is actually generating higher order moments from the input data and creating a profile based on that information.

Because I am using only 2 dimensions, I will roll the 2 dimensions over the 1324 available positions to see if this influences the model's output. I plot the mAP as a function of the rolled dimensions. Although not exact, this is an indicator of which features (according to their position) the model uses more than others. Low scores correspond to feature positions that receive little attention, while high scores correspond to positions that receive more. Moreover, this means that the AreaShape, IntegratedIntensity and Neighbors features are unavailable in some cases.
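The padding and rolling step can be sketched as below. Two assumptions are made here: the "empty" dimensions are filled with zeros, and the roll wraps around at the end of the feature vector. `embed_and_roll` is a hypothetical helper name.

```python
import numpy as np

N_FEATURES = 1324  # input width expected by the model, per the text

def embed_and_roll(points_2d, shift, n_features=N_FEATURES):
    """Zero-pad 2-D points to `n_features` columns, then roll them by `shift`."""
    padded = np.zeros((len(points_2d), n_features))
    padded[:, :2] = points_2d
    # np.roll wraps around, so the last shift puts the pair at columns 1323 and 0.
    return np.roll(padded, shift, axis=1)

points = np.array([[1.0, 2.0], [3.0, 4.0]])  # two toy ellipsoid samples
rolled = embed_and_roll(points, shift=10)
wrapped = embed_and_roll(points, shift=N_FEATURES - 1)
```

Evaluating the model once per `shift` value in `range(N_FEATURES)` and recording the mAP each time would produce a curve of mAP against feature position.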

Main takeaways

@shntnu @johnarevalo I wonder what your thoughts are on this. Does this make sense or did I miss something?

mAP versus feature dimension position _On the x-axis: feature dimension position (where all the way to the left is 0 and all the way to the right is the last, i.e. 1323rd, position). On the y-axis: the mAP for all classes (10) averaged over 4 samples._ ![Figure_3](https://user-images.githubusercontent.com/62173977/173402083-68b7c97c-3523-4062-8299-9b4c4e302839.png)
mAP versus feature dimensions position with feature overlay _All AreaShape, IntegratedIntensity, and Neighbors features highlighted in yellow_ ![AllfeatsTrue](https://user-images.githubusercontent.com/62173977/173416034-9cf55789-d3de-4aaa-85e9-827ad336f33f.png) _All AreaShape and Neighbors features highlighted in yellow_ ![AreaShapeandNeighbors](https://user-images.githubusercontent.com/62173977/173416041-782c2585-fa07-4577-90f0-a7a71027d007.png) (I have also analyzed all the other features independently, but since they revealed no structure I am leaving them out of the analysis here.)