Open NasimJ opened 2 years ago
Notebook for creating the experiment metadata. Additional columns in the metadata were added manually later.
Scope_vendor | Batch | Plate_Name | Percent_Replicating |
---|---|---|---|
MolDev | Scope1_MolDev_10X | Plate2_PCO_6ch_4site_10XPA | 33.3 |
MolDev | Scope1_MolDev_10X | Plate3_PCO_6ch_4site_10XPA_Crest | 50 |
MolDev | Scope1_MolDev_10X_4siteZ | Plate3_PCO_6ch_4site_10XPA_Crestz | 52.2 |
MolDev | Scope1_MolDev_20X_4site | Plate3_PCO_6ch_4site_20XPA_Crestz | 43.3 |
MolDev | Scope1_MolDev_20X_9site | Plate2_PCO_6ch_9site_20XPA | 56.7 |
MolDev | Scope1_MolDev_20X_9site | Plate3_PCO_6ch_9site_20XPA_Crest | 50 |
MolDev | Scope1_MolDev_20X_Adaptive | Plate3_PCO_6ch_Adaptive_20XPA | 17.8 |
Nikon | Scope1_Nikon_10X | BR00117060a10x | 26.7 |
Nikon | Scope1_Nikon_10X | BR00117061a10x | 39.8 |
Nikon | Scope1_Nikon_10X | BR00117062a10x | 27.8 |
Nikon | Scope1_Nikon_10X | BR00117063b10x | 33.7 |
Nikon | Scope1_Nikon_20X | BR00117061a | 58.9 |
Nikon | Scope1_Nikon_20X | BR00117062a | 43.3 |
Nikon | Scope1_Nikon_20X | BR00117063b | 46.7 |
PE | Scope1_PE_Bin1_Confocal_1Plane | CP_Broad_Phenix_C_BIN1_1Plane_P1 | 50 |
PE | Scope1_PE_Bin1_Confocal_1Plane | CP_Broad_Phenix_C_BIN1_1Plane_P2 | 53.3 |
PE | Scope1_PE_Bin1_Confocal_1Plane | CP_Broad_Phenix_C_BIN1_1Plane_P3 | 51.1 |
PE | Scope1_PE_Bin1_Confocal_1Plane | CP_Broad_Phenix_C_BIN1_1Plane_P4 | 44.4 |
PE | Scope1_PE_Bin1_Confocal_3Plane | CP_Broad_Phenix_C_BIN1_P1 | 53.3 |
PE | Scope1_PE_Bin1_Confocal_3Plane | CP_Broad_Phenix_C_BIN1_P2 | 56.7 |
PE | Scope1_PE_Bin1_Confocal_3Plane | CP_Broad_Phenix_C_BIN1_P3 | 53.3 |
PE | Scope1_PE_Bin1_Confocal_3Plane | CP_Broad_Phenix_C_BIN1_P4 | 44.4 |
PE | Scope1_PE_Bin1_Widefield_1Plane | CP_Broad_Phenix_NC_BIN1_1Plane_P1 | 52.2 |
PE | Scope1_PE_Bin1_Widefield_1Plane | CP_Broad_Phenix_NC_BIN1_1Plane_P2 | 54.4 |
PE | Scope1_PE_Bin1_Widefield_1Plane | CP_Broad_Phenix_NC_BIN1_1Plane_P3 | 51.1 |
PE | Scope1_PE_Bin1_Widefield_1Plane | CP_Broad_Phenix_NC_BIN1_1Plane_P4 | 46.7 |
PE | Scope1_PE_Bin1_Widefield_3Plane | CP_Broad_Phenix_NC_BIN1_P1 | 53.3 |
PE | Scope1_PE_Bin1_Widefield_3Plane | CP_Broad_Phenix_NC_BIN1_P2 | 54.4 |
PE | Scope1_PE_Bin1_Widefield_3Plane | CP_Broad_Phenix_NC_BIN1_P3 | 47.8 |
PE | Scope1_PE_Bin1_Widefield_3Plane | CP_Broad_Phenix_NC_BIN1_P4 | 46.7 |
PE | Scope1_PE_Bin2_Confocal_1Plane | CPBroadPhenixC1PlaneP1 | 50 |
PE | Scope1_PE_Bin2_Confocal_1Plane | CPBroadPhenixC1PlaneP2 | 51.1 |
PE | Scope1_PE_Bin2_Confocal_1Plane | CPBroadPhenixC1PlaneP3 | 53.3 |
PE | Scope1_PE_Bin2_Confocal_1Plane | CPBroadPhenixC1PlaneP4 | 45.6 |
PE | Scope1_PE_Bin2_Confocal_3Plane | CPBroadPhenixCP1 | 53.3 |
PE | Scope1_PE_Bin2_Confocal_3Plane | CPBroadPhenixCP2 | 55.6 |
PE | Scope1_PE_Bin2_Confocal_3Plane | CPBroadPhenixCP3 | 56.7 |
PE | Scope1_PE_Bin2_Confocal_3Plane | CPBroadPhenixCP4 | 56.7 |
PE | Scope1_PE_Bin2_Widefield_1Plane | CPBroadPhenixNC1PlaneP1 | 53.3 |
PE | Scope1_PE_Bin2_Widefield_1Plane | CPBroadPhenixNC1PlaneP2 | 53.3 |
PE | Scope1_PE_Bin2_Widefield_1Plane | CPBroadPhenixNC1PlaneP3 | 47.8 |
PE | Scope1_PE_Bin2_Widefield_1Plane | CPBroadPhenixNC1PlaneP4 | 47.8 |
PE | Scope1_PE_Bin2_Widefield_3Plane | CPBroadPhenixNCP1 | 52.2 |
PE | Scope1_PE_Bin2_Widefield_3Plane | CPBroadPhenixNCP2 | 51.1 |
PE | Scope1_PE_Bin2_Widefield_3Plane | CPBroadPhenixNCP3 | 50 |
PE | Scope1_PE_Bin2_Widefield_3Plane | CPBroadPhenixNCP4 | 46.7 |
Yokogawa_Japan | Scope1_Yokogawa_Japan_20X | 20201021T092317 | 53.3 |
Yokogawa_Japan | Scope1_Yokogawa_Japan_40X | 20201020T134356 | 37.8 |
Yokogawa_US | Scope1_Yokogawa_US_10X | BRO0117014_10x | 48.9 |
Yokogawa_US | Scope1_Yokogawa_US_20X_5Ch | BRO0117033_20xb | 57.8 |
Yokogawa_US | Scope1_Yokogawa_US_20X_5Ch | BRO0117056_20x | 50 |
Yokogawa_US | Scope1_Yokogawa_US_20X_5Ch_12Z | BRO0117056_20xb | 55.6 |
Yokogawa_US | Scope1_Yokogawa_US_20X_6Ch_BRO0117033 | BRO0117033_20x | 14.7 |
Yokogawa_US | Scope1_Yokogawa_US_20X_6Ch_BRO0117059 | BRO0117059_20X | 57.8 |
Yokogawa_US | Scope1_Yokogawa_US_20X_6Ch_BRO01177034 | BRO01177034_20x | 53.3 |
Yokogawa_US | Scope1_Yokogawa_US_40X_BRO0117059 | BRO0117059_40x | 43.3 |

Scope_vendor | Batch | Plate_Name | Percent_Matching |
---|---|---|---|
MolDev | Scope1_MolDev_10X | Plate2_PCO_6ch_4site_10XPA | 18.605 |
MolDev | Scope1_MolDev_10X | Plate3_PCO_6ch_4site_10XPA_Crest | 16.279 |
MolDev | Scope1_MolDev_10X_4siteZ | Plate3_PCO_6ch_4site_10XPA_Crestz | 20.93 |
MolDev | Scope1_MolDev_20X_4site | Plate3_PCO_6ch_4site_20XPA_Crestz | 16.279 |
MolDev | Scope1_MolDev_20X_9site | Plate2_PCO_6ch_9site_20XPA | 18.605 |
MolDev | Scope1_MolDev_20X_9site | Plate3_PCO_6ch_9site_20XPA_Crest | 13.953 |
MolDev | Scope1_MolDev_20X_Adaptive | Plate3_PCO_6ch_Adaptive_20XPA | 6.977 |
Nikon | Scope1_Nikon_10X | BR00117060a10x | 13.953 |
Nikon | Scope1_Nikon_10X | BR00117061a10x | 14.634 |
Nikon | Scope1_Nikon_10X | BR00117062a10x | 11.628 |
Nikon | Scope1_Nikon_10X | BR00117063b10x | 11.905 |
Nikon | Scope1_Nikon_20X | BR00117061a | 16.279 |
Nikon | Scope1_Nikon_20X | BR00117062a | 16.279 |
Nikon | Scope1_Nikon_20X | BR00117063b | 18.605 |
PE | Scope1_PE_Bin1_Confocal_1Plane | CP_Broad_Phenix_C_BIN1_1Plane_P1 | 16.279 |
PE | Scope1_PE_Bin1_Confocal_1Plane | CP_Broad_Phenix_C_BIN1_1Plane_P2 | 16.279 |
PE | Scope1_PE_Bin1_Confocal_1Plane | CP_Broad_Phenix_C_BIN1_1Plane_P3 | 18.605 |
PE | Scope1_PE_Bin1_Confocal_1Plane | CP_Broad_Phenix_C_BIN1_1Plane_P4 | 16.279 |
PE | Scope1_PE_Bin1_Confocal_3Plane | CP_Broad_Phenix_C_BIN1_P1 | 18.605 |
PE | Scope1_PE_Bin1_Confocal_3Plane | CP_Broad_Phenix_C_BIN1_P2 | 18.605 |
PE | Scope1_PE_Bin1_Confocal_3Plane | CP_Broad_Phenix_C_BIN1_P3 | 16.279 |
PE | Scope1_PE_Bin1_Confocal_3Plane | CP_Broad_Phenix_C_BIN1_P4 | 20.93 |
PE | Scope1_PE_Bin1_Widefield_1Plane | CP_Broad_Phenix_NC_BIN1_1Plane_P1 | 18.605 |
PE | Scope1_PE_Bin1_Widefield_1Plane | CP_Broad_Phenix_NC_BIN1_1Plane_P2 | 16.279 |
PE | Scope1_PE_Bin1_Widefield_1Plane | CP_Broad_Phenix_NC_BIN1_1Plane_P3 | 18.605 |
PE | Scope1_PE_Bin1_Widefield_1Plane | CP_Broad_Phenix_NC_BIN1_1Plane_P4 | 23.256 |
PE | Scope1_PE_Bin1_Widefield_3Plane | CP_Broad_Phenix_NC_BIN1_P1 | 16.279 |
PE | Scope1_PE_Bin1_Widefield_3Plane | CP_Broad_Phenix_NC_BIN1_P2 | 16.279 |
PE | Scope1_PE_Bin1_Widefield_3Plane | CP_Broad_Phenix_NC_BIN1_P3 | 18.605 |
PE | Scope1_PE_Bin1_Widefield_3Plane | CP_Broad_Phenix_NC_BIN1_P4 | 16.279 |
PE | Scope1_PE_Bin2_Confocal_1Plane | CPBroadPhenixC1PlaneP1 | 18.605 |
PE | Scope1_PE_Bin2_Confocal_1Plane | CPBroadPhenixC1PlaneP2 | 16.279 |
PE | Scope1_PE_Bin2_Confocal_1Plane | CPBroadPhenixC1PlaneP3 | 18.605 |
PE | Scope1_PE_Bin2_Confocal_1Plane | CPBroadPhenixC1PlaneP4 | 18.605 |
PE | Scope1_PE_Bin2_Confocal_3Plane | CPBroadPhenixCP1 | 20.93 |
PE | Scope1_PE_Bin2_Confocal_3Plane | CPBroadPhenixCP2 | 18.605 |
PE | Scope1_PE_Bin2_Confocal_3Plane | CPBroadPhenixCP3 | 18.605 |
PE | Scope1_PE_Bin2_Confocal_3Plane | CPBroadPhenixCP4 | 18.605 |
PE | Scope1_PE_Bin2_Widefield_1Plane | CPBroadPhenixNC1PlaneP1 | 18.605 |
PE | Scope1_PE_Bin2_Widefield_1Plane | CPBroadPhenixNC1PlaneP2 | 16.279 |
PE | Scope1_PE_Bin2_Widefield_1Plane | CPBroadPhenixNC1PlaneP3 | 16.279 |
PE | Scope1_PE_Bin2_Widefield_1Plane | CPBroadPhenixNC1PlaneP4 | 23.256 |
PE | Scope1_PE_Bin2_Widefield_3Plane | CPBroadPhenixNCP1 | 18.605 |
PE | Scope1_PE_Bin2_Widefield_3Plane | CPBroadPhenixNCP2 | 13.953 |
PE | Scope1_PE_Bin2_Widefield_3Plane | CPBroadPhenixNCP3 | 16.279 |
PE | Scope1_PE_Bin2_Widefield_3Plane | CPBroadPhenixNCP4 | 16.279 |
Yokogawa_Japan | Scope1_Yokogawa_Japan_20X | 20201021T092317 | 20.93 |
Yokogawa_Japan | Scope1_Yokogawa_Japan_40X | 20201020T134356 | 16.279 |
Yokogawa_US | Scope1_Yokogawa_US_10X | BRO0117014_10x | 18.605 |
Yokogawa_US | Scope1_Yokogawa_US_20X_5Ch | BRO0117033_20xb | 16.279 |
Yokogawa_US | Scope1_Yokogawa_US_20X_5Ch | BRO0117056_20x | 18.605 |
Yokogawa_US | Scope1_Yokogawa_US_20X_5Ch_12Z | BRO0117056_20xb | 23.256 |
Yokogawa_US | Scope1_Yokogawa_US_20X_6Ch_BRO0117033 | BRO0117033_20x | 26.087 |
Yokogawa_US | Scope1_Yokogawa_US_20X_6Ch_BRO0117059 | BRO0117059_20X | 20.93 |
Yokogawa_US | Scope1_Yokogawa_US_20X_6Ch_BRO01177034 | BRO01177034_20x | 20.93 |
Yokogawa_US | Scope1_Yokogawa_US_40X_BRO0117059 | BRO0117059_40x | 13.953 |
OMG SO EXCITING
Those numbers are all... shockingly consistent, especially within the 20X.
Is there anything we can say about the two plates that seem to do really poorly, Scope1_MolDev_20X_Adaptive and Scope1_Yokogawa_US_20X_6Ch_BRO0117033? QC issues, etc.?
Scope1_Yokogawa_US_20X_6Ch_BRO0117033 is not a complete plate: the data was uploaded only up to well D21. Scope1_MolDev_20X_Adaptive has a varying number of sites per well (some wells with 2 sites and some with 3).
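Both problems (wells missing past a certain point, and an uneven number of sites per well) are the kind of thing a quick completeness check could flag before any metric is computed. A minimal sketch, assuming a hypothetical list of `(well, site)` records per plate and a 384-well layout (rows A to P, columns 1 to 24); the record format and function names are illustrative and not taken from the project's notebooks:

```python
from collections import Counter
from string import ascii_uppercase


def expected_wells(n_rows=16, n_cols=24):
    """Well names for a plate layout; 16 x 24 (384 wells) is an assumption."""
    return [
        f"{ascii_uppercase[r]}{c:02d}"
        for r in range(n_rows)
        for c in range(1, n_cols + 1)
    ]


def plate_qc(site_records, wells=None):
    """Summarize missing wells and the spread of sites per well.

    site_records: iterable of (well, site) tuples, one per imaged site.
    """
    wells = wells if wells is not None else expected_wells()
    sites_per_well = Counter(well for well, _site in site_records)
    missing = [w for w in wells if w not in sites_per_well]
    site_counts = sorted(set(sites_per_well.values()))
    return {
        "missing_wells": missing,
        "site_counts": site_counts,
        "complete": not missing,
        "uniform_sites": len(site_counts) <= 1,
    }


# Toy example: a 1 x 4 "plate" where one well is missing and site counts vary.
records = [
    ("A01", 1), ("A01", 2),
    ("A02", 1), ("A02", 2), ("A02", 3),
    ("A03", 1), ("A03", 2),
]
report = plate_qc(records, wells=expected_wells(n_rows=1, n_cols=4))
print(report)
```

A plate that fails either check could still get a plate-level entry but be left out of grouped comparisons.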
Those both make a lot of sense as to why we'd see lower numbers! I'd argue the partial plate shouldn't be kept at all.
I agree, and I wasn't planning to include them in the group analysis (i.e., analysis of magnifications and modality, etc.), but I thought having a plate-level analysis would be good.
Absolutely, this is fantastic to have!
It's amazing to see SO much data in one place. What an awesome experiment! Tacos for @NasimJ !
The nice thing about the MOA plate is that unlike our JUMP-Target plates (which only have singlicates of most compounds), on JUMP-MOA we have multiple well locations for each compound.
Can you confirm whether Percent Matching is calculated as taking a given compound and checking all of its replicates for matching to all replicates of the "sister" (same-MOA) compound? I assume we calculate the average correlation values of all of those (even though within-plate is 'easier' to match than across-plate). I think this is a fine way to calculate it. It would be more stringent to only allow matching across plates not within-plate, but our goal here isn't studying plate to plate variation so I think that would be pointless.
For Percent Replicating, it's a more serious situation where plate layout effects can inflate Percent Replicating pretty badly. So here, I think there is a 'correct' answer for how to analyze: We should take each well of a compound and only measure correlation to other-well-positions of that compound (including within-plate AND across-plate all together is fine, for the same reasons as above, in my opinion). Is that how we've done it?
And then finally a Q about overall analysis design: should we be running a version where we attempt as much of an apples to apples comparison as we can? I.e. if one vendor took 15 images per well and another 3, we limit our analysis to 3 across all plates? We are not doing a bakeoff and yet if we publish the results from each vendor some may look artificially 'bad' and people might not get the nuance that it was a technical reason like number of sites that is the difference. I'm just wondering how you are thinking about these sorts of things, and I'm not sure which variables make sense to control for, beyond number of sites.
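One way to picture the "limit to the same number of sites" idea is to subsample sites per well before aggregating to a well-level profile. This is only a sketch under stated assumptions: the dictionary keys (`well`, `site`, `features`) are hypothetical, and averaging site-level feature vectors is a simplification of how profiles are actually aggregated.

```python
import random
from collections import defaultdict


def subsample_sites(site_profiles, n_sites, seed=0):
    """Keep at most n_sites randomly chosen sites per well.

    site_profiles: list of dicts with hypothetical keys
    "well", "site" and "features" (a list of floats).
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    by_well = defaultdict(list)
    for row in site_profiles:
        by_well[row["well"]].append(row)
    kept = []
    for well in sorted(by_well):
        rows = sorted(by_well[well], key=lambda r: r["site"])
        if len(rows) > n_sites:
            rows = rng.sample(rows, n_sites)
        kept.extend(rows)
    return kept


def well_means(site_profiles):
    """Average the feature vectors of all kept sites in each well."""
    by_well = defaultdict(list)
    for row in site_profiles:
        by_well[row["well"]].append(row["features"])
    return {
        well: [sum(col) / len(col) for col in zip(*vectors)]
        for well, vectors in by_well.items()
    }


# Toy data: well A01 imaged at 9 sites, well A02 at 3 sites.
sites = [{"well": "A01", "site": s, "features": [float(s), 1.0]} for s in range(1, 10)]
sites += [{"well": "A02", "site": s, "features": [float(s), 1.0]} for s in range(1, 4)]

matched = subsample_sites(sites, n_sites=3)
means = well_means(matched)
```

Wells that already have `n_sites` or fewer sites pass through unchanged, so only the more densely imaged plates are reduced.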
> Can you confirm whether Percent Matching is calculated as taking a given compound and checking all of its replicates for matching to all replicates of the "sister" (same-MOA) compound? I assume we calculate the average correlation values of all of those (even though within-plate is 'easier' to match than across-plate). I think this is a fine way to calculate it. It would be more stringent to only allow matching across plates not within-plate, but our goal here isn't studying plate to plate variation so I think that would be pointless.
Exactly, our goal here is not to study the plate to plate variations, so these are calculated within a plate.
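To make the within-plate pairing concrete, here is a minimal sketch of which wells get compared for a matching score: wells holding different compounds that share an MOA, all on one plate. The dictionary keys, the use of Pearson correlation with a median summary, and the fixed `threshold` are assumptions for illustration; the actual notebook may summarize differently and would normally derive its threshold from a null distribution.

```python
from itertools import combinations
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def matching_scores(wells):
    """Median correlation between wells of different compounds sharing an MOA.

    wells: list of dicts with hypothetical keys "compound", "moa", "features",
    all taken from a single plate.
    """
    by_moa = {}
    for a, b in combinations(wells, 2):
        # Replicates of the same compound are skipped: that is replicating, not matching.
        if a["moa"] == b["moa"] and a["compound"] != b["compound"]:
            by_moa.setdefault(a["moa"], []).append(pearson(a["features"], b["features"]))
    return {moa: median(values) for moa, values in by_moa.items()}


def percent_above(scores, threshold):
    """Percentage of groups whose score exceeds a given threshold."""
    return 100.0 * sum(s > threshold for s in scores.values()) / len(scores)


# Toy plate: MOA "X" has well-matched sister compounds, MOA "Y" does not.
plate = [
    {"compound": "c1", "moa": "X", "features": [1.0, 2.0, 3.0]},
    {"compound": "c1", "moa": "X", "features": [1.0, 2.0, 3.0]},
    {"compound": "c2", "moa": "X", "features": [2.0, 4.0, 6.0]},
    {"compound": "c3", "moa": "Y", "features": [1.0, 2.0, 3.0]},
    {"compound": "c4", "moa": "Y", "features": [3.0, 2.0, 1.0]},
]
scores = matching_scores(plate)
```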
> For Percent Replicating, it's a more serious situation where plate layout effects can inflate Percent Replicating pretty badly. So here, I think there is a 'correct' answer for how to analyze: We should take each well of a compound and only measure correlation to other-well-positions of that compound (including within-plate AND across-plate all together is fine, for the same reasons as above, in my opinion). Is that how we've done it?
For Percent Replicating, I believe it measures the median correlation of each compound's replicates within a plate only (not across plates). When we want to compare across plates (within one batch of a scope vendor), we would need to change the code so that same-well correlations can be excluded. @niranjchandrasekaran please correct me if I'm missing something.
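A rough sketch of what excluding same-well correlations could look like if replicates were ever pooled across plates. Everything here is illustrative: the dictionary keys, the `exclude_same_position` flag, and the fixed `threshold` are assumptions, not the existing code.

```python
from itertools import combinations
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def replicate_pairs(wells, exclude_same_position=False):
    """Yield pairs of wells that hold the same compound.

    wells: list of dicts with hypothetical keys "plate", "well",
    "compound" and "features".
    """
    for a, b in combinations(wells, 2):
        if a["compound"] != b["compound"]:
            continue
        if exclude_same_position and a["well"] == b["well"]:
            continue  # same well position on different plates shares layout effects
        yield a, b


def replicate_scores(wells, exclude_same_position=False):
    """Median pairwise replicate correlation per compound."""
    by_compound = {}
    for a, b in replicate_pairs(wells, exclude_same_position):
        by_compound.setdefault(a["compound"], []).append(pearson(a["features"], b["features"]))
    return {c: median(v) for c, v in by_compound.items()}


def percent_replicating(scores, threshold):
    """Percentage of compounds whose replicate score exceeds a threshold."""
    return 100.0 * sum(s > threshold for s in scores.values()) / len(scores)


# Toy data: the three A01 wells correlate perfectly with each other, B02 less so.
wells = [
    {"plate": "P1", "well": "A01", "compound": "c1", "features": [1.0, 2.0, 3.0]},
    {"plate": "P2", "well": "A01", "compound": "c1", "features": [2.0, 4.0, 6.0]},
    {"plate": "P3", "well": "A01", "compound": "c1", "features": [3.0, 6.0, 9.0]},
    {"plate": "P1", "well": "B02", "compound": "c1", "features": [1.0, 3.0, 2.0]},
]
pooled = replicate_scores(wells)
strict = replicate_scores(wells, exclude_same_position=True)
```

In this toy case the pooled score is pulled up by the same-position pairs, which is the inflation the question above is worried about.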
> And then finally a Q about overall analysis design: should we be running a version where we attempt as much of an apples to apples comparison as we can? I.e. if one vendor took 15 images per well and another 3, we limit our analysis to 3 across all plates? We are not doing a bakeoff and yet if we publish the results from each vendor some may look artificially 'bad' and people might not get the nuance that it was a technical reason like number of sites that is the difference. I'm just wondering how you are thinking about these sorts of things, and I'm not sure which variables make sense to control for, beyond number of sites.
About the overall analysis: so far I have only shared the per-plate analysis. I'm planning to dig into each scope vendor separately and compare different batches, and the plates within each batch. I won't compare between vendors; instead I'll compare, for example, different magnifications, binning, number of sites/fields per well, etc. within a scope vendor's plates, to make it more of an apples-to-apples comparison.
Ah - When I made my suggestions, I wasn't thinking about the fact that it seems you generally only have a single plate per condition! So forgive the Qs that made no sense in that context.
"When we want to compare across plates (within one batch of a scope vendor)" --> I'm not sure you definitely even want to do this, do you? We want to compare metrics across vendors/conditions but I didn't think we wanted to check for ability to match signatures across plates.
Also note that profilers are fiercely debating how best to judge profiling experiments with these various control plates. This morning we settled for John's project (which uses Target2, so it's a different setup) to compute:
Grouped by z_plane:
Wonderful to have all this data! I find this vis a bit hard to digest. I wonder if something like this will make comparisons easier across (except I drew as bar charts, should've done just the actual dots of data you have for each).
Nice drawing! :) What I've tried seems a bit crowded and hard to digest as well (even if I move the legend outside):
Perhaps the means and SD would be a better representation:
That is nicer to digest, though most journals do not want you to represent so few data points with a mean/SD, they want to see individual data points so ppl can see the raw data and be sure you're not hiding anything.
Going back to first principles, a visualization should answer a particular Q. In our case we could go high level:
So we could also go more granular:
Could we not do the mean/SD plot and then just add the points onto it? I like having it faceted the way that it currently is, personally.
It's your call! My rationale is there. It really comes down to: "how long does it take a person to answer the Qs they want to answer". I find it hard to answer the Qs w these plots but it might be because they are all giving super similar (and/or within the range of noise) answers and thus the answers are not very clear no matter how we plot them.
This is my attempt for the overall representation of the PE data:
8 batches Variables:
Similarities:
Overall view:
and we could then separate them (based on different columns or rows):
I can try other methods, if this is not good.
For each vendor, as for the PE group, I plot an overall view of the data in one plot. If there is more than one variable (magnification, binning, z_plane, modality) to compare between the batches of that vendor's data, I also plot them faceted. This is followed by the mean and standard deviation of the percent replicating and percent matching values calculated for the plates in that vendor's dataset.
Feature selection for the profiles was done as usual (at the batch level).
All analyses are done per plate (4 replicates per plate): I've calculated percent replicating and percent matching for these 4 replicates within each plate, and each data point in the overall plot (percent replicating vs. matching) represents one plate in the vendor's dataset. So, for the analysis, each plate is independent of the other plates.
The incomplete plates, one from Yokogawa_US (Scope1_Yokogawa_US_20X_6Ch_BRO0117033) and one from Molecular Devices (Scope1_MolDev_20X_Adaptive), are removed and not included in these plots.
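For reference, the exclusion amounts to a simple filter on the batch column of the per-plate results. A minimal sketch, with a handful of rows copied from the two tables at the top of this issue; the tuple layout and variable names are just for illustration:

```python
# Batches dropped from the grouped plots, as named in this thread.
EXCLUDED_BATCHES = {
    "Scope1_Yokogawa_US_20X_6Ch_BRO0117033",
    "Scope1_MolDev_20X_Adaptive",
}

# (vendor, batch, plate, percent_replicating, percent_matching)
rows = [
    ("MolDev", "Scope1_MolDev_20X_9site", "Plate2_PCO_6ch_9site_20XPA", 56.7, 18.605),
    ("MolDev", "Scope1_MolDev_20X_Adaptive", "Plate3_PCO_6ch_Adaptive_20XPA", 17.8, 6.977),
    ("Yokogawa_US", "Scope1_Yokogawa_US_20X_6Ch_BRO0117033", "BRO0117033_20x", 14.7, 26.087),
    ("Yokogawa_US", "Scope1_Yokogawa_US_20X_6Ch_BRO0117059", "BRO0117059_20X", 57.8, 20.93),
]

# Keep only plates whose batch is not on the exclusion list.
kept = [row for row in rows if row[1] not in EXCLUDED_BATCHES]

for vendor, batch, plate, replicating, matching in kept:
    print(f"{vendor}\t{batch}\t{plate}\t{replicating}\t{matching}")
```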
4 batches Variables:
Similarities:
Overall view:
Faceted view:
Means and SD:
Faceted Mean and SD:
2 batches Variables:
Similarities:
Overall view:
Nikon has only two batches (one parameter to compare: magnification; all the other parameters are the same between these two batches), so there is no faceted plot for individual plates.
Faceted view:
Means and SD:
2 batches Variables:
Similarities:
6 batches Variables:
Similarities:
Overall view:
Faceted view:
Means and SD: Data points with no error bar indicate that there is only one plate for that batch.
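A small sketch of why single-plate batches have no error bar: a standard deviation needs at least two values. The Percent_Replicating values below are copied from the table at the top of this issue; using the sample standard deviation (`statistics.stdev`) is an assumption, since the thread does not say which form was plotted.

```python
from collections import defaultdict
from statistics import mean, stdev

# (batch, percent_replicating) pairs for two Yokogawa_US batches.
values = [
    ("Scope1_Yokogawa_US_20X_5Ch", 57.8),
    ("Scope1_Yokogawa_US_20X_5Ch", 50.0),
    ("Scope1_Yokogawa_US_10X", 48.9),
]


def summarize(pairs):
    """Return {batch: (mean, sd)}; sd is None when a batch has one plate."""
    grouped = defaultdict(list)
    for batch, value in pairs:
        grouped[batch].append(value)
    return {
        batch: (mean(vals), stdev(vals) if len(vals) > 1 else None)
        for batch, vals in grouped.items()
    }


summary = summarize(values)
for batch, (avg, sd) in summary.items():
    sd_text = "n/a (single plate)" if sd is None else f"{sd:.2f}"
    print(f"{batch}: mean={avg:.1f}, SD={sd_text}")
```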
Thanks, this is great! Obviously we'll need to eventually systematize/theme everything but great to just have all the comparisons so far.
> Nikon has only two batches (one parameter to compare: magnification; all the other parameters are the same between these two batches), so there is no faceted plot for individual plates.
Sites per well is different, no? Also in the Yoko US and Japan? I think we need to be really clear on that, even where we aren't faceting on it, since it's one of the major hypotheses we intend to test as to WHY sometimes these values are different.
> Sites per well is different, no?
Correct, but the magnification and the other variable (for Nikon sites per well, and for Yoko Japan sites per well and z_plane) are associated, so it would still be one representative data point on the percent matching vs replicating plot. For the Yoko US, I'll make an additional faceted plot for the sites per well.
Yeah, I understand we aren't going to facet on it, but would be nice to just have in the legend
Goals:
Conclusions:
Scope Vendors:
Additional information on scopes and imaging details can be found here.
Experimental details:
Plate map and compounds: JUMP-MOA plate map and compounds were used (90 compounds from 47 distinct MOA classes, with 4 replicates per compound)