Closed shntnu closed 2 years ago
@jatinarora-upmc asked
My specific question is about znf436, which is associated with cell solidity and nuclei granularity. If you look at the image of the cell lines carrying a variant in znf436 (the bottom one, in the red circle), there are a lot more cells than in the wild type (upper one). Now, given that cell count is measured after plating, fixing and staining AND there was not enough time for the cells to proliferate (I guess), I am confused whether the observed higher count is due to disrupted znf436 (which makes biological sense) or to the number of cells plated at the beginning (for which we don't have information).
@jatinarora-upmc I don't recollect that there was enough time for the cells to proliferate (@mtegtmey might be able to confirm)
If there wasn't enough time, the cell counts you have in the _count.csv files should be directly proportional to the plating density.
@shntnu I am summarizing a few points (sorry for the redundancy):
Does it make sense?
3. So, it is difficult (maybe not possible with the given dataset) to disentangle the technical differences
Can you make use of the fact that we have 8 (technical) replicates of each cell line?
@shntnu oh yeah, that's a good thing to remember. I guess if we have a high cell count across all replicates for a cell line, then it is more likely to be that cell line's intrinsic property, right?
I would think so. However, it is possible that all replicates were plated at the same time, which could drive their cell counts to be similar (because they all had the same amount of time to grow), so one may not be able to fully untangle this. Worth checking with Emily (she isn't on GitHub).
@jatinarora-upmc my hunch would be that the higher cell counts per replicate are more closely linked to the cell count at plating, as opposed to a cell-line-specific phenotype (i.e. proliferation). It would be very tough to untangle this without a companion proliferation screen to compare against.
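The replicate argument above can be made concrete with a small sketch. It is written in base R on simulated counts; the column names (cell_line, replicate, cell_count), the numbers, and the 0.15 cutoff are assumptions for illustration, not the layout of the actual _count.csv files. It summarizes each cell line's count across its 8 technical replicates with a median and a coefficient of variation (CV), so a line that is high in every replicate can be told apart from a line with one unusually dense well.

```r
# Simulated data (assumption): 3 cell lines x 8 technical replicates.
set.seed(1)
counts <- data.frame(
  cell_line  = rep(c("line_A", "line_B", "line_C"), each = 8),
  replicate  = rep(1:8, times = 3),
  cell_count = c(
    rpois(8, 900),        # line_A: high in every replicate
    rpois(8, 400),        # line_B: typical count
    rpois(7, 400), 1500   # line_C: typical, plus one very dense well
  )
)

# Per-line median and coefficient of variation across replicates.
med <- tapply(counts$cell_count, counts$cell_line, median)
cv  <- tapply(counts$cell_count, counts$cell_line,
              function(x) sd(x) / mean(x))

per_line <- data.frame(
  cell_line    = names(med),
  median_count = as.vector(med),
  cv           = as.vector(cv)
)

# Illustrative cutoff (assumption): call a line "consistent" if CV < 0.15.
per_line$consistent <- per_line$cv < 0.15
print(per_line)
```

A low CV with a high median only says the count is reproducible. As noted above, replicates plated together share the same plating density, so this check alone cannot separate plating from proliferation.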
@shntnu @mtegtmey these plots show the number of cells imaged per replicate for cell lines with and without rare variants in certain genes. There are a few observations here:
@shntnu @raldanehme @AnneCarpenter @bethac07 @mtegtmey tagging you here for follow up on znf436's associations.
In the last v2f call, we wondered whether the observed associations of rare variant burden in znf436 with Cells_AreaShape_Zernike_5_1 (effect size = -1.44) and Cells_AreaShape_Solidity (effect size = -1.05) in intermediate cells (1 to 3 neighbors) were confounded by differential cell count per cell line. So I looked at znf436's associations in isolated cells (0 neighbors), and it is clearly associated with cell area and shape properties there as well (stronger than just p < 0.05). Here are a few of those associations.
This lends strong support to the idea that znf436's associations with cell area and shape in intermediate cells are not confounded. Q1: Agreed? Q2: I know higher-order Zernike moments carry less information, but I am confused about what the difference between Zernike_0_0 and higher orders (6_4 or 3_1) would be. I am trying to understand the opposite effects of znf436's disruption on these traits.
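A complementary way to probe the confounding question, different from the neighbor-based stratification used in this thread, is to put cell count into the model as a covariate. The sketch below uses simulated data only: the variable names (burden, cell_count, solidity) and every effect size are assumptions made up for illustration, and the real association model for this project may be specified differently.

```r
# Simulated data (assumption): carriers are more crowded, and the shape
# feature depends on both crowding and carrier status.
set.seed(42)
n <- 200
burden     <- rbinom(n, 1, 0.3)                    # 1 = carries rare variant
cell_count <- rnorm(n, mean = 500 + 300 * burden, sd = 80)
solidity   <- 0.8 - 0.0002 * cell_count - 0.05 * burden +
  rnorm(n, sd = 0.02)

# Unadjusted model: the burden effect absorbs the crowding effect.
fit_unadj <- lm(solidity ~ burden)

# Adjusted model: cell count enters as a covariate.
fit_adj <- lm(solidity ~ burden + cell_count)

# Compare the burden coefficient before and after adjustment.
round(c(unadjusted = unname(coef(fit_unadj)["burden"]),
        adjusted   = unname(coef(fit_adj)["burden"])), 3)
```

If the burden coefficient stays clearly non-zero after adjustment, that is consistent with an effect that is not explained by cell count alone, under the assumption that the count effect is roughly linear.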
Q1: it's a great sign that you still see the association even in isolated cells. I think if there are tons of cells, the isolated cells are still affected in the sense that they don't have much place to go without touching so if they don't want to touch others they might stay pretty rounded. In other words, cells don't need to physically touch each other to still be influenced by them.
Q2: Here is a guide to the Zernikes: https://en.wikipedia.org/wiki/File:Zernike_polynomials2.png Zernike0_0 should honestly have almost perfect correlation with one of the more commonly named shape metrics because it's really asking whether the cell matches a circle shape. For 3_1 you look at that pyramid for the one that says Z with a 1 on top and a 3 on the bottom (I think). You can see it has a red and blue stripe at the edges, and a red and blue blob in the middle. What this means: picture the shape of the cell superimposed on top… it will score high for this Zernike the more blue is covered and the more red you see - our cells aren't allowed to have holes in them, so i can imagine two cell shapes that would score highly: one is almost a perfect circle but just a little flattened at the red side. The other would be almost a crescent such that the middle red blob is exposed (but it’s not a great fit because a big chunk wouldn’t align well). 6_4 isn’t shown but you can follow the right hand side of the pyramid and see it would be mostly a circle with wiggly edges (probably not far off from a circle!). I'm a bit surprised that they'd be anticorrelated to 0_0, really.
@AnneCarpenter this was a great explanation of Zernike moments, and it really helped me. Thanks so much! Q2: So cells become rounder (higher Zernike 0_0 and lower 3_1 and 6_4) with rare variant burden in znf436. Zernike_0_0 is not positively correlated with all other moments. Indeed, Zernike 0_0 has correlation > 0 with moments such as 2_2, 3_3, 6_6, etc., and correlation < 0 with other Zernike moments such as 3_1, 8_6, etc.
@jatinarora-upmc also see this note https://github.com/broadinstitute/cmQTL/issues/32#issuecomment-648410790 which has some more intuitions about Zernike
In https://github.com/broadinstitute/cmQTL/issues/44#issuecomment-648384595 we see that both 6_4 and 3_1 have good replicate correlation (i.e. repeated measurements of the same features in the same cell line tend to be similar, across the experiment), so you're in good shape here.
Below, we see that 0_0 is correlated with other Area_Shape features (when looking at the well-level averaging of profiles); Extent is the easiest to explain:
Extent: The proportion of the pixels (2D) or voxels (3D) in the bounding box that are also in the region. Computed as the area/volume of the object divided by the area/volume of the bounding box.
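As a worked illustration of the Extent definition above, here is a minimal base R sketch that computes it from a binary mask. The function name `extent` and the toy masks are assumptions for illustration; this is not CellProfiler's implementation, only the same area-over-bounding-box ratio.

```r
# Extent = object area / bounding-box area, for a 2D logical mask.
extent <- function(mask) {
  idx <- which(mask, arr.ind = TRUE)            # row/col of object pixels
  bbox_height <- diff(range(idx[, "row"])) + 1
  bbox_width  <- diff(range(idx[, "col"])) + 1
  sum(mask) / (bbox_height * bbox_width)
}

# A filled rectangle fills its bounding box completely.
rect_mask <- matrix(FALSE, 20, 20)
rect_mask[5:10, 3:15] <- TRUE

# An L-shape: a 10 x 10 square with one 5 x 5 corner removed.
l_mask <- matrix(TRUE, 10, 10)
l_mask[1:5, 6:10] <- FALSE

# A filled disc of radius 40 pixels; expected to be close to pi / 4.
disc_mask <- outer(1:101, 1:101,
                   function(r, c) (r - 51)^2 + (c - 51)^2 <= 40^2)

c(rectangle = extent(rect_mask),
  l_shape   = extent(l_mask),
  disc      = extent(disc_mask))
```

A round cell therefore has an Extent near pi/4 (about 0.785), which gives some intuition for why Extent and Zernike_0_0 can move together.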
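library(tidyverse)
library(magrittr)  # provides the %<>% assignment pipe

df <- read_csv("1.profile-cell-lines/profiles/BR00106708_augmented.csv")

# Cells_AreaShape features, excluding Center_Z and all Zernike columns
df1 <-
  df %>%
  select(matches("Cells_AreaShape")) %>%
  select(!matches("Cells_AreaShape_Center_Z")) %>%
  select(!matches("Cells_AreaShape_Z"))

# Zernike columns only
df2 <-
  df %>%
  select(matches("Cells_AreaShape_Z"))

# Zernike_0_0 alongside the non-Zernike AreaShape features
df3 <-
  bind_cols(df2 %>%
              select(Cells_AreaShape_Zernike_0_0),
            df1)

names(df3) <- str_remove_all(names(df3), "Cells_AreaShape_")

# Long format: one row per (Zernike_0_0, other feature value) pair
df3 %<>%
  pivot_longer(-Zernike_0_0)

# geom_hex() needs the hexbin package to be installed
ggplot(df3, aes(Zernike_0_0, value)) +
  geom_hex() +
  facet_wrap(~name, scales = "free_y")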
Also, I confirm that those 3 Zernike features are indeed (anti)correlated, but I don't have an intuition as to why:
# Pairwise scatterplots and correlations for the three Zernike features
df4 <-
  df %>%
  select(Cells_AreaShape_Zernike_0_0,
         Cells_AreaShape_Zernike_3_1,
         Cells_AreaShape_Zernike_6_4)

GGally::ggpairs(df4)
@shntnu oh that thread about cell zernike with soumya and you is even pasted in my notebook. It was really helpful in seeding the concept of morphology in my mind. Thanks much for tagging it here :)
Nice, I usually keep features with replicate correlation > 0.5, and it is great that 6_4 and 3_1 pass that threshold.
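For readers unfamiliar with the replicate-correlation filter mentioned here, a rough sketch follows. It assumes one particular definition (the median pairwise Pearson correlation between replicate columns, computed across cell lines); the exact definition used in the linked issue is not restated in this thread, so treat the function `replicate_correlation` and the simulated features as illustrative only.

```r
# One possible definition (assumption): median pairwise Pearson correlation
# between replicates, for a matrix with one row per cell line and one
# column per technical replicate.
replicate_correlation <- function(values) {
  cors <- cor(values)
  median(cors[upper.tri(cors)])
}

# Simulated features (assumption): 50 cell lines x 8 replicates each.
set.seed(7)
n_lines <- 50
n_reps  <- 8
line_effect <- rnorm(n_lines)   # true per-line signal

# A reproducible feature: line signal plus modest replicate noise.
reproducible <- sapply(seq_len(n_reps),
                       function(i) line_effect + rnorm(n_lines, sd = 0.5))
# A noisy feature: no shared signal between replicates.
noisy <- sapply(seq_len(n_reps), function(i) rnorm(n_lines))

scores <- c(reproducible = replicate_correlation(reproducible),
            noisy        = replicate_correlation(noisy))

# Keep features above the 0.5 threshold mentioned in the thread.
keep <- names(scores)[scores > 0.5]
keep
```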
So, it seems the concern about znf436's associations being confounded by cell count is considerably reduced now. I would also cross-check them by taking proliferation time into account.
From: Arora, Jatin jarora1@bwh.harvard.edu
Hello Anne, and everyone, May I ask for a little more guidance? You might remember that rare variant burden in znf436 was associated with multiple morphology traits. Znf436 is a component of MAPK pathways, and negatively regulates the response to external stimuli and cell proliferation. I am trying to make sense of two associations of znf436: nuclei granularity measured from Brightfield, and Cell AreaShape Zernike 5_1. I have attached a slide showing the associations and well images. (a) I am thinking that disruption of znf436's function would decrease its regulatory effect in MAPK, leading to higher proliferation. Brightfield might not be very informative, but would it be safe to say that nuclei granularity from Brightfield captures transcriptional activity? (b) Zernike 5_1 is a high-order measurement, and might not be detectable by eye in images (please correct me if I am wrong). Given the two associations, does it make sense to say that, due to higher proliferation, cells squeeze to accommodate neighbors, which results in a lower Cell AreaShape Zernike 5_1?
From: Beth Cimini bcimini@broadinstitute.org
I'm not sure we can realistically say either of those things with any high degree of confidence; a lot of things might change granularity, particularly in the brightfield channel, and if the cells are more "squeezed" I would feel much more comfortable saying that based on something like area or axis length rather than a hard-to-interpret Zernike.
From: Anne Carpenter anne@broadinstitute.org
I agree with what Beth said. It's worth noting that we have seen strong signatures of MAPK pathways in past experiments, when overexpressing genes (rather than mutating as in your project). So that is not a surprise and is reassuring - you can search our attached prior paper for MAPK and read more. From a big picture, when the cells are clearly so crowded as they are I would say the dominant phenotype is proliferation and any morphology changes probably have to do with cells being crowded rather than an inherent phenotype. But indeed, the Zernike metric that scored is about the cells being almost-round with just one side of the cell not being perfectly filled out to the edge of the circle. I do think this is a cohesive sensible story for this gene! It's not a very morphologically complex/unique story but at least it is sensible.
From: Ralda Nehme rnehme@broadinstitute.org
Do you see an association when examining only single, isolated cells (i.e. ones with no neighbors)? If the phenotype is related to density (cells being squeezed because of being crowded) then you wouldn't expect to see it in single/isolated cells. Otherwise, it could be "inherent".
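The isolated-cell check suggested here can be sketched as a simple stratified test. Everything in the block is simulated: the column names (burden, n_neighbors, solidity) and the effect sizes are assumptions for illustration, and a two-sample t-test stands in for whatever association model the project actually uses.

```r
# Simulated single-cell data (assumption): the shape feature has an
# inherent burden effect plus a crowding effect from neighbors.
set.seed(3)
n <- 400
cells <- data.frame(
  burden      = rbinom(n, 1, 0.5),             # 1 = carries rare variant
  n_neighbors = sample(0:5, n, replace = TRUE)
)
cells$solidity <- 0.9 - 0.05 * cells$burden -
  0.01 * cells$n_neighbors + rnorm(n, sd = 0.03)

# Restrict to isolated cells (0 neighbors) and test the association there.
isolated <- subset(cells, n_neighbors == 0)
test_isolated <- t.test(solidity ~ burden, data = isolated)

# Difference in group means (carriers minus non-carriers) and p-value.
c(mean_diff = unname(diff(test_isolated$estimate)),
  p_value   = test_isolated$p.value)
```

An association that persists in the isolated subset argues against direct contact as the explanation, though cells without touching neighbors can still sit in a crowded well.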
From: Arora, Jatin jarora1@bwh.harvard.edu
Thanks so much everyone for your knowledge, and also for the paper on mapk.