@mtegtmey please upload the images here /imaging/analysis/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images
Images are uploaded!
For my notes, because I keep looking around for the new instructions to upload from login01:
cd /imaging/analysis/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images
mv "Matt T Cell Painting"* BR_NCP_PROGENITORS_1 # rename the image folder to a standard name (the * must sit outside the quotes so the shell expands it)
# now edit this line
# <PlateID>Matt T Cell Painting LM 12012020</PlateID>
# to this
# <PlateID>BR_NCP_PROGENITORS_1</PlateID>
emacs BR_NCP_PROGENITORS_1/Images/Index.idx.xml
reuse UGER
ish -l h_vmem=4G -pe smp 4 # get a node
workon cellpntg2 # or whatever env in which you've installed awscli
aws configure # verify you're in the right account
aws s3 sync \
/imaging/analysis/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images \
s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images
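The manual PlateID edit in the notes above could also be scripted. This is only a minimal sketch, assuming the file contains plain `<PlateID>...</PlateID>` elements as in the two commented lines; the function name `set_plate_id` and the `<Plate>` wrapper in the sample string are hypothetical and not taken from the real Index.idx.xml.

```python
import re


def set_plate_id(index_xml: str, new_plate_id: str) -> str:
    """Replace the text of every <PlateID> element with new_plate_id."""
    # Only the text between the tags changes; everything else is left untouched.
    return re.sub(
        r"(<PlateID>)[^<]*(</PlateID>)",
        lambda m: m.group(1) + new_plate_id + m.group(2),
        index_xml,
    )


# Hypothetical one-line sample standing in for the real index file.
old = "<Plate><PlateID>Matt T Cell Painting LM 12012020</PlateID></Plate>"
new = set_plate_id(old, "BR_NCP_PROGENITORS_1")
print(new)
```

To apply it to the real file, one would read BR_NCP_PROGENITORS_1/Images/Index.idx.xml into a string, call the function, and write the result back, ideally after keeping a backup copy.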
Transfer is underway
@pearlryder The images are ready for analysis. They live on /imaging/analysis at
/imaging/analysis/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images
and also on S3
s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images
Feel free to pull in from either location
I think a good starting point to analyze these neuronal progenitor cells (Day 4) would be the pipeline used to analyze stem cells. For stem cells (a.k.a. NCP_STEM_1), I had reused an existing pipeline as mentioned here https://github.com/broadinstitute/neuronal-cell-painting/issues/7#issuecomment-727283749
I can run DCP once the pipeline is configured if you prefer.
@mtegtmey could you comment on the priority for this one? Would bumping it to the new year work?
@shntnu it is high-priority, but bumping to the new year would be fine! For me, it would be ideal to try having profiles and 'feature differentials' (however you refer to them) by mid-Feb if that seems possible.
Thanks @mtegtmey ! @pearlryder feel free to make a call on prioritizing based on this info
Thanks @mtegtmey! We're going to try to process this data before the end of this year, but it's great to know that we won't be holding you back too terribly if we need to wait until January. I'll keep you updated with our progress -- you can expect to hear from me by the end of next week. Cheers!
Hi @mtegtmey and team,
I wanted to update everyone that we did have time to process these images and extracted the data over the weekend. We'll start the process of analyzing the data when I return to work in the New Year.
I hope everyone has a very happy holiday!
@pearlryder thank you so much for the update, and all your hard work getting to this point! Ralda and I so much appreciate the work all of you have done on this project, and cannot wait for all the exciting science we will get to do together over the coming years. It's a collaboration we value tremendously.
Have a wonderful holiday, 'see' you in the new year!
@pearlryder you can stop at the collate step i.e. just before https://cytomining.github.io/profiling-handbook/create-profiles.html#annotate and I'll handle things downstream
Thanks @shntnu! I should have everything uploaded to AWS by the EOD tomorrow. I'll ping you here when it's ready.
@shntnu, the analysis files are now available at s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/workspace/backend/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/.
The per-site analysis files are available at s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/
In double checking the .csv file, I noticed that 4 wells are missing data: F11, F12, O18, and P18. I looked at several images for these wells and confirmed that the wells appear to be empty / contain debris only. Please let me know if you have any questions!
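A check like the one described above can be sketched as follows. This is a hedged illustration, not the procedure actually used: it assumes wells are named with a row letter and a zero-padded column (e.g. `F11`) on a standard 384-well plate (rows A-P, columns 01-24), and the `observed` list is built by hand here rather than read from the real .csv.

```python
from string import ascii_uppercase


def missing_wells(observed, n_rows=16, n_cols=24):
    """Return the wells of a plate (default 384-well) that are absent from observed."""
    expected = [
        f"{row}{col:02d}"
        for row in ascii_uppercase[:n_rows]
        for col in range(1, n_cols + 1)
    ]
    seen = set(observed)
    return [well for well in expected if well not in seen]


# Illustrative input: every well except the four reported as empty in this thread.
all_wells = missing_wells([])
observed = [w for w in all_wells if w not in {"F11", "F12", "O18", "P18"}]
print(missing_wells(observed))  # ['F11', 'F12', 'O18', 'P18']
```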
Awesome! Thanks @pearlryder
I noticed that the SQLite file is 100 GB (BR_NCP_STEM_1 was 25 GB). Was the cell density high?
Yes @shntnu, most of the wells I examined were confluent. I just checked a few images from NCP_STEM_1 and they are indeed much lower density than the BR_NCP_PROGENITORS images (maybe ~ 50-75% confluency).
@shntnu This is something we should expect. The conditions for the NPCs are 15k cells per well with a 24hr incubation period (so they may proliferate) compared to 10k cells with a 6hr incubation for the stem cells.
Thanks @mtegtmey @pearlryder for clarifying!
@mtegtmey To get this off the ground: is there any specific advantage in starting with an analysis of the 4 branching metrics alone? Or would you rather just have the entire profile (4000+ features)?
@shantanu Ideally all 4000+ features
Sounds good
PS – you are tagging the wrong Shantanu :D This is a private repo so we are good. I'm @shntnu
I am looking at the D4 data. It seems right now that the inter-human variation is greater than the difference between controls and deletion in the progenitors.
If subjects 5,6 and 33 are removed:
However, there are still features which distinguish deletions from controls. There were 122 features which were statistically significant in control vs deletion in both stem cells and progenitors (out of 300+ features effective). Of those about 75% went in the same direction. I'll do some supervised methods and linear models to see if we can reliably distinguish controls from deletions despite the inter-human variation.
@ruifanp We wanted to see sample images for this plate
Please follow the steps here to do so
https://github.com/cytomining/cytoplot/issues/8#issuecomment-619273730
Ping me when you are stuck because I bet there are missing pieces of info
Oh, you will first need to download the images of course
Point the datasets table at this batch (NCP_PROGENITORS_1):
datasets <-
  tribble(
    ~batch, ~plate,
    "NCP_PROGENITORS_1", "BR_NCP_PROGENITORS_1"
  )
Note that you will need to run these lines on the command line to download the images: https://github.com/broadinstitute/neuronal-cell-painting/blob/4ebc15074bb05a7e7ea09fe2a041d1a368d0a8a4/1.run-workflows/3.select_images_to_print.Rmd#L142-L158
@ruifanp can you please have this https://github.com/broadinstitute/neuronal-cell-painting/issues/10#issuecomment-886546515 squared away this week and tag @mtegtmey when you're done?
@ruifanp @shntnu any updates to this? I want to push a repeat experiment ASAP if necessary.
This was waiting on me over the last few days; sorry!
We needed to restore files that had gotten archived.
I am doing so now using this script https://github.com/broadinstitute/imaging-backup-scripts/blob/master/restore_intelligent.py
Will report back and then cc Beth to let her know it worked.
/usr/local/opt/python@3.9/bin/python3.9 \
restore_intelligent.py \
imaging-platform \
projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1
Output (snapshot after 1 minute)
20750 total files found pre-filtering
20750 total files remain post-filtering
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/illum/
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/illum/BR_NCP_PROGENITORS_1/
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/illum/BR_NCP_PROGENITORS_1/cp.is.done
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images/BR_NCP_PROGENITORS_1/FFC_Profile/FFC_Profile_Measurement 2.xml
Sent 100 restore requests
...
The next day
...
Sent 20200 restore requests
Sent 20300 restore requests
Sent 20400 restore requests
Sent 20500 restore requests
Sent 20600 restore requests
Sent 20700 restore requests
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images/images
Sent all restore requests
These requests worked, but the requests for several of the outline images did not:
/usr/local/opt/python@3.9/bin/python3.9 \
restore_intelligent.py \
imaging-platform \
projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/ \
--filter_in "png" --filter_out "csv"
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-H15-5/outlines/H15_s5--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-L01-4/outlines/L01_s4--cell_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-E04-6/outlines/E04_s6--cell_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-K01-8/outlines/K01_s8--cell_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-O06-8/outlines/O06_s8--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-E03-1/outlines/E03_s1--cell_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-L05-8/outlines/L05_s8--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-G18-7/outlines/G18_s7--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-F08-1/outlines/F08_s1--cell_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-C22-5/outlines/C22_s5--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-F11-1/outlines/F11_s1--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-P18-9/outlines/P18_s9--cell_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-M07-9/outlines/M07_s9--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-G02-6/outlines/G02_s6--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-P02-2/outlines/P02_s2--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-H01-2/outlines/H01_s2--cell_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-F08-6/outlines/F08_s6--nuclei_outlines.png
Could not restore object projects/2019_05_28_Neuronal_Cell_Painting/workspace/analysis/NCP_PROGENITORS_1/BR_NCP_PROGENITORS_1/analysis/BR_NCP_PROGENITORS_1-G01-1/outlines/G01_s1--nuclei_outlines.png
Channel maps
<Entry ChannelID="1">
<FlatfieldProfile>{Background: {Character: NonFlat, Mean: 276.82462, NoiseConst: 4.4794638, NonFlatness: {Corrected: 0.061573386, Original: 0.45784867, Random: 0.026914451}, Profile: {Coefficients: [[1.1537], [-0.1952, -0.0629], [-0.786, 0.0164, -0.9893], [0.4016, 0.5166, 0.2448, 0.0266], [-0.9321, 0.2253, 1.0641, -0.2648, -0.1231]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Channel: 1, ChannelName: Alexa 647, Foreground: {Character: NonFlat, NonFlatness: {Original: 0.61926347, Random: 0.038961733}, Profile: {Coefficients: [[1.2289], [-0.122, -0.2808], [-0.9349, 0.1698, -1.8379], [0.5493, 0.9854, 0.1863, 0.4485], [-1.6299, 0.0643, 1.7424, -0.8028, 0.8358]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Version: Acapella:2013}</FlatfieldProfile>
</Entry>
<Entry ChannelID="2">
<FlatfieldProfile>{Background: {Character: NonFlat, Mean: 219.39228, NoiseConst: 2.0131553, NonFlatness: {Corrected: 0.032956451, Original: 0.36471289, Random: 0.027722657}, Profile: {Coefficients: [[1.1399], [-0.0477, -0.058], [-0.8334, 0.131, -0.8225], [0.0643, 0.3933, 0.061, -0.0106], [-0.6338, -0.4985, 1.387, -0.1587, -0.2868]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Channel: 2, ChannelName: Alexa 568, Foreground: {Character: NonFlat, NonFlatness: {Original: 0.64135826, Random: 0.03817036}, Profile: {Coefficients: [[1.2321], [-0.271, -0.2332], [-0.936, 0.3731, -1.8427], [0.4359, 0.9875, 0.4462, 0.2329], [-1.9193, -1.1965, 1.9475, -0.8998, 0.7939]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Version: Acapella:2013}</FlatfieldProfile>
</Entry>
<Entry ChannelID="3">
<FlatfieldProfile>{Background: {Character: NonFlat, Mean: 624.45183, NoiseConst: 1.3, NonFlatness: {Corrected: 0.031488486, Original: 0.60247177, Random: 0.017767787}, Profile: {Coefficients: [[1.242], [-0.0973, -0.0543], [-1.4894, 0.1219, -1.5797], [0.0735, 0.1321, 0.319, -0.0946], [-0.2274, -0.3055, 2.4095, -0.2666, -0.0135]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Channel: 3, ChannelName: 488 long, Foreground: {Character: NonFlat, NonFlatness: {Original: 0.70948625, Random: 0.038870561}, Profile: {Coefficients: [[1.2971], [-0.1707, -0.0873], [-1.9097, 0.314, -2.1103], [0.3832, 0.1762, 0.4525, 0.0455], [0.6746, -1.0848, 2.6936, -0.5221, 0.8574]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Version: Acapella:2013}</FlatfieldProfile>
</Entry>
<Entry ChannelID="4">
<FlatfieldProfile>{Background: {Character: NonFlat, Mean: 805.56198, NoiseConst: 1.3, NonFlatness: {Corrected: 0.033651829, Original: 0.5953663, Random: 0.024583519}, Profile: {Coefficients: [[1.2396], [-0.0898, -0.0492], [-1.5413, 0.1165, -1.4906], [0.1098, 0.1265, 0.2704, 0.0067], [0.2785, -0.0943, 1.7308, -0.4344, -0.1939]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Channel: 4, ChannelName: Alexa 488, Foreground: {Character: NonFlat, NonFlatness: {Original: 0.77742112, Random: 0.038704636}, Profile: {Coefficients: [[1.3157], [-0.1696, -0.073], [-1.9486, 0.2803, -1.9417], [0.4557, 0.1292, 0.3916, 0.1484], [-0.0643, -0.964, 3.0171, -0.4453, -0.9293]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Version: Acapella:2013}</FlatfieldProfile>
</Entry>
<Entry ChannelID="5">
<FlatfieldProfile>{Background: {Character: NonFlat, Mean: 619.83839, NoiseConst: 2.3488395, NonFlatness: {Corrected: 0.044228606, Original: 0.69694823, Random: 0.021729838}, Profile: {Coefficients: [[1.2188], [-0.1118, 0.0117], [-0.8487, 0.0394, -1.0215], [-0.0525, 0.1703, 0.2001, -0.2294], [-2.9741, -0.5146, 0.6968, 0.1254, -2.4524]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Channel: 5, ChannelName: HOECHST 33342, Foreground: {Character: NonFlat, NonFlatness: {Original: 0.78268158, Random: 0.038722602}, Profile: {Coefficients: [[1.2369], [-0.2093, -0.0045], [-0.8884, 0.081, -1.197], [0.1479, 0.2561, 0.4074, -0.2115], [-2.5074, -0.5371, -0.6543, -0.0151, -2.1817]], Dims: [2160, 2160], Origin: [1079.5, 1079.5], Scale: [0.00046296296, 0.00046296296], Type: Polynomial}, Quality: 1.0}, Version: Acapella:2013}</FlatfieldProfile>
</Entry>
<Entry ChannelID="6">
<FlatfieldProfile>{Background: {Character: Null, Profile: {Type: Identity}, Quality: 1}, Channel: 6, ChannelName: Brightfield CP, Foreground: {Character: Flat, Profile: {Type: Identity}, Quality: 1}, Version: Acapella:2013}</FlatfieldProfile>
</Entry>
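To read the channel map off entries like these without scanning the long polynomial profiles by eye, a small regular-expression pass is enough. This is a rough sketch that assumes every `<Entry>` contains a `ChannelName:` field followed by a comma, as in the entries above; the function name `channel_names` is hypothetical, and the sample below is abbreviated, with the profile coefficients replaced by short placeholders.

```python
import re


def channel_names(index_xml: str) -> dict:
    """Map ChannelID -> ChannelName for each <Entry> block in the text."""
    pattern = re.compile(
        r'<Entry ChannelID="(\d+)">.*?ChannelName:\s*([^,}]+)',
        re.DOTALL,
    )
    return {int(cid): name.strip() for cid, name in pattern.findall(index_xml)}


# Abbreviated sample: only the fields needed for the mapping are kept.
sample = """
<Entry ChannelID="1">
<FlatfieldProfile>{Background: {Quality: 1.0}, Channel: 1, ChannelName: Alexa 647, Foreground: {Quality: 1.0}, Version: Acapella:2013}</FlatfieldProfile>
</Entry>
<Entry ChannelID="5">
<FlatfieldProfile>{Background: {Quality: 1.0}, Channel: 5, ChannelName: HOECHST 33342, Foreground: {Quality: 1.0}, Version: Acapella:2013}</FlatfieldProfile>
</Entry>
"""
print(channel_names(sample))  # {1: 'Alexa 647', 5: 'HOECHST 33342'}
```

If an entry lacked a `ChannelName:` field, the non-greedy match would run on into the next entry, so the result is worth checking against the number of `<Entry>` blocks.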
This is the montage for the 48 human and/or samples images. Several samples seem to be blank or have very low cell counts.
Montage for the DAPI channel only, with labels for the row and column it came from. Upon inspecting the cell counts, the images which are mostly blank correspond to the wells which have nonsensically low cell counts.
Cell counts by well
@ruifanp will plot cell counts similar to https://github.com/broadinstitute/neuronal-cell-painting/issues/7#issuecomment-727981591 and we may choose to exclude wells that are more than 1 s.d. from the mean (across all 384 wells)
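A possible way to apply such a cutoff is sketched below. It assumes the rule means "more than 1 s.d. below the plate-wide mean cell count" (the direction is an assumption, since the thread does not say), uses the sample standard deviation, and runs on made-up counts, not the real plate.

```python
from statistics import mean, stdev


def flag_low_count_wells(cell_counts, n_sd=1.0):
    """Return wells whose count is more than n_sd sample s.d. below the plate mean."""
    counts = list(cell_counts.values())
    cutoff = mean(counts) - n_sd * stdev(counts)
    return sorted(well for well, n in cell_counts.items() if n < cutoff)


# Made-up counts for illustration only; F11 stands in for a nearly empty well.
counts = {"A01": 950, "A02": 1000, "A03": 1050, "A04": 1000, "F11": 5}
print(flag_low_count_wells(counts))  # ['F11']
```

With many near-empty wells the mean and s.d. are themselves distorted, so a fixed count threshold may be the more robust choice.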
@mtegtmey @raldanehme:
@ruifanp and I met today; here's the summary
Our primary concern right now is that although we see a very good separation between case and control – reported using both t-tests and logistic regression – we noticed that cell count is a major driver of the difference.
If cell count is definitely a driver, then we should consider repeating this experiment, but we want to be doubly sure.
Here's @ruifanp 's plan
After penning this down, I wondered whether it is at all practical to expect that cell count can be controlled any further, @mtegtmey? We still don't know if cell count is the driver, so we may not need to repeat it after all. But it will be good to know what's possible.
I redid logistic regression, splitting by patient number rather than using a purely random train/test split. Here are the results.
Note that the scores tend to be pretty unstable depending on the random state used for the splitting, so I ran with several random states and chose a representative score. It appears that limiting the data to wells with a cell count over 200 improves the quality of the data and leads to better separation of the classes.
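For reference, a split by patient keeps all replicate wells of one patient on the same side of the train/test boundary, so the classifier is never tested on a donor it has already seen. The sketch below is a minimal pure-Python illustration under assumed inputs (10 hypothetical patients with 8 wells each); it is not the code used for the results here, and `split_by_patient` is a made-up name.

```python
import random


def split_by_patient(patient_ids, test_fraction=0.3, seed=0):
    """Split row indices so that all wells from a patient land on the same side."""
    patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, round(test_fraction * len(patients)))
    test_patients = set(patients[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx


# Illustrative data: 10 hypothetical patients with 8 replicate wells each.
patient_ids = [p for p in range(10) for _ in range(8)]
train_idx, test_idx = split_by_patient(patient_ids, test_fraction=0.3, seed=1)
train_patients = {patient_ids[i] for i in train_idx}
test_patients = {patient_ids[i] for i in test_idx}
print(len(train_patients), len(test_patients))  # 7 3
```

scikit-learn's `GroupShuffleSplit` provides the same kind of grouped split if that library is already in use.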
This is encouraging! (but IIRC you said you expected the 200 threshold not to change this dramatically)
Looking forward to seeing what happens when you regress out cell count
This is the distribution of normalized cell count for controls and deletions. The deletions are shifted slightly to the right, which is a bit surprising since the controls were plated first. It seems that cell count is at best very weakly correlated with control/deletion status. The distribution looks somewhat bimodal, which is interesting, and I'll dig deeper into this.
Below are the features with the highest correlation to the cell count:
I regressed out cell count by taking the residual of each data point with respect to the cell count.
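One common reading of "taking the residual with respect to the cell count" is a per-feature ordinary least-squares fit on cell count, keeping what the fit does not explain. The sketch below assumes that reading (a straight line with intercept, one feature at a time); the exact model used here is not stated in the thread, and the numbers are made up.

```python
def regress_out(feature, cell_count):
    """Return residuals of feature after a least-squares line fit on cell_count."""
    n = len(feature)
    mean_x = sum(cell_count) / n
    mean_y = sum(feature) / n
    sxx = sum((x - mean_x) ** 2 for x in cell_count)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cell_count, feature))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed value minus the value predicted from cell count.
    return [y - (intercept + slope * x) for x, y in zip(cell_count, feature)]


# Made-up values for four wells, for illustration only.
cell_count = [100, 200, 300, 400]
feature = [1.0, 2.1, 2.9, 4.0]
residuals = regress_out(feature, cell_count)
print([round(r, 3) for r in residuals])
```

By construction the residuals have zero mean and zero linear correlation with cell count, which is why a classifier trained on them cannot lean on a linear cell-count effect.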
This is with a random train test split:
This is with splitting based on patient number.
Note that splitting on patient number is not reliable. There are incredible variations in the score based on the random seed used to do the splitting. For example, the range of scores for the F/T/T instance literally went from 0.25 to 0.95 based on luck. I ran each condition several times and tried to pick a representative score.
Looking at the random split table above, we can see that there is not much difference in the scores when using the residuals of the data, suggesting that cell count is not a huge driving factor when there actually are cells and they're imaged correctly. This fits with the cell count distribution plot above. However, the presence of low cell count wells adds huge amounts of noise and faulty data, as evidenced by how the score improves when the low cell count (<100) wells are thrown out.
Thanks for the systematic analysis here!
We are primarily interested in results based on splitting by patient (i.e. donor) id, but it's good to have the random split handy.
Also, given the uncertainty around isogenics, I'll focus on humans only.
And given the issues with wells with low cell count, I'll focus on the cases where you exclude low cell count.
So that brings us down to the first row T/T/na, where you get a score of 0.91 / 0.83 (without and with regressing out cell count).
So focusing on that alone,
@ruifanp got it
(Recap for me that this is the composition https://github.com/broadinstitute/neuronal-cell-painting#dataset-summary)
So for n = 22 per class, 70:30 is about 15:7 per class (x 8 or so replicates, although not all will have 8 replicates because you throw out some because of cell count).
I wonder how much of the variation in test accuracy is driven by the fact that we drop replicates for some donors. But it can't be a whole lot. Hm.
Can you do the following:
If we find that the bouncing around is just because of number of data points, I think we're ok and we needn't have @mtegtmey repeat the experiment. We still need to figure out what's driving the differences, but we will be reasonably satisfied that there's a difference and that any variation we see can be explained.
This is the result on 100 runs with different splits.
Stem
Progenitors
Splitting by cell line instead of random results in significantly lower accuracy and higher standard deviation. Over 100 runs we only get about 2/3 accuracy on average.
The csv files corresponding to the results: patient split, random split
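Summarizing many runs rather than picking one representative score can be sketched as below. This is only an illustrative harness: `score_fn` stands for whatever trains and scores a model for a given split seed, and `fake_score` is a placeholder that draws random numbers, NOT real results from this dataset.

```python
import random
from statistics import mean, stdev


def summarize_runs(score_fn, n_runs=100):
    """Call score_fn(seed) for seeds 0..n_runs-1 and summarize the scores."""
    scores = [score_fn(seed) for seed in range(n_runs)]
    return {
        "mean": mean(scores),
        "sd": stdev(scores),
        "min": min(scores),
        "max": max(scores),
    }


def fake_score(seed):
    # Placeholder only: a random number standing in for a real test accuracy.
    return random.Random(seed).uniform(0.4, 0.9)


summary = summarize_runs(fake_score, n_runs=100)
print(summary)
```

Reporting the mean together with the standard deviation and range makes it easier to compare the patient split with the random split than a single hand-picked score does.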
I wanted to see if there was any way we can plot visually that the controls and deletions have differences. Below are the force directed graphs for the stem and progenitors, with the clusters labelled based on the cell line number.
Stem
You can see clear separation between human and isogenics and within the human samples, the lower numbers (controls) also somewhat separate from the higher numbers (deletions).
Progenitors
The numbers on the right show how many wells (out of 8) were removed due to having bad cell count. In this case, there isn't any visible separation between the classes we're looking for.
Would you guys have 15-30 min for a huddle this week? I want to make sure I understand the data so far and make a concrete decision about the next steps (repeat or no).
@shntnu @ruifanp repeat plate for the NPCs will be imaged tomorrow! Any specific place you'd like me to transfer the images once they're finished?
I am copying it now
aws s3 sync --dryrun /imaging/analysis/stanley/nehme_lab/cellpainting/22q11.2_NPC_8.31.21/BR00127194__2021-09-03T17_06_45-Measurement_2 s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images/BR00127194__2021-09-03T17_06_45-Measurement_2
I had to log in to an interactive node to do this; can't do from login node
@bethac07 This is all set now
s3://imaging-platform/projects/2019_05_28_Neuronal_Cell_Painting/NCP_PROGENITORS_1/images/BR00127194__2021-09-03T17_06_45-Measurement_2
I'm not sure why it barfed twice, but looks good to go.
I believe Pearl's pipelines are here https://imaging-platform.s3.us-east-1.amazonaws.com/projects/2019_05_28_Neuronal_Cell_Painting/workspace/pipelines/NCP_PROGENITORS_1
We want to run both analysis pipelines
Please LMK if there's anything else you need.
Do you WANT the branch analysis run separately or just folded into the larger analysis?
If possible a separate run would be ideal, so we could peek at that data sooner. But if you feel like it makes more sense to just bring it into the larger analysis by all means go ahead that way!
@mtegtmey @bethac07 IIUC the hands-on time for folding it into the larger analysis will be no longer than for a separate branch analysis, so let's run them together
So @rsenft1 and I were running this second batch through, and one thing we noticed is that the cell boundaries don't follow all the way out to the small dim processes. We confirmed that the same seems to be true in the first batch (see screenshot below from NCP_PROGENITORS_1/O-05 site 3). Is this the desired behavior? For this second batch, would you want us to a) match the results of the last batch as closely as possible or b) make it follow all these processes out? We can design a pipeline either way but wanted to get your thoughts on it.
It would be great if it could follow all the processes out; that is exactly what we want to measure. Thanks for flagging this!
Ok, can do.
Do we need to rerun the first batch? Otherwise you may get different results from that batch and this one. Sorry, I haven't been in the loop enough to know whether this is intended to supplement or replace the plate from December.
No need to run the first batch, this was a redo!
With the new settings, this is a more representative field of what we're seeing: green here is actin, magenta is DNA. Does this look like what you were expecting/hoping to see for this cell type? If so, I can pull the trigger on analysis today or Monday.
Goal
Perform Cell Painting on neural progenitor cells to delineate morphological traits which separate patients and controls during early forebrain development
Experimental Design
Expected date for imaging: Done
Dyes: Cell Painting dyes
Cell type: Day 4 progenitors
Plates: 1 x 384-well
Plate layout: this will be identical to the layout used for the cmQTL project, consisting of 48 different lines segmented into 4-well blocks dispersed across the 384-well plate.
Plating parameters: 15k cells/well, fixed 24hrs post-plating (identified in our pilot)
Proposed analysis:
Metadata