Closed shntnu closed 3 years ago
I have uploaded sample images here (13 GB).
But you may want to resample to get more / different images.
@jatinarora-upmc The file `data/gwas_images.csv`, generated by `7.select_images_to_print.md`, has the information you need about cell lines.
In #36 I updated the code so that all this information is readily available in two CSV files. For example, in this row from `sample_images.csv`:
Metadata_Plate | Metadata_Well | Metadata_Channel | filename | URL |
---|---|---|---|---|
cmqtlpl1.5-31-2019-mt | A12 | URL_OrigDNA | r01c12f05p01-ch5sk1fk1fl1.tiff | https://s3.amazonaws.com/imaging-platform/projects/2018_06_05_cmQTL/2019_06_10_Batch3/images/cmqtlpl1.5-31-2019-mt__2019-06-10T16_42_36-Measurement2/Images/r01c12f05p01-ch5sk1fk1fl1.tiff |
we see that the file `cmqtlpl1.5-31-2019-mt/r01c12f05p01-ch5sk1fk1fl1.tiff` comes from plate `cmqtlpl1.5-31-2019-mt` and well `A12`.
We can join with `sample_images_metadata.csv` to figure out the metadata corresponding to plate `cmqtlpl1.5-31-2019-mt` and well `A12`:
Metadata_Plate | Metadata_Well | Metadata_Row | Metadata_FieldID | Metadata_Assay_Plate_Barcode | Metadata_Plate_Map_Name | Metadata_well_position | Metadata_plating_density | Metadata_line_ID |
---|---|---|---|---|---|---|---|---|
cmqtlpl1.5-31-2019-mt | A12 | 1 | 5 | cmqtlpl1.5-31-2019-mt | cmQTL_plate1_5.31.2019 | A12 | 10000 | 34 |
specifically, that the cell line ID is 34.
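For anyone working from the command line, the join described above can be sketched with awk. This is only a minimal sketch: it assumes both files are plain comma-separated text with a header row and no quoted commas, and that the column names match the tables shown above. The function name `join_line_ids` is hypothetical and not part of the repository.

```bash
# Hypothetical helper (not part of the repository): print
# plate,well,filename,line_ID for every row of the images CSV.
# Usage: join_line_ids data/sample_images.csv data/sample_images_metadata.csv
# Assumes simple CSVs with a header row and no quoted commas.
join_line_ids() {
  awk -F',' '
    # Header row of each file: map column names to column numbers.
    FNR == 1 { split("", col); for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Metadata file (read first): remember the line ID of each plate/well pair.
    NR == FNR {
      line_id[$(col["Metadata_Plate"]) "," $(col["Metadata_Well"])] = $(col["Metadata_line_ID"])
      next
    }
    # Images file (read second): emit the matching line ID, or NA if none.
    {
      key = $(col["Metadata_Plate"]) "," $(col["Metadata_Well"])
      print key "," $(col["filename"]) "," ((key in line_id) ? line_id[key] : "NA")
    }
  ' "$2" "$1"
}
```

Looking columns up by header name rather than by position keeps the sketch working if the column order differs between the two files.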
I think there is an issue here. Many image identifiers map to more than one cell line. For example, the image r14c11f05p01-ch5sk1fk1fl1.tiff maps to two cell lines on two plates (98 on BR00106709, and 236 on BR00107338). Could you check, or am I looking at this the wrong way?
The file names are identical across all plates.
Use `Metadata_Plate` together with `filename` to identify an image uniquely.
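One way to confirm this on your own copy is to count how many plates each file name appears on. The sketch below assumes `sample_images.csv` is plain comma-separated text with the `Metadata_Plate` and `filename` columns shown earlier and no quoted commas; the function name `filenames_on_multiple_plates` is hypothetical.

```bash
# Hypothetical helper: list each file name that occurs on more than one plate,
# followed by the number of distinct plates it occurs on.
# Usage: filenames_on_multiple_plates data/sample_images.csv
filenames_on_multiple_plates() {
  awk -F',' '
    # Header row: map column names to column numbers.
    FNR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    {
      plate = $(col["Metadata_Plate"]); file = $(col["filename"])
      # Count each (plate, filename) pair only once.
      if (!((plate, file) in seen)) { seen[plate, file] = 1; plates[file]++ }
    }
    END { for (f in plates) if (plates[f] > 1) print f "," plates[f] }
  ' "$1" | sort
}
```

Any file name listed here is ambiguous on its own, which is why the plate has to be part of the key.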
Does that help?
@jatinarora-upmc I'm following up on your Slack message here. Can you review this thread and LMK if you are able to figure out how to get example images for a cell line ID?
@shntnu thanks very much for reminding me of this thread. Does the file `sample_images_metadata.csv` cover all images across all plates, i.e. the ones at the link you put in the second message here (13 GB)?
The notebook referred to in https://github.com/broadinstitute/cmQTL/issues/35#issue-591953618 was used to produce `sample_images_metadata.csv` and `sample_images.csv`. It samples one well per cell line and then a single, fixed field of view (a.k.a. site) from each well (it always picks `Metadata_Site` = 5).
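In the example row earlier in this thread, the file name `r01c12f05p01-ch5sk1fk1fl1.tiff` goes with row 1, well A12 and field 5, so the name itself appears to encode row, column and field. The sketch below decodes a file name under that assumption. The pattern is inferred from the examples in this thread rather than confirmed by the notebook, the zero-padded well format (e.g. `A01`) is also an assumption, and the function name `parse_image_filename` is hypothetical.

```bash
# Hypothetical helper: decode a file name such as
# r01c12f05p01-ch5sk1fk1fl1.tiff into "well,field,channel".
# ASSUMPTION: names follow rRRcCCfFFpPP-chN..., with RR = plate row,
# CC = plate column, FF = field of view (site) and N = channel number.
parse_image_filename() {
  printf '%s\n' "$1" | awk '
    match($0, /^r[0-9][0-9]c[0-9][0-9]f[0-9][0-9]p[0-9][0-9]-ch[0-9]+/) {
      row = substr($0, 2, 2) + 0
      col = substr($0, 5, 2) + 0
      field = substr($0, 8, 2) + 0
      channel = substr($0, 16, RLENGTH - 15)
      # Rows A to P cover a 384-well plate; anything else is rejected.
      if (row < 1 || row > 16) exit 1
      printf "%s%02d,%d,%s\n", substr("ABCDEFGHIJKLMNOP", row, 1), col, field, channel
      found = 1
    }
    END { exit !found }
  '
}
```

For example, `parse_image_filename r01c12f05p01-ch5sk1fk1fl1.tiff` prints `A12,5,5`, matching the well and field in the metadata row shown earlier.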
Thanks @shntnu. Would it be possible to generate another set of images with a different Metadata_Site, say 3? I guess the images are stored on your side, so you might have to re-run the script?
Now available via #45
@jatinarora-upmc this notebook has details on how to download; I am copying it below. The `sample_images.csv` file referred to below is produced by that notebook.
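```bash
# Directory that will hold the downloaded images, one subfolder per plate
IMAGE_DIR=/tmp/cmqtl
mkdir -p "$IMAGE_DIR"

# List the unique plate names (column 1), skipping the header row
cut -d"," -f1 data/sample_images.csv | grep -v Metadata_Plate | sort -u > /tmp/plates.txt

# Create one subfolder per plate
parallel -a /tmp/plates.txt --no-run-if-empty mkdir -p "$IMAGE_DIR"/{}

# Download each image: {1} = Metadata_Plate, {4} = filename, {5} = URL
parallel \
  --header ".*\n" \
  -C "," \
  -a data/sample_images.csv \
  --eta \
  --joblog "${IMAGE_DIR}/download.log" \
  wget -q -O "${IMAGE_DIR}/{1}/{4}" {5}
```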
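Because `wget -q` is silent, it can be worth checking the job log after the run. The sketch below assumes the standard tab-separated GNU parallel job log, in which the seventh column is `Exitval` and the ninth is `Command`, and it also looks for zero-byte files, since `wget -O` creates the output file even when the download fails. Both function names are hypothetical.

```bash
# Hypothetical helper: print the commands in a GNU parallel job log
# whose exit value (column 7, Exitval) was non-zero.
# Usage: failed_downloads "${IMAGE_DIR}/download.log"
failed_downloads() {
  awk -F'\t' 'NR > 1 && $7 != 0 { print $9 }' "$1"
}

# Hypothetical helper: list zero-byte files under a directory, which is
# what a failed "wget -O" typically leaves behind.
# Usage: empty_downloads "$IMAGE_DIR"
empty_downloads() {
  find "$1" -type f -size 0
}
```

If either function prints anything, those images need to be fetched again. GNU parallel also has a `--retry-failed` option that works from the same job log; check `man parallel` for how it behaves in your version.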
@shntnu I saw this code previously, but I could not figure out where the new `sample_images.csv` file is. #45 redirects me to #35, and I got lost in a circle. Sorry to bug you again; I am not used to GitHub at all, so I think a direct link to `sample_images.csv` would be very helpful.
Sure thing. This is the file https://github.com/broadinstitute/cmQTL/blob/bcef95625d964d10ad8d81e31b453ab21f09f969/1.profile-cell-lines/data/sample_images.csv
In that snippet, I had a relative path to it, `data/sample_images.csv`, because the notebook is in the folder `1.profile-cell-lines/`.
all set now, thanks so much @shntnu
@shntnu I am going through the comments I got in today's meeting. May I ask for the images for plate 7 also?
I have updated `sample_images.csv` to include the new version of plate 7: https://github.com/broadinstitute/cmQTL/blob/master/1.profile-cell-lines/data/sample_images.csv
@jatinarora-upmc see https://github.com/broadinstitute/cmQTL/issues/35#issuecomment-658963662 for what to do next (everything is the same as before, except that I have now replaced the old plate 7 images with the new ones)
@shntnu it seems there are fewer lines in this updated `sample_images.csv`, and there is no plate cmQTLplate7-7-22-20 anywhere in the Metadata_Plate column. Could you please check?
@jatinarora-upmc Now fixed in #60, which updated https://github.com/broadinstitute/cmQTL/blob/master/1.profile-cell-lines/data/sample_images.csv
@jatinarora-upmc @sasgari The notebook `1.profile-cell-lines/7.select_images_to_print.Rmd` shows how to download sample images. Have a look and LMK if you have any questions.