Open hanslovsky opened 11 months ago
I will label those as JCP2022_NAN for my own record keeping so I can easily exclude them.
Thanks @hanslovsky for flagging this. QC issues could be the reason.
@shntnu, were these wells left out of well.csv.gz because of QC issues?
I join the metadata from load_data_with_illum.parquet files with the data in well.csv.gz to download images for a plate and also get the associated perturbations. I noticed that there are no labels in well.csv.gz for some of the COMPOUND_EMPTY wells in load_data_with_illum.parquet for plate UL001661 in source_1.

Note: jc.MetadataFiles.{get_well,get_plate}
are convenience functions to read the metadata files at commit 4b24577c2d3228d92177b807fa53fbbc623da1cb. This is not the most recent commit on main, and I will double-check with the most recent commit on main, too.

At first, I thought these might be blank images as described in https://github.com/jump-cellpainting/datasets/issues/61#issuecomment-1499094909, but plate UL001661 is not listed in that comment. I downloaded one DNA channel image for the wells that I identified. I found that the image is not blank, but it is very noisy, with strong artifacts and a visible well edge.

Did these wells not pass QA and should be excluded, and are thus not included in the metadata? Can I extrapolate that to any other well that is not available in well.csv.gz?

Thank you!
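
For readers who want to reproduce the check, here is a minimal sketch of the join described above, written with plain pandas rather than the jc.MetadataFiles helpers. It assumes both tables share the key columns Metadata_Source, Metadata_Plate, and Metadata_Well, and that the perturbation label lives in a Metadata_JCP2022 column. Those column names, the well positions, and the placeholder label values are assumptions for illustration, not taken from the actual files. Small in-memory frames stand in for load_data_with_illum.parquet and well.csv.gz.

```python
import pandas as pd

# Toy stand-in for load_data_with_illum.parquet (values invented for illustration).
load_data = pd.DataFrame({
    "Metadata_Source": ["source_1"] * 3,
    "Metadata_Plate": ["UL001661"] * 3,
    "Metadata_Well": ["A01", "A02", "A03"],
})

# Toy stand-in for well.csv.gz; well A03 is deliberately absent.
# The label values are hypothetical placeholders, not real JCP2022 IDs.
well = pd.DataFrame({
    "Metadata_Source": ["source_1"] * 2,
    "Metadata_Plate": ["UL001661"] * 2,
    "Metadata_Well": ["A01", "A02"],
    "Metadata_JCP2022": ["LABEL_PLACEHOLDER_1", "LABEL_PLACEHOLDER_2"],
})

# Assumed join keys shared by both tables.
keys = ["Metadata_Source", "Metadata_Plate", "Metadata_Well"]

# A left join keeps every imaged well; indicator=True adds a "_merge" column
# that marks rows with no match in the well metadata as "left_only".
merged = load_data.merge(well, on=keys, how="left", indicator=True)

# Wells that have images but no perturbation label.
missing = merged.loc[merged["_merge"] == "left_only", keys]

# Optionally tag the unlabeled wells so they are easy to exclude later,
# using the JCP2022_NAN label mentioned in this thread.
merged["Metadata_JCP2022"] = merged["Metadata_JCP2022"].fillna("JCP2022_NAN")

print(missing)
```

With the real files, the two frames would instead come from pd.read_parquet and pd.read_csv. The indicator column is used here because it separates "no row in well.csv.gz" from "row present but label empty", which a plain null check on the label column would conflate.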