Closed wongdanr closed 2 years ago
@gwaygenomics Has someone (possibly past me) checked that the stain to numeric ID mapping you have there is true for every plate in every batch? Because those assignments can and do change.
I don't recall if anyone has checked for consistency! Thanks for noting here.
Would you say that the xml files are the source of truth?
Yes, the xml files are the source of truth
Sorry @bethac07, so does this channel mapping not apply to all LINCS images?
ch01 - HOECHST 33342 DNA
ch02 - Alexa 488 ER
ch03 - 488 long RNA
ch04 - Alexa 568 AGP
ch05 - Alexa 647 Mito
That mapping may or may not; I don't know. I have no evidence that it does not, but I do know we have occasionally had cases where the channel order changes between plates within a batch (and certainly across batches). As far as I can tell, nobody has ever confirmed, in anything documented, whether this is consistent.
oh I see ok thanks for letting me know. Will those xml files you reference have the channel mappings per batch?
The xml files have the mappings, but as I stated, I would not trust them at a batch level, only at a plate level.
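To make the plate-level versus batch-level point concrete, here is a minimal sketch of how per-plate mappings could be compared once they have been read from each plate's xml file. The function name `group_plates_by_mapping`, the plate names, and the shortened two-channel mappings are all hypothetical and for illustration only; nothing here comes from an existing tool.

```python
from collections import defaultdict


def group_plates_by_mapping(plate_mappings):
    """Group plate names by their channel ID -> stain name mapping.

    plate_mappings: dict of plate name -> {channel_id: stain_name}.
    Returns a dict whose keys are the distinct mappings (as sorted tuples
    of (channel_id, stain_name) pairs) and whose values are sorted lists
    of the plates that use that mapping.
    """
    groups = defaultdict(list)
    for plate, mapping in plate_mappings.items():
        # Dicts are not hashable, so use a sorted tuple of items as the key.
        key = tuple(sorted(mapping.items()))
        groups[key].append(plate)
    return {key: sorted(plates) for key, plates in groups.items()}


# Hypothetical plates: plate_C has its first two channels swapped.
example = {
    "plate_A": {1: "HOECHST 33342", 2: "Alexa 488"},
    "plate_B": {1: "HOECHST 33342", 2: "Alexa 488"},
    "plate_C": {1: "Alexa 488", 2: "HOECHST 33342"},
}

groups = group_plates_by_mapping(example)
for mapping, plates in groups.items():
    print(dict(mapping), plates)
```

If the result has more than one group, the batch does not share a single channel order, and the mapping has to be looked up plate by plate.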
Thanks @bethac07, I just parsed through all of the xml files and confirmed that the mapping that @gwaygenomics gave does hold for all plates (at least the ones under s3://cellpainting-gallery/lincs/broad/images/2016_04_01_a549_48hr_batch1/images/). Thank you!
Wonderful! Thanks for following up with this @wongdanr
If possible, can you paste the code snippet you used to confirm? (It is likely to help future users who stumble upon this issue)
sure thing @gwaybio:
import os

def testChannelOrderThroughLINCSXMLFiles():
    """
    Make sure that our lincs channel image assumption is correct:
    {1: "HOECHST 33342", 2:"Alexa 488", 3:"488 long", 4:"Alexa 568", 5:"Alexa 647"}
    Parses through xml files in /home/wongd26/workspace/profiler/lincs_xml_files/xml/
    (aws s3 cp --recursive --no-sign-request s3://cellpainting-gallery/lincs/broad/images/2016_04_01_a549_48hr_batch1/images/ xml/ --exclude "*" --include "*.xml")
    """
    directory = "/home/wongd26/workspace/profiler/lincs_xml_files/xml/"
    channel_map = {1: "HOECHST 33342", 2: "Alexa 488", 3: "488 long", 4: "Alexa 568", 5: "Alexa 647"}
    # Each sub-directory corresponds to one plate.
    for sub_dir in os.listdir(directory):
        print("subdir", sub_dir)
        with open(os.path.join(directory, sub_dir, "Images", "Index.idx.xml"), "r") as xmlfile:
            lines = xmlfile.readlines()
        for line in lines:
            if "<ChannelID>" in line:
                # Remember the most recent channel ID (11 == len("<ChannelID>")).
                channel = int(line[line.find("<ChannelID>") + 11: line.rfind("<")])
            if "<ChannelName>" in line:
                # The name is checked against the channel ID seen just before it (13 == len("<ChannelName>")).
                name = line[line.find("<ChannelName>") + 13: line.rfind("<")]
                assert channel_map[channel] == name
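For anyone who would rather not rely on string slicing, the same check could be sketched with the standard library's XML parser. This is only a sketch under one assumption that the thread does not confirm: that `ChannelID` and `ChannelName` appear as sibling children of the same parent element. The helper names and the tiny `SAMPLE` document, including its namespace URL, are made up for illustration and are not a real Index.idx.xml file.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def channel_mapping_from_xml(xml_text):
    """Return {channel_id: channel_name} found in an XML document.

    Assumption: ChannelID and ChannelName are siblings under one parent.
    Raises ValueError if one channel ID is given two different names.
    """
    root = ET.fromstring(xml_text)
    mapping = {}
    for element in root.iter():
        # Collect the text of each direct child, keyed by tag without namespace.
        fields = {local_name(child.tag): (child.text or "").strip() for child in element}
        if "ChannelID" in fields and "ChannelName" in fields:
            channel_id = int(fields["ChannelID"])
            name = fields["ChannelName"]
            # setdefault stores the first name seen and returns it afterwards.
            if mapping.setdefault(channel_id, name) != name:
                raise ValueError(f"channel {channel_id} has conflicting names")
    return mapping


# Hypothetical, heavily simplified stand-in for a plate's xml file.
SAMPLE = """<Root xmlns="http://example.com/hypothetical">
  <Images>
    <Image><ChannelID>1</ChannelID><ChannelName>HOECHST 33342</ChannelName></Image>
    <Image><ChannelID>2</ChannelID><ChannelName>Alexa 488</ChannelName></Image>
  </Images>
</Root>"""

print(channel_mapping_from_xml(SAMPLE))
```

The returned dictionary can then be compared against the expected mapping for each plate, and a conflict inside a single file raises an error instead of passing silently.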
awesome. thanks @wongdanr
Hello! Where can I find which markers/stains each of the 5 image channels corresponds to?