jump-cellpainting / datasets

Images and other data from the JUMP Cell Painting Consortium
BSD 3-Clause "New" or "Revised" License

Unable to replicate CellProfiler pipeline output #92

Closed by gareth-rogers-healx 8 months ago

gareth-rogers-healx commented 8 months ago

I am attempting to use JUMP_production/JUMP_analysis_v3.cppipe to replicate the output for site: source_2, batch: 20210614_Batch_1 and plate: 1053600674 found in cpg0016-jump/source_2/workspace/analysis/20210614_Batch_1/1053600674/analysis/1053600674/.

However, when I run the pipeline, the IdentifyPrimaryObjects module finds only a couple of hundred objects across the whole plate.

My setup is as follows:

I modified the load data CSV to contain brightfield images by propagating the following pattern:

Where I updated the well coordinate to be appropriate for each row. I don't know whether these are correct; however, I don't believe they are causing the issue I'm seeing. This relates to datasets/issues/79.

I modified the load data CSV to use my local paths rather than S3. This speeds up local tests; however, I have also run against S3 with the same result: basically no objects found.
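For anyone repeating this step, the path rewrite can be sketched roughly as below. This is a minimal sketch, not the exact script I used: it assumes the load data CSV follows the CellProfiler LoadData convention of `PathName_<ImageName>` columns, and the bucket prefix, local directory, and sample row are hypothetical placeholders.

```python
import csv
import io


def rewrite_pathnames(rows, old_prefix, new_prefix):
    """Return copies of rows with PathName_* values re-rooted to new_prefix."""
    out = []
    for row in rows:
        new_row = dict(row)
        for key, value in row.items():
            # Only touch path columns that actually start with the old prefix.
            if key.startswith("PathName_") and value.startswith(old_prefix):
                new_row[key] = new_prefix + value[len(old_prefix):]
        out.append(new_row)
    return out


def rewrite_load_data_csv(src_text, old_prefix, new_prefix):
    """Rewrite a load data CSV (given as text) and return the new CSV text."""
    reader = csv.DictReader(io.StringIO(src_text))
    rows = rewrite_pathnames(list(reader), old_prefix, new_prefix)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical one-row example; real load data files have many more columns.
example = (
    "FileName_OrigDNA,PathName_OrigDNA,Metadata_Well\n"
    "img_A01.tiff,s3://example-bucket/images/plate1,A01\n"
)
local = rewrite_load_data_csv(
    example, "s3://example-bucket/images", "/data/jump/images"
)
```

Only the `PathName_*` columns are changed, so file names and metadata columns stay identical between the S3 and local versions of the CSV.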

I believe this is related to the illumination correction files. I have run JUMP_production/JUMP_QC_LoadData_v1.cppipe, and it finds about 1/3 of the objects compared to the published results. As that pipeline generates its own illumination correction files, I did a quick test with the V3 pipeline using OrigDNA rather than CorrBlue as the input to the IdentifyPrimaryObjects module. With that change, the V3 pipeline finds a similar number of objects to the QC pipeline.

I have tried running JUMP_production/JUMP_illum_LoadData_v1.cppipe to generate the illumination correction files locally; however, the produced files are the wrong shape. I get the error:

Error while processing CorrectIlluminationApply: This module requires that the image and illumination function have equal dimensions. The OrigDNA image and IllumDNA illumination function do not ((996, 996) vs (995, 995)). If they are paired correctly you may want to use the Resize or Crop module to make them the same size.

I ran that pipeline with my modified load data file (without illum columns).
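To narrow down which channels are affected, a quick shape comparison can be done outside CellProfiler. The following is a rough sketch only: it works on shape tuples (however the image and illumination arrays are loaded), the helper names are hypothetical, and only the DNA shapes come from the error above; the second pair is made up for illustration.

```python
def shape_difference(image_shape, illum_shape):
    """Return the per-axis difference (image - illum), or None if shapes agree."""
    if len(image_shape) != len(illum_shape):
        raise ValueError(
            "image and illumination function have different numbers of dimensions"
        )
    diff = tuple(a - b for a, b in zip(image_shape, illum_shape))
    return diff if any(diff) else None


def find_mismatches(shape_pairs):
    """Map each channel with mismatched shapes to its per-axis difference."""
    mismatches = {}
    for channel, (image_shape, illum_shape) in shape_pairs.items():
        diff = shape_difference(image_shape, illum_shape)
        if diff is not None:
            mismatches[channel] = diff
    return mismatches


pairs = {
    "DNA": ((996, 996), (995, 995)),  # shapes reported in the error above
    "ER": ((996, 996), (996, 996)),   # hypothetical matching pair
}
mismatches = find_mismatches(pairs)
for channel, diff in mismatches.items():
    print(f"{channel}: image is larger than illumination function by {diff}")
```

A consistent one-pixel difference on every axis would point at a resizing step in the illumination pipeline rather than at mispaired files, though that is an assumption to verify rather than a confirmed cause.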

Our intention is to use the pipeline for our own data and the first step was to replicate the JUMP results to gain confidence with the pipeline and CellProfiler.

Do you have any advice on how to proceed?

gareth-rogers-healx commented 8 months ago

These questions have been answered in a CellProfiler forum post here.