AllenInstitute / ophys_nway_matching


Registration follow-up #45

Closed djkapner closed 3 years ago

djkapner commented 4 years ago

In the last few months, we deployed an update to nway cell matching that improved registration. At that time, there were still some known registration failures, but there was enough uncertainty in other possible QC issues, and a large enough number of improvements, that we deployed the change anyway.

In discussion with @matchings it is time to look more closely at some of these cases and come up with a strategy for handling them.

Tasks:

Bonus:

Validation:

djkapner commented 4 years ago

container_1018028254.png

djkapner commented 3 years ago

@matchings I have made what I hope is a more robust meta-registration. It tries a few different alignment strategies and then chooses the best one based on a structural similarity metric. Different image pairs end up using different strategies for best alignment, which is consistent with my struggle to make a one-size-fits-all single strategy. This seems to have eliminated many of the cases where obvious double cells are visible in projection overlays after registration.
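For concreteness, here is a minimal sketch of that selection idea (not the actual module code; the strategy callables and their interface are hypothetical), assuming scikit-image is available for the SSIM computation:

import numpy as np
from skimage.metrics import structural_similarity

def pick_best_registration(fixed, moving, strategies):
    # strategies: dict mapping a name to a callable that warps `moving`
    # onto `fixed` and returns the warped image (hypothetical interface).
    # The winner is the strategy with the highest SSIM against the fixed
    # projection; an identity "strategy" acts as the do-nothing baseline.
    best_name, best_score = None, -np.inf
    data_range = float(fixed.max() - fixed.min())
    for name, register in strategies.items():
        warped = register(fixed, moving)
        score = structural_similarity(fixed, warped, data_range=data_range)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score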

I selected 96 containers by this query:

SELECT vebc.id as container_id, (wkf.storage_directory || wkf.filename) as nway_output
FROM visual_behavior_experiment_containers as vebc
JOIN projects on projects.id=vebc.project_id
JOIN visual_behavior_container_runs as vbcr on vbcr.visual_behavior_experiment_container_id=vebc.id
JOIN well_known_files as wkf on wkf.attachable_id=vbcr.id
JOIN well_known_file_types as wkft on wkft.id=wkf.well_known_file_type_id
WHERE vebc.workflow_state in ('completed')
AND vbcr.current=true
AND wkft.name='OphysNwayCellMatchingOutput'
AND projects.code in ('VisualBehavior','VisualBehaviorTask1B','VisualBehaviorMultiscope','VisualBehaviorMultiscope4areasx2d')

and then ran the meta-registration-based nway matching on these containers. The results are available here:

/allen/aibs/informatics/danielk/nway_test/compare_registration2

There is a subdirectory for each of the 96 containers, and each one contains an output.json and a pdf plot I used to manually check the overlays (B/W; I don't have the same R/G channel overlay you have). All the pdfs are also concatenated into a rather large pdf:

/allen/aibs/informatics/danielk/nway_test/compare_registration2/summary.pdf

My by-container annotations of the results are here: https://alleninstitute.sharepoint.com/:x:/s/Pika/ERS8py3Y8KpPilbIUFGvGyUB7937DEu9qX1a_KGGyGv-6g?e=BI6ZyJ

Of the 96 containers, I think there are 13 where something is not quite right. In some cases, there is one experiment for which it is hard for me to even see how it aligns. In others, the projections themselves are pretty blurry.

If you can, it would be great to:

NB: once we agree this is looking good, it will likely take a day or two to clean up the code and PR/merge it. It is a pretty big change to the codebase.

matchings commented 3 years ago

@djkapner These are appropriate containers to look at; they are all in our current set being used for analysis.

It's hard for me to evaluate the plots with the B/W overlay. Is it a huge pain to generate these same plots with an R/G overlay instead?

Still, from what I can tell, things look pretty good with this strategy of choosing the best registration method based on improvement in SSIM. I had been hesitant to use SSIM on the registered output as a QC metric because the absolute value is pretty variable across cre lines and imaging conditions, but using the difference in SSIM between the pre-registration and post-registration pairs is smart. Even challenging cases appear to have registered.
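In other words, the quantity I'd trust is the improvement rather than the absolute value, something along these lines (just a sketch; the function and variable names are mine):

from skimage.metrics import structural_similarity as ssim

def ssim_improvement(fixed, moving_raw, moving_registered):
    # Difference in SSIM between the pre-registration pair and the
    # post-registration pair; a successful registration should give a
    # clearly positive value even when the absolute SSIM is low.
    data_range = float(fixed.max() - fixed.min())
    before = ssim(fixed, moving_raw, data_range=data_range)
    after = ssim(fixed, moving_registered, data_range=data_range)
    return after - before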

I would like to run my own code to get the same plots I always generate, but unfortunately all of my validation code is built around the content of the 'OphysNwayCellMatchingStrategy' folder on lims for each container, including loading the registered images directly, and it would take me some time to reformat it to work with just the output.json file provided in the off-pipeline results you shared.

Alternatively, we could just deploy this fix to the pipeline if we are reasonably confident that it is an improvement (from what I can glean from the B/W overlay, it looks reasonable, but it's hard to be super confident without the R/G plots that I'm used to looking at). If you deploy it, I can run all my normal plotting code on it and do another round of verification. But you said it's a few days of work to implement, and I don't want you to have to do that work unless we think this is a good way to go. Personally, I'm comfortable with going forward with it from what I have seen, and it's not going to disrupt anyone's analysis, because no one is using the matching output currently (since we are working on it). But I will leave it up to you.

djkapner commented 3 years ago

~restarted~ reran matching on these containers, preserving all the outputs

djkapner commented 3 years ago

Thinking about this one: 1018028034. Agreed, the pair with moving expt 965267337 looks off to me. It seems like, even now that it is more robust, we can still find some places where it isn't succeeding. I see 2 strategies we can pursue (or both): 1) add another candidate transform series. Right now, the meta-registration tries 5 different registration strategies (1 of them being just an identity transform):

                tf.TransformList(transforms=[Identity()]),
                tf.TransformList(transforms=[crop(), ecc()]),
                tf.TransformList(transforms=[crop(), contrast(), ecc()]),
                tf.TransformList(transforms=[crop(), contrast(), phase()]),
                tf.TransformList(transforms=[crop(), contrast(), phase(), ecc()])

We can drill down on failures and, when we find a success, keep adding to this list of candidate strategies. 2) we can add the ability to specify a transform manually. A human can work on the pair offline, and then we can somehow provide the algorithm with a transform for that pair.
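As a rough illustration of option 2 (this is not an existing option in the module; the function and its arguments are hypothetical), a human-supplied 3x3 affine could be applied to the moving projection with OpenCV before matching:

import numpy as np
import cv2

def apply_manual_transform(moving, matrix_3x3):
    # Apply a manually determined affine (3x3 homogeneous form, as a
    # human might derive offline for a stubborn pair) to the moving
    # average projection.
    affine_2x3 = np.asarray(matrix_3x3, dtype=float)[:2, :]
    height, width = moving.shape
    return cv2.warpAffine(moving, affine_2x3, (width, height),
                          flags=cv2.INTER_LINEAR)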

djkapner commented 3 years ago

@matchings Another note about the list of registration strategies above. I included "Identity" as a candidate so that, if it ends up being the best alignment (which is very unlikely), we can count this as a failure to register. I am adding the best strategy to the output.json, like so:

  "pairwise_results": [
    {
      "fixed_experiment": 866518320,
      "transform": {
        "best_registration": [
          "Crop",
          "CLAHE",
          "PhaseCorrelate",
          "ECC"
        ],
        "best_score": 0.4637262349566246,
        "properties": {
          "scale": [
            0.9999999796521901,
            0.9999999796521901
          ],
          "shear": 0.0,
          "translation": [
            7.652287412690555,
            30.338251778075282
          ],
          "rotation": 0.0009554992332022883
        },
        "matrix": [
          [
            0.9999995231628418,
            -0.0009554990683682263,
            7.652287412690555
          ],
          [
            0.0009554990683682263,
            0.9999995231628418,
            30.338251778075282
          ],
          [
            0.0,
            0.0,
            1.0
          ]
        ]

In the case "best_registration": ["Identity"], we can either raise an exception and have the whole job fail (no container results available until fixed), or just keep the output as noted and have some post-process inspection look for "best_registration": ["Identity"] (partial container results available).
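A sketch of what that post-process inspection could look like, assuming the per-container directory layout of the compare_registration2 test outputs and the key names in the excerpt above:

import json
from pathlib import Path

def find_identity_pairs(results_root):
    # Flag any pairwise registration whose best strategy was Identity,
    # i.e. no candidate transform improved on doing nothing.
    flagged = []
    for output_path in Path(results_root).glob("*/output.json"):
        with open(output_path) as f:
            output = json.load(f)
        for pair in output.get("pairwise_results", []):
            if pair["transform"]["best_registration"] == ["Identity"]:
                flagged.append((output_path.parent.name,
                                pair["fixed_experiment"]))
    return flagged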

matchings commented 3 years ago

Having the job fail and not provide results until fixed is probably better, as it will prevent people from accidentally doing analysis on incorrect outputs.

matchings commented 3 years ago

Regarding strategies for problem experiments, I think it would be good to identify just how many problem cases / edge cases there are before investing additional work. It would be great if we had a metric for this. The cases returning "Identity" could be identified and evaluated. We could also use the fraction of matched cells to identify potential problem cases and then do a visual inspection on those (i.e. if the fraction of matched cells is lower than some expected threshold, we flag those cases for visual review). We are planning to have a QCathon in a few weeks (likely the first week of December), and this will be an opportunity to get a lot more eyes on the data. If that is too far away, I can try to do some of this work myself sooner.

To this end, would it be possible to save the number and fraction of matched cells per registered pair in the output json? Of course we will also want to consider these same values after filtering for valid/invalid cells, so another step will be needed, but having the raw numbers for the full matching output would be a useful first pass.
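Roughly the kind of check I'm imagining once those numbers exist (field names here are hypothetical, just to make the idea concrete; the 0.5 threshold is a placeholder):

def flag_low_match_pairs(pairwise_results, min_fraction=0.5):
    # Hypothetical per-pair fields "n_matched" and "n_cells_fixed";
    # pairs with a matched fraction below the threshold get flagged
    # for visual review.
    flagged = []
    for pair in pairwise_results:
        fraction = pair["n_matched"] / pair["n_cells_fixed"]
        if fraction < min_fraction:
            flagged.append((pair["fixed_experiment"], fraction))
    return flagged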

djkapner commented 3 years ago

For these containers, there were 1976 image pairs registered. The methods selected as the best registration broke down as follows:

{
  "Crop CLAHE ECC": 643,
  "Crop ECC": 505,
  "Crop CLAHE PhaseCorrelate": 619,
  "Crop CLAHE PhaseCorrelate ECC": 203,
  "Identity": 6
}
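For reference, a breakdown like this can be tallied in a few lines from the pairwise_results entries pooled across the per-container output.json files (a sketch, reusing the key names shown earlier):

from collections import Counter

def strategy_breakdown(all_pairwise_results):
    # all_pairwise_results: pairwise_results entries pooled across
    # the per-container output.json files.
    return Counter(
        " ".join(pair["transform"]["best_registration"])
        for pair in all_pairwise_results
    )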

The six identity ones were:

container fixed_experiment moving_experiment
1018027781 992084625 1001576099
1018027781 993369858 1000744365
1018027781 993369858 1001576099
1018027781 993619366 1001576099
1018027781 994082673 1000744365
1018027781 994082673 1001576099

In the annotation spreadsheet, I had noted this container as "projections are blurry". @matchings can you take a look at that container specifically and mull whether having that fail is the desired outcome?

matchings commented 3 years ago

Yeah, that container looks pretty crappy. I don't think it's worth pursuing a solution for it; I will just make a note that it should fail QC and not be included in the release. Here are the average images and max projection.

(attached images: container_1018027781 average images and max projection)

matchings commented 3 years ago

@djkapner I compared the original registration overlay images with ones using the new results from your test fix in 'compare_registration2', and I can safely say that the new version is a major improvement over the previous one. In all the cases I looked at, the registration was much cleaner. This was true for the clearly problematic ones (with a double-cell appearance in the overlay), and even for the ones that already looked pretty good with the original version, the new one cleaned them up. I compiled some examples into an excel file that I can send you independently.

Here's my official thumbs up to implement this in production. It looks great.

matchings commented 3 years ago

Code to generate registration overlay plots: https://github.com/AllenInstitute/visual_behavior_analysis/blob/dev/notebooks/201109_cell_matching_qc_plots.ipynb

djkapner commented 3 years ago

Closing the loop on some of the container annotations I had made where the registration did not seem great (see spreadsheet link). From Marina:

ok i looked through them and the ones you noted either looked fine to me (as in the registration appears to have worked to the best of its ability) or had data problems and it wasn't surprising that it didn't register well

This means that in all cases where success was expected, the registration is succeeding. We're proceeding with the rollout.

djkapner commented 3 years ago

@matchings merged and deployed!

Now either I or @wbwakeman needs to rerun the VB containers. Anyone remember where this list is?

wbwakeman commented 3 years ago

@djkapner @matchings I will restart these jobs for all containers that could be destined for the release. I may also take the opportunity to do a refresh on cell_specimen_ids (recognizing that these will still change when we do a final run prior to release).

matchings commented 3 years ago

Awesome, thanks @wbwakeman and @djkapner !!!