Changing Tag Flag/Fail status

samiamseid commented 1 year ago

Metric / Plot Impacted

Data Stream Timing and Sync (Automated) (Stimulus Signal, Long Frames)
Session Experiments FOV Integrity (extreme cross-talk present)
Field of View Matching (FOV does not match parent FOV)

Update Motivation A few tags dont accurately reflect the QC verdict for our current projects. Some flag when no flag is necessary, and others flag when they should mark the datastream as failed. Updating these will help make the QC report summary and the QC status section accurately reflect the actual QC of sessions and experiments.

Requested Feature/Solution General description: Changing level of flags

Data Stream Timing and Sync (Automated) (Stimulus Signal, Long Frames) This metric flags for nearly ever session, since the threshold is currently set to 1 frame. This means the flag gets ignored. If we moved the flag threshold up to something like 20 frames, it would identify when there is an abnormal increase in long frames, but not a failworthy amount (which is 60).
Session Experiments FOV Integrity (extreme cross-talk present) Until there is some post-processing QC process that can validate the quality of an imaging plane after de-crosstalk has occurred, this should fail a session, as it is a failure to properly set up the experiment.
Field of View Matching (FOV does not match parent FOV) This should mark a session as failed for every project except for VIPaxonal, as cell matching is a requirement for an experiment to be valid for a container. If someday there is an automatic method for separating experiments that fail cell matching into their own containers we could revisit this, but for now marking an experiment as "Failed" is the only way of making sure it is not included in analysis for a container.

Scope all projects

DowntonCrabby commented 1 year ago

I am going to hold off on this work until we've had a chance to discuss it with the scientists a bit more. I think this is a perfect topic for the Monday ops meeting. Sean and I can then implement whatever comes out of that meeting.

I am going to reassign the ticket to @nataliaorlova @matchings and @pgroblewski to weigh in.

matchings commented 1 year ago

A few questions for @samiamseid and my perspective on some of the issues:

1) for Data Stream Timing and Sync, is the flag you are talking about just for long frames or also for dropped frames? I am fine with changing the threshold for long frames to be greater than 1 (maybe 10?), but the threshold for dropped frames should be 1, since any mismatch in frame # across data streams compromises our ability to align them.

2) for the “extreme crosstalk” flag, you said “this should fail a session, as it is a failure to properly set up the experiment” - can you clarify this a bit? What about the experiment setup can fail and lead to crosstalk? I was under the impression that we do not yet understand the cause of the extreme crosstalk, so if there is something about the experimental procedure that influences whether it occurs or not, that would be super useful to know!

Also on this point, how are you checking for crosstalk as part of the QC process?

3) I fully agree that “FOV does not match parent FOV” should always fail (except for axonal project or any that explicitly aren’t trying to match). I’m surprised that wasn’t already the case.

DowntonCrabby commented 1 year ago

@matchings a note on 3- we had at some meeting determined that these shouldn't fail but should be a flag and removed from the container. That way the data can still be released it just isn't associated with that specific container.

However, I agree with @samiamseid that since we do not currently have the infrastructure in place to automatically kick the experiment out of the container if this flag is present so until that infrastructure is in place we can switch to failing these.

matchings commented 1 year ago

Oh right, I do remember that meeting. And I agree with what you said, until we have infrastructure to removed things from containers based on flags, we should keep failing them as we have been. But this means that we need a mechanism to “rescue” experiments that failed for FOV matching (and only FOV matching) if we want to release them. I think this should be up to the relevant project leads, based on the goals of their project, and would need to be considered at the time of release.

samiamseid commented 1 year ago

This is for long frames (interframe intervals >0.025s), specifically for the stimulus timing. Even setting the flag threshold to 10 would stop this metric from flagging every session, since its usually between 1 to 4 long frames
When the operator is setting up the session, there is a step where we have to align the laser pulse timing and the demultiplexing. Without getting too in-depth about it, there are parts of crosstalk we can control, and parts that we cant control. I suppose we shouldn't always assume that severe crosstalk is due to a procedure error, but there does need to be a fail level flag for crosstalk when we identify a problem using that module.

Identifying crosstalk involves comparing the signal within paired planes. Each plane should have unique signal. If both planes look the same, then something went wrong and the data should fail.

matchings commented 1 year ago

I am all for failing sessions when two paired planes look the same like in the image you shared. And for increasing long frames threshold to 10 (kind of arbitrary but that’s ok).

I would definitely like to learn more about the step where you align the laser pulse timing and the demultiplexing, that seems super important. I wonder whether something about this process changed around the same time that we started seeing more severe crosstalk in the data.

I think @jkim0731 might be able to provide a date estimate of when we started seeing more severe crosstalk (seems to be correlated with the appearance of nonrigid motion as well), and maybe @nataliaorlova could give some insight as to whether anything about the rig or setup process may have changed in the same time frame.

The cause of decrosstalk is a separate conversation though so we should address this in a separate ticket.

I think we are in agreement about

Increasing threshold for long frames
Failing when crosstalk is so bad that the paired planes look nearly identical
Failing for FOV matching (and coming up with a way to rescue these if there were no other QC issues and the project leads want to keep it)

DowntonCrabby commented 1 year ago

[x] increase visual stim long frame threshold to greater than/equal to 10
[x] change qc outcome from flag to fail for paired_planes_cross-talk controlled language tag
[x] change qc outcome from flag to fail for FOV does not match parent FOV controlled language tag

AllenInstitute / brain_observatory_qc

Changing Tag Flag/Fail status #263