AllenInstitute / ipfx

computes intrinsic cell features from intracellular electrophysiology data
https://ipfx.readthedocs.io/en/latest/
Other
26 stars 36 forks source link

Cells failing QC due to "electrode_0_pa missing value" #509

Open ru57y34nn opened 3 years ago

ru57y34nn commented 3 years ago

Describe the use case that is addressed by this feature. There are currently nearly 700 IVSCC pipeline experiments that are failing cell QC with the fail tag "electrode_0_pa missing value". This fail tag is referring to one of our initial voltage clamp QC sweeps that IPFX is unable to find (stimulus name: ‘EXTPINBATH’) because the incorrect sweep was being ran at this point (stimulus name: ‘EXTPSMOKET’). This is actually not that big of a deal because these two stimulus waves are actually the same wave, but with different names to make it easy to distinguish what point we are at during the experiment simply by stimulus name, which is how IPFX identifies these sweeps.

Describe the solution you'd like If the ‘EXTPINBATH’ sweep is not found then it needs to look for the ‘EXTPSMOKET’ sweeps and do a quick resistance calculation to ensure that the rig user actually had the pipette in the bath and then measure the baseline electrode current and compare it to the qc values to ensure that it is in range. This can be accomplished by adding a new function to qc_feature_extractor.py that will look for the last 'EXTPSMOKET' sweep before the 'EXTPCllATT' sweep and if one exists, then it should check the resistance by calling 'measure_seal' from qc_features.py, and if the seal is less than 10 MOhms then call 'measure_electrode_0' from 'qc_features.py'. This new function can be called by 'extract_electrode_0' in the try, except block when the 'EXTPINBATH' sweep is not found.

Describe alternatives you've considered The only other alternative solution would be to manually QC all of these cells, and we would like to avoid that if possible.

Additional context This is a high priority issue because this issue is affecting several hundred IVSCC pipeline experiments, all of which are failing QC due to the issue

Do you want to work on this issue? I am already working on a fix for this issue and will submit a pull request as soon as I am done testing.

wbwakeman commented 3 years ago

Hi @ru57y34nn I think the Marmot team can fix this if you can help us figure out what is the set of affected data. Can you please provide a list of experiments, or another way to identify the data that has this issue? Thank you

ru57y34nn commented 3 years ago

Hi @wbwakeman Yeah I can get you a list of all of the affected data, but I am curious what you and/or the Marmot team have in mind for a fix as we would like to avoid altering or duplicating the data files. Also, I have already submitted a pull request with a fix (Pull request #510, which addresses this issue) and I just wanted to make sure that you have already looked at that. Sergey had raised a couple issues in the comments of the pull request but I believe I addressed them and have been waiting on a reply. It sounds like Sergey is not completely opposed to merging the pull request but has stated that we would perhaps have to document these changes to IPFX since they deviate somewhat from the white paper, in which case I think we could do that.

wbwakeman commented 3 years ago

Hi @ru57y34nn The team and I have had a few conversations about this. We do not want to merge the PR because it introduces an additional code path to address a special case of the data that should not be encountered again. Some of our worst code and most painful bugs have been the result of making special-case modifications to the code base to account for issues like this. We are trying to learn from our past mistakes.

Our plan is the retain a copy of the original data files on the file system, but to create a new file with the data updated to be correct, and then process the updated file through the rest of the pipeline.

ru57y34nn commented 3 years ago

Okay thanks @wbwakeman, I see where you are coming from and that should be fine for a solution. I will get you the list of affected data but I was thinking that it might be helpful if I also included the sweep number of the EXTPSMOKET sweep that should be relabeled as EXTPINBATH for each experiment. Let me know if that would be helpful and I will include that in the list as well, thanks.

wbwakeman commented 3 years ago

Thanks @ru57y34nn That would be very helpful!

sgratiy commented 3 years ago
wbwakeman commented 3 years ago

@ru57y34nn We changed and ran one of these as an example. Can you please check this file and see if it looks okay?

/allen/programs/celltypes/production/mousecelltypes/prod2826/Ephys_Roi_Result_1048215014/1048215014_1119548823_spikes.nwb

ru57y34nn commented 3 years ago

@wbwakeman I checked the new nwb file and it now shows sweep 1 as the EXTPINBATH180424, where as this sweep was previously labeled as EXTPSMOKET180424. This definitely fixed the issue for this cell and I see that it has completed QC and feature extraction and is no longer failing due to "electrode_0_pa missing value". Looks good, thanks and let me know if you need anything else.

wbwakeman commented 3 years ago

Thanks @ru57y34nn . I will update and reprocess the remaining 500+ experiments in the list.

Thanks @JakeSAI for converting the files

wbwakeman commented 3 years ago

@ru57y34nn Looks like most of the 579 have completed processing. One is still running. 29 failed because stimulus_duration was null

ru57y34nn commented 3 years ago

Thanks @wbwakeman I will check these today. Do you happen to have the list of 29 that failed due to stimulus_duration was null? I'm sure I can track them down, I just figured if you had the list handy that would be easiest. Thanks

wbwakeman commented 3 years ago

@ru57y34nn

1051143316
1052510155
1055221318
1058212878
1059527503
1062296205
1062574935
1062768364
1062774362
1069242291
1069527000
1071658319
1071675546
1075277114
1075901296
1076113329
1076279677
1077698184
1077731028
1079007610
1079017033
1079070872
1079081376
1079091494
1079571190
1080300049
1080315249
1080372178
1085140564
ru57y34nn commented 3 years ago

@wbwakeman I have been going through and checking these and with the exception of the 29 that failed due to the stimulus_duration was null issue issue, everything else looks great. I am still looking into the null stimulusduration issue, but it seems very familiar to another issue we had back at the beginning of the year with the stimulusamplitude being null. I noticed that there are other experiments that contain sweeps with null stimulus_durations that are passing just fine; it seems that the null stimulus_duration is not a problem as long as the sweep is tagged in the STIMULUS_SUMMARY_V3_QUEUE json (usually with "recording stopped before completing the experiment epoch") and then removed before sweep QC. I have attached two images, one where the stimulus_duration is null but the sweep is not tagged, which causes the issue, and one where the stimulus_duration is null but the sweep is tagged, which does not cause the issue. I am still looking into why these sweeps did not get tagged even though it seems like they should have. I'll update you again one I have more information, thanks. stimulus_null_fail stimulus_null_okay