Open saskiad opened 5 years ago
I'd suggest adding columns for the ROI properties that are used in the filtering, rather than height and width. If we want users to be able to include ROIs that we chose to fail, they should have the data used to make that call.
TODO
@saskiad There is currently a column called "valid" that is populated by LIMS. @wbwakeman do you know the criteria that set this value?
@nicain There are many criteria used to set the value. ROIs are identified during different processing steps based on shape, size, motion correction, location in frame, overlap, and more. Jed made a fair summary of the processing at http://confluence.corp.alleninstitute.org/display/~jedp/Cell+ROI+Filtering .
In LIMS, we do record the reason for marking an ROI as invalid. There may be more than one reason for each cell_roi. We can expand this ticket to make that information available via the SDK.
What is in LIMS for ALL ophys experiments:
 roi_exclusion_label | count
---------------------+--------
 apical_dendrite     | 151864
 bad_shape           |  70941
 boundary            |  96354
 demix_error         |   6314
 low_signal          | 282486
 motion_border       | 178009
 small_size          | 279097
 union               |  27878
                     | 940228
@saskiad would including a "roi_exclusion_label" column (like above), in addition to "valid" column, be enough to satisfy this issue?
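To make the proposal concrete, here is a rough sketch of how such a column could sit next to "valid", given that there may be more than one reason per cell_roi. The ids and the `collapse_labels` helper are made up for illustration, and treating "no label" as valid is an assumption based on the empty-label row in the counts above, not a description of what LIMS actually does.

```python
from collections import defaultdict

# Hypothetical (cell_roi_id, roi_exclusion_label) rows, shaped like a LIMS
# query result. The ids are invented; the label strings come from the table above.
rows = [
    (1, "small_size"),
    (1, "low_signal"),
    (2, "motion_border"),
    (3, None),  # no exclusion label recorded for this ROI
]


def collapse_labels(rows):
    """Group exclusion labels per ROI.

    Assumption: an ROI with no recorded label is treated as valid.
    """
    labels = defaultdict(list)
    for roi_id, label in rows:
        labels[roi_id]  # create the entry even if the ROI has no labels
        if label is not None:
            labels[roi_id].append(label)
    return {
        roi_id: {"valid": not reasons, "roi_exclusion_labels": sorted(reasons)}
        for roi_id, reasons in labels.items()
    }


table = collapse_labels(rows)
for roi_id, record in sorted(table.items()):
    print(roi_id, record)
```

Keeping the labels as a list per ROI avoids duplicating rows when a cell_roi fails for several reasons.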
I'm not sure I understand (I also can't get to the link at the moment). A column that provides the reason for the exclusion? What I think we need is the columns used in the filtering step: ROI size, location, shape, etc., with the relevant values. That allows users to set different thresholds.
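As an illustration of the use case, here is a minimal sketch of what user-side thresholding could look like if the metric columns were exposed. The column names (`area`, `distance_to_border`), the values, and the `select_rois` helper are all hypothetical; the real filtering metrics and their names would need to come from the segmentation pipeline.

```python
# Hypothetical per-ROI metrics; names and values are illustrative only,
# not the actual columns used by the filtering step.
rois = [
    {"cell_roi_id": 1, "area": 40, "distance_to_border": 2.0},
    {"cell_roi_id": 2, "area": 150, "distance_to_border": 30.0},
    {"cell_roi_id": 3, "area": 90, "distance_to_border": 12.0},
]


def select_rois(rois, min_area, min_border_distance):
    """Return the ids of ROIs that pass the user-chosen thresholds."""
    return [
        roi["cell_roi_id"]
        for roi in rois
        if roi["area"] >= min_area
        and roi["distance_to_border"] >= min_border_distance
    ]


# The same table supports stricter or more lenient criteria.
strict = select_rois(rois, min_area=100, min_border_distance=10.0)
lenient = select_rois(rois, min_area=50, min_border_distance=10.0)
print(strict, lenient)
```

With only a reason label, a user can include or exclude a whole category; with the values, they can move the cutoff.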
Conceptually I am open to publishing the values that have been used to mark a cell_roi as valid/invalid. However, we need to put some thought into this. Here are some of my concerns:
John is planning to improve the segmentation code at some point in the future. The only reason it has not happened already is that improving the motion correction situation has taken much, much longer than he ever anticipated. Publishing this information now may restrict us from making desirable changes in the future, or at least bind us to some versioning system.
The current metrics are difficult to understand. There is a reason we want to revamp that whole system.
Because the segmentation code is not available, it is not straightforward to see how we got from the metrics to a valid/invalid call. Also, a cell_roi can be marked invalid based on the output of modules other than segmentation.
@nicain The cell specimens table is SO EXTREMELY different from the item by the same name in vis coding. It could be called something else, maybe roi table, which is more aligned with what it is. I personally find the max correction columns to be superfluous; they are the same for all ROIs in the experiment. They should be in metadata rather than per ROI.
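A rough sketch of the idea of moving experiment-wide values out of the per-ROI table. The `split_constant_columns` helper, the sample rows, and the column names in them are assumptions for illustration, not the SDK's actual schema or API.

```python
def split_constant_columns(rows, always_per_roi=("cell_roi_id",)):
    """Split rows into (metadata, per_roi_rows).

    Columns whose value is identical in every row are hoisted into a single
    metadata dict. Columns listed in always_per_roi are never hoisted, so an
    experiment with a single ROI still keeps its id column.
    """
    if not rows:
        return {}, []
    metadata = {
        key: value
        for key, value in rows[0].items()
        if key not in always_per_roi
        and all(key in row and row[key] == value for row in rows)
    }
    per_roi = [
        {key: value for key, value in row.items() if key not in metadata}
        for row in rows
    ]
    return metadata, per_roi


# Hypothetical rows: the correction value repeats for every ROI, x does not.
rows = [
    {"cell_roi_id": 1, "x": 10, "max_correction_up": 5.0},
    {"cell_roi_id": 2, "x": 20, "max_correction_up": 5.0},
]
metadata, per_roi = split_constant_columns(rows)
print(metadata)
print(per_roi)
```

In practice the set of experiment-level columns would probably be fixed by name rather than detected, since a column can be constant by coincidence.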