spacetx / starfish

starfish: unified pipelines for image-based transcriptomics
https://spacetx-starfish.readthedocs.io/en/latest/
MIT License

duplicate spot_ids returned when using build_traces_sequential #1871

Closed: mattcai closed this issue 4 years ago

mattcai commented 4 years ago

Description

Each spot_id in an IntensityTable is expected to uniquely identify one feature. There is a bug in build_traces_sequential that breaks this expectation. build_traces_sequential is used by SimpleLookupDecoder, and possibly PerRoundMaxChannel, when a FindSpotsAlgorithm (e.g. BlobDetector.run()) is run with no reference_image. In that case spots are found separately in each image volume, so an experiment with 3 rounds and 3 channels has spots found in 9 image volumes. Within each image volume, the spot_ids are indexed starting from 0.

With build_traces_sequential, the PerImageSliceResults are simply concatenated with no reassignment of spot_ids, leading to spots from different image volumes sharing the same spot_id.
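The effect of concatenating per-volume results without reassigning ids can be illustrated with a minimal sketch in plain Python. It does not use starfish: `find_spots_in_volume`, the dictionary layout, and the count of 4 spots per volume are hypothetical stand-ins chosen for illustration, and the re-indexing step at the end is only one possible remedy, not necessarily the one used in the actual fix.

```python
from itertools import product

# Hypothetical stand-in for per-volume spot finding: every (round, channel)
# image volume reports its spots with ids that restart at 0.
def find_spots_in_volume(n_spots):
    return [{"spot_id": i} for i in range(n_spots)]

rounds, channels = range(3), range(3)  # 3 rounds x 3 channels = 9 volumes
spots_per_volume = 4                   # arbitrary value for illustration

# Naive concatenation, as described in the issue: ids are carried over as-is.
concatenated = []
for r, ch in product(rounds, channels):
    for spot in find_spots_in_volume(spots_per_volume):
        concatenated.append({"round": r, "channel": ch, **spot})

naive_ids = [s["spot_id"] for s in concatenated]
n_total = len(naive_ids)         # number of rows after concatenation
n_unique = len(set(naive_ids))   # number of distinct spot_ids among them

# One possible remedy (an assumption, not the confirmed fix): reassign
# spot_id from each row's position in the concatenated result.
reassigned = [{**s, "spot_id": i} for i, s in enumerate(concatenated)]
reassigned_ids = [s["spot_id"] for s in reassigned]

print(n_total, n_unique, len(set(reassigned_ids)))
```

In this sketch the concatenated result has 36 rows but only 4 distinct spot_ids, while the re-indexed version has 36 distinct ids.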

This affects downstream pipeline steps such as AssignSpots.label(), which uses the spot_id to label spots with a cell_id. AssignSpots.label() collects the spot_ids of the spots that fall inside each cell and then labels spots by their row index (the features dimension). When spot_ids repeat across image volumes, a spot_id no longer corresponds to a single row, so the label can land on the wrong spots.
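A minimal sketch of how this mislabeling could arise, based on one reading of the description above. It is plain Python rather than starfish code: the row layout, the `in_cell` flag, and the `"cell_1"` label are hypothetical and only illustrate what happens when duplicated spot_ids are treated as row indices.

```python
# Rows of a concatenated table from two image volumes; spot_id restarts
# at 0 in each volume. Only the first spot of volume 1 lies inside the cell.
rows = [
    {"volume": 0, "spot_id": 0, "in_cell": False},
    {"volume": 0, "spot_id": 1, "in_cell": False},
    {"volume": 1, "spot_id": 0, "in_cell": True},
    {"volume": 1, "spot_id": 1, "in_cell": False},
]

# Step 1: collect the spot_ids of the spots that are inside the cell.
ids_in_cell = [row["spot_id"] for row in rows if row["in_cell"]]

# Step 2: label spots by treating those ids as row indices.
cell_ids = [None] * len(rows)
for spot_id in ids_in_cell:
    cell_ids[spot_id] = "cell_1"

# Row 0 (volume 0, outside the cell) receives the label, while
# row 2 (the spot that is actually inside the cell) stays unlabeled.
print(cell_ids)
```

With unique spot_ids that match row positions, step 2 would label row 2 instead.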

Steps/Code to Reproduce

code to reproduce bug

Expected Results

Actual Results

mattcai commented 4 years ago

closed by #1872