Thanks for reporting this. What is the dtype for the frames saved with napari?
I think our TIFFs are large because we weren't using compression; I've merged a pull request that adds it. But that shouldn't change the underlying dtype.
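As a rough illustration (not the actual cellpose saving code; the file names, array shape, and the `compression="zlib"` argument available in recent tifffile versions are assumptions here), compression alone can shrink a sparse integer label plane dramatically:

```python
import numpy as np
import tifffile

# Hypothetical label plane: mostly background with a couple of integer masks.
labels = np.zeros((3000, 3000), dtype=np.uint16)
labels[100:150, 200:260] = 1
labels[500:580, 900:960] = 2

# Uncompressed TIFF: file size is roughly rows * cols * itemsize.
tifffile.imwrite("plane_uncompressed.tif", labels)

# Compressed TIFF: sparse integer label images compress very well.
tifffile.imwrite("plane_compressed.tif", labels, compression="zlib")
```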
The dtype is np.ndarray after conversion.
I am still trying to figure out what exactly napari is doing to compress the labels.
Okay, it seems that this is the function it uses: it just creates an integer ndarray and then populates it successively with the labels. It does go through some other functions, though as far as I can see those should not influence the output of a cellpose segmentation.
napari/napari/layers/shapes/_shape_list.py
```python
def to_labels(self, labels_shape=None, zoom_factor=1, offset=(0, 0)):
    """Returns an integer labels image, where each shape is embedded in an
    array of shape labels_shape with the value of the index + 1
    corresponding to it, and 0 for background. For overlapping shapes
    z-ordering will be respected.

    Parameters
    ----------
    labels_shape : np.ndarray | tuple | None
        2-tuple defining shape of labels image to be generated. If none
        specified, takes the max of all the vertices.
    zoom_factor : float
        Premultiplier applied to coordinates before generating mask. Used
        for generating a downsampled mask.
    offset : 2-tuple
        Offset subtracted from coordinates before multiplying by the
        zoom_factor. Used for putting negative coordinates into the mask.

    Returns
    -------
    labels : np.ndarray
        MxP integer array where each value is either 0 for background or an
        integer up to N for points inside the corresponding shape.
    """
    if labels_shape is None:
        labels_shape = self.displayed_vertices.max(axis=0).astype(int)
    labels = np.zeros(labels_shape, dtype=int)
    for ind in self._z_order[::-1]:
        mask = self.shapes[ind].to_mask(
            labels_shape, zoom_factor=zoom_factor, offset=offset
        )
        labels[mask] = ind + 1
    return labels
```
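As a side note, a minimal sketch (file names are placeholders; assumes tifffile is installed) for checking the actual dtype and per-plane memory footprint of the frames saved by each tool:

```python
import tifffile

# Placeholder paths: one plane saved by cellpose, one saved from a
# napari Labels layer.
for path in ["cellpose_plane.tif", "napari_labels_plane.tif"]:
    arr = tifffile.imread(path)
    print(path, arr.dtype, arr.shape, f"{arr.nbytes / 1024:.0f} KB in memory")
```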
Thanks, okay, I'll see if I can replicate the increased memory usage and reduce it; we are using uint16 or uint32. Which OS are you on, and what were the dimensions of the stack (size in x, y, z)?
Okay, I am not using a big enough stack to replicate large differences in RAM, but I found where it could be slowed down: a type cast. Inside stitch3D we were using int rather than the dtype of the masks from cellpose (which is usually uint16). I've updated the code to use masks.dtype, but I don't think that should make much of a difference.
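For illustration, a minimal back-of-the-envelope sketch of why that dtype matters for RAM (the stack dimensions here are hypothetical):

```python
import numpy as np

# Hypothetical z, y, x stack dimensions.
n_voxels = 300 * 3000 * 3000

# dtype=int gives int64 on most 64-bit platforms, so a stitched volume
# allocated that way needs 4x the RAM of a uint16 volume.
print(f"dtype=int:    {np.dtype(int).itemsize * n_voxels / 1e9:.1f} GB")
print(f"dtype=uint16: {np.dtype(np.uint16).itemsize * n_voxels / 1e9:.1f} GB")
```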
Another possibility is that you have a bunch of small masks (<15 pixels) that are thrown out when not stitching but remain when stitching, and that's what is slowing things down. You can test this by turning off min_size when running plane-by-plane (model.eval(..., min_size=-1)) and seeing if you find a lot of small masks.
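For example, a small sketch for counting masks below the 15-pixel threshold; the synthetic `masks` array here is a stand-in for the label image returned by model.eval(..., min_size=-1):

```python
import numpy as np

def count_small_masks(masks: np.ndarray, min_size: int = 15) -> int:
    """Count labels in a label image that cover fewer than min_size pixels."""
    labels, counts = np.unique(masks, return_counts=True)
    sizes = counts[labels != 0]  # drop background (label 0)
    return int((sizes < min_size).sum())

# Synthetic stand-in; in practice `masks` would come from
# model.eval(img, min_size=-1, ...).
masks = np.zeros((64, 64), dtype=np.uint16)
masks[0:2, 0:3] = 1      # 6-pixel mask, below the default threshold
masks[10:20, 10:20] = 2  # 100-pixel mask
print(count_small_masks(masks))  # -> 1
```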
Going to close this for now due to inactivity. Please upgrade to the latest cellpose for these features: pip install git+https://github.com/mouseland/cellpose.git
Use a Napari-labels-like data format to reduce the size of Cellpose mask output before stitching
I was working with a large (14 GB) image stack and segmented it in cellpose. I was able to segment the individual z-planes; however, when running the 3D stitch I ran out of memory very quickly.
I then realized that by loading the planes into Napari as label layers and then immediately saving them, I could reduce the file size by 56x(!) (cellpose output: 18026 KB per z-plane vs. 46 KB per z-plane when saved as a Napari label layer). This enabled me to effortlessly run the 3D stitch on my computer.
I am not sure in what way the output of Napari differs from that of cellpose, but this difference in file size was an absolute lifesaver for me, and I guess it would be very helpful for anyone else who does not have access to a lot of RAM.
I would suggest taking a look at the Napari implementation to see whether it can be integrated into cellpose.
The workflow I used was the following (not the most efficient route, I guess, but it works for me):