gustaveroussy / sopa

Technology-invariant pipeline for spatial omics analysis (Xenium / Visium HD / MERSCOPE / CosMx / PhenoCycler / MACSima / ...) that scales to millions of cells
https://gustaveroussy.github.io/sopa/
BSD 3-Clause "New" or "Revised" License

Speeding up Sopa explorer write #29

Closed pakiessling closed 7 months ago

pakiessling commented 7 months ago

Hi, I am working my way through the Sopa Snakemake pipeline with one of our Merfish datasets.

A step that has been problematic is the sopa explorer write.

This takes a long time and the job was canceled after 12 hours.

I think the problem is that we have 11 stains in total, at 87012 x 64791 pixels each. The writing of tiles takes forever.

Is there a way to parallelize this, or maybe only take a subset of stains?

Thanks!

quentinblampey commented 7 months ago

Hello @pakiessling,

Indeed, the image writing can be long, which is why we start writing the image very early in the pipeline: this way, the other steps run in parallel, and we update the remaining explorer files at the end.

In 12 hours, did it at least write the first image scale?

Note that we have two writing modes: a lazy mode that is memory-efficient but slower, and a faster mode that loads the image into memory. We automatically choose between them based on your available RAM, so requesting more RAM means the image is loaded into memory whenever possible, which speeds up the writing of the smaller subscales. It probably won't solve the writing of the highest-resolution level, though, since that image likely can't fit in memory no matter how much RAM you set.
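Roughly, the decision looks like this (a simplified sketch, not sopa's exact internals; the function name is illustrative):

```python
import dask.array as da

def choose_write_mode(image: da.Array, ram_threshold_gb: float = 4.0) -> str:
    """Pick the fast in-memory mode only when the uncompressed image fits in RAM."""
    size_gib = image.nbytes / 1024**3  # uncompressed size in GiB
    return "in-memory" if size_gib <= ram_threshold_gb else "lazy"

# Example: an 11-channel uint16 image of 87012 x 64791 pixels is ~116 GiB,
# so with the default threshold it falls back to the lazy mode.
image = da.zeros((11, 87012, 64791), dtype="uint16", chunks=(1, 4096, 4096))
print(choose_write_mode(image))  # "lazy"
```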

I'll try to see if it's possible to parallelize the writing process in "lazy-mode"!

pakiessling commented 7 months ago

It took ~9 hours for the first level and then ~3 for 35% of the 2nd.

I will try with a more generous time limit and more RAM. Would it help to convert the images to OME-TIFF beforehand?

quentinblampey commented 7 months ago

Okay, it's great that it at least converts the first level. For the second level, I think you'll need about 116GB of RAM given your image size, which is a lot... I don't know how much RAM you can request, but if possible, can you try with 128GB? Starting from the second level, it should then load the image in memory and be faster. If it works, can you tell me how much RAM it used and how long it took?
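For reference, the 116GB figure follows directly from your image dimensions, assuming uint16 pixels (2 bytes each):

```python
# 11 stains at 87012 x 64791 pixels, 2 bytes per pixel (uint16 assumption)
n_channels, height, width = 11, 87012, 64791
size_gib = n_channels * height * width * 2 / 1024**3
print(f"{size_gib:.0f} GiB")  # ~116 GiB uncompressed
```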

No, converting to OME-TIFF won't help, unless it's already the exact format expected by the Xenium Explorer. Which conversion tool would you use?

pakiessling commented 7 months ago

Our nodes max out at 180 GB, so that should be fine.

Vizgen has a utility for converting images here https://github.com/Vizgen/vizgen-postprocessing/blob/166fb247d235e021d26205a74ce3814f885aee4b/src/vpt/convert_to_ome/main.py#L15

Do I need to set a higher ram-threshold? I think the default is 4?

quentinblampey commented 7 months ago

Ok great! Yes, sorry, I forgot to mention: you indeed have to change the value of ram_threshold_gb (for instance, to 128). If you use the pipeline, you can update the parameter in the config directly.
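It would look something like this (hypothetical excerpt; check example_config.yaml for the exact section layout):

```yaml
explorer:
  ram_threshold_gb: 128  # load the image in memory whenever it fits in this budget
```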

For the Vizgen conversion, I had a quick look, but the compression parameters are different from what the Xenium Explorer expects, so I think that it's unlikely to work...

I found a way to parallelize the writing, but I think the bottleneck comes from reading the chunks. Maybe using a chunk size of 1024 would be much faster (currently, we read chunks of size 4096 for MERSCOPE data). I'll run some tests next week to pinpoint the bottleneck and improve it!

pakiessling commented 7 months ago

Thanks, sounds great!

pakiessling commented 7 months ago

Ah btw, in the .yml I think intensity_mean got changed to average-intensities at some point.

quentinblampey commented 7 months ago

Thanks for the catch @pakiessling, it seems that I updated this everywhere except in the example_config.yaml file! I'll update it.

quentinblampey commented 7 months ago

Hello @pakiessling,

After a quick investigation, I confirm that the bottleneck was due to loading the chunks (not writing the image). Indeed, the Xenium Explorer chunk size is (1, 1024, 1024), but by default the MERSCOPE data is saved with chunks of size (1, 4096, 4096). So, each time we write a chunk, we load a chunk that is 16 times bigger than what we need.

The default chunk size is now set to (1, 1024, 1024), and image writing should be about 5 times faster.
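To illustrate with dask (a toy sketch, not sopa's internal code): each (1, 1024, 1024) explorer tile previously forced a read of a (1, 4096, 4096) chunk, i.e. 16 times more data than needed, whereas aligned chunks avoid that overhead.

```python
import dask.array as da

# Old MERSCOPE-style chunking: one explorer tile touches a 16x larger chunk
image = da.zeros((11, 87012, 64791), dtype="uint16", chunks=(1, 4096, 4096))

# New default: chunks line up with the explorer's (1, 1024, 1024) tiles.
# (rechunk here only shows the target layout; in practice the .zarr must be
# re-written with the new chunk size, as noted below)
aligned = image.rechunk((1, 1024, 1024))
print(aligned.chunksize)  # (1, 1024, 1024)
```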

If you want to test it already, you can use the dev branch; otherwise, you can wait for the release of version 1.0.5 (I'm waiting for a few more features before releasing a new version). Note that you'll need to write the .zarr directories again, otherwise they will still use the previous chunk size of (1, 4096, 4096).

Let me know if it's better now!

pakiessling commented 7 months ago

That sounds amazing. I will give it a shot!

quentinblampey commented 7 months ago

Hello again @pakiessling, I made another update to speed up the image writing; the latest changes are on dev.

Sorry for the quick change, I hope you haven't tried my previous changes yet... I'm still running some tests, so dev might change again, but version sopa==1.0.5 will be stable when I release it.

quentinblampey commented 7 months ago

Version 1.0.5 is released, if you want to test it out

pakiessling commented 7 months ago

Cool, I'm running it right now with default settings for the explorer step. I'll report back on how long it takes.

quentinblampey commented 7 months ago

Great, let me know! Have you overwritten your spatialdata .zarr directory? You need to create it again, otherwise you'll still have the old chunk size, which was the main cause of the latency. If you're not sure, just check that the image chunk size is indeed (C, 1024, 1024).
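Something like this works as a check (sketch; the image key and whether your image is multiscale depend on your dataset):

```python
import spatialdata

sdata = spatialdata.read_zarr("merfish.zarr")  # path to your spatialdata .zarr
image = sdata.images["image"]  # replace "image" with your actual image key
# For a single-scale image (a DataArray), the underlying dask chunks are:
print(image.data.chunksize)  # expect (C, 1024, 1024), e.g. (1, 1024, 1024)
# For a multiscale image, inspect one scale instead, e.g. image["scale0"]["image"]
```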

pakiessling commented 7 months ago

Yes I am rerunning from scratch.

Clueless question, but what does the writing of the tiles actually do? Is it just to be able to look at the results with the Xenium Explorer?

If I load a Merfish dataset with spatialdata-io and save it as .zarr, it is very fast, for example.

quentinblampey commented 7 months ago

Yes, this is only used to open the results in the Xenium Explorer: all it does is create a new image with the metadata/chunks/subscales/compression expected by the Xenium Explorer.

Now the image writing should be as fast as the .zarr writing, hopefully :)

pakiessling commented 7 months ago

Ok, it ran through in ~6 hours now. If the process is started at the beginning of the pipeline, this should not be a bottleneck anymore. Very nice!

quentinblampey commented 7 months ago

Great, I'm glad to hear this, thanks for your feedback!