TrigosTeam / SPIAT

https://trigosteam.github.io/SPIAT/
Artistic License 2.0

calculate_pairwise_distances_between_celltypes not working #20

Closed JessicaP94 closed 1 year ago

JessicaP94 commented 1 year ago

Hello, I am trying to calculate distances but receive the following error:

distances <- calculate_pairwise_distances_between_celltypes(
    spe_object = formatted_image,
    cell_types_of_interest = c("Tumour", "vCAF_2", "Endothelial", "Immune", "vCAF_3", "mCAF_2", "mCAF_1"),
    feature_colname = "Cell.Type")

Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 'as.matrix': negative length vectors are not allowed

Can you help in solving it?

fuerzhou commented 1 year ago

Hi @JessicaP94,

Thank you for your interest in the tool.

I can't say for sure what caused the error without reproducing it, but my guess is that the number of cells of interest is too large for the function to handle. I would suggest passing only two cell types into the function at a time to see if it works. I benchmarked this function previously, and here is the result for your reference. When the cell number exceeded ~10k for one cell type, the session crashed on my 16 GB RAM laptop.

[Figure: benchmarking] (I maximised the memory usage for this RStudio session.)

Since calculating pairwise distances consumes a large amount of computing resources, I also suggest calculating the minimum distances and seeing what you get from there, as well as trying other colocalisation metrics.
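To illustrate why minimum distances are so much cheaper, here is a minimal base R sketch of a nearest-neighbour distance computed in row chunks, so that only a small block of the distance matrix is ever held in memory. This is not SPIAT's implementation; `min_dist_chunked` and its arguments are hypothetical names, and the inputs are assumed to be two-column (x, y) coordinate matrices.

```r
# Hypothetical helper (not part of SPIAT): for each cell in `from`, the
# distance to its nearest cell in `to`. Only a chunk_size x nrow(to) block
# of squared distances exists at any one time.
min_dist_chunked <- function(from, to, chunk_size = 1000L) {
  stopifnot(ncol(from) == 2, ncol(to) == 2, nrow(to) > 0)
  if (nrow(from) == 0) return(numeric(0))
  out <- numeric(nrow(from))
  for (s in seq(1L, nrow(from), by = chunk_size)) {
    idx <- s:min(s + chunk_size - 1L, nrow(from))
    dx <- outer(from[idx, 1], to[, 1], "-")   # x differences for this chunk
    dy <- outer(from[idx, 2], to[, 2], "-")   # y differences for this chunk
    out[idx] <- sqrt(apply(dx^2 + dy^2, 1, min))
  }
  out
}

# Toy coordinates, made up for illustration only
from <- cbind(c(0, 10), c(0, 0))
to   <- cbind(c(3, 10), c(4, 1))
min_dist_chunked(from, to)
```

The result has one value per cell in `from`, so its size grows linearly with the cell count rather than quadratically.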

Please let me know how you go from here.

Yuzhou

JessicaP94 commented 1 year ago

We do indeed have full-slide images, with 355693 cells; however, we also have 256 GB of RAM. Do you think this is still a problem?

I tried the calculate_minimum_distances_between_celltypes function and it works.
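As a rough back-of-envelope check for the slide size mentioned above, the sketch below estimates how large a dense all-pairs distance table would be. It assumes one 8-byte double per ordered pair and counts no overhead, so it is only a lower bound and does not model SPIAT's internals; the helper names are hypothetical. It also compares the number of entries with R's 32-bit integer limit, since the "negative length vectors are not allowed" message commonly appears when a requested length overflows that limit. Whether that is the cause here is an assumption.

```r
# Hypothetical helpers for a lower-bound estimate (not SPIAT code).
pairwise_entries <- function(n_cells) {
  n <- as.numeric(n_cells)   # use doubles to avoid integer overflow
  n * n
}

dense_matrix_gib <- function(n_cells) {
  pairwise_entries(n_cells) * 8 / 2^30   # 8 bytes per double, in GiB
}

int_max <- .Machine$integer.max   # 2147483647

n_slide <- 355693                 # cell count quoted in this thread
pairwise_entries(n_slide) > int_max   # TRUE: more entries than a 32-bit length
round(dense_matrix_gib(n_slide))      # about 943 GiB for the raw values alone
floor(sqrt(int_max))                  # 46340: largest n with n * n <= int_max
```

Under these assumptions, the raw values for all 355693 cells would need several times the available 256 GB, and any selection of more than about 46k cells would exceed the 32-bit length limit once the square matrix is flattened.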

fuerzhou commented 1 year ago

I can give a rough estimate before benchmarking on a machine with more RAM: the function should be able to handle ~20k cells per cell type when only two cell types are passed into the cell_types_of_interest argument, with 256 GB of RAM. However, we need to anticipate a very long run time, perhaps over 2 hours. The function won't work with many cell types passed into the cell_types_of_interest argument, given the large number of cells in your image.

If you would like to try it out, please let me know if only using two cell types works for you. Thank you!
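One way to try two cell types at a time is to loop over every pair and record which pairs succeed. The sketch below is an illustration only: `run_by_pair` is a hypothetical helper, and the commented usage assumes the `formatted_image` object and argument names from the original post.

```r
# Hypothetical helper: call `dist_fun` once per pair of cell types and
# keep either the result or the error object for each pair.
run_by_pair <- function(cell_types, dist_fun) {
  pairs <- combn(cell_types, 2, simplify = FALSE)
  results <- lapply(pairs, function(p) {
    tryCatch(dist_fun(p), error = function(e) e)
  })
  names(results) <- vapply(pairs, paste, character(1), collapse = "/")
  results
}

# Usage with SPIAT (assumes `formatted_image` exists, as in the original post):
# res <- run_by_pair(
#   c("Tumour", "vCAF_2", "Endothelial"),
#   function(p) SPIAT::calculate_pairwise_distances_between_celltypes(
#     spe_object = formatted_image,
#     cell_types_of_interest = p,
#     feature_colname = "Cell.Type"))
# failed <- names(res)[vapply(res, inherits, logical(1), what = "error")]
```

Catching errors per pair means one oversized pair does not abort the whole run, and the failing pairs can be handled separately, for example with the minimum-distance function.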

JessicaP94 commented 1 year ago

Hello, I tried with only 2 cell types of interest but I still receive the same error. I guess we have too many cells. I'll keep using the minimum distances then. Thank you for your support!