Feedback & comments on the Giotto manual for CosMx data

MomenehForoutan commented 1 year ago

Hi team, thanks for the great package and all the useful tutorials for different types of spatial data! I have been following the manual for the CosMx data and ran into some errors/issues that make the results non-reproducible. So I thought I would summarise them here in case you would like to update the manual in the future.

There are two examples for calculating HVFs in the manual:

1. One approach uses expression_values = "pearson"; however, it is not clear how these values are calculated. Specifically, running calculateHVF() with expression_values = "pearson" gives an Error (see below). Looking at the normalisation types, there are three methods ("standard", "pearson_resid", "osmFISH"), and the default is "standard", which is what is used in the manual. I thought changing the normalisation step to "pearson_resid" might help, but normalizeGiotto() then prints the message "Pearson residual normalised data will be returned to the scaled Giotto slot", and addStatistics() in the following step fails with an Error asking me to run normalisation first. The same argument, expression_values = 'pearson', is used when running PCA in the vignette, which gives an Error as well (see below). So it is unclear how "pearson" was calculated.
2. In the other approach, HVFname = "hvg_orig" is set, while the subsequent code accesses the HVFs as "hvf". So HVFname = "hvg_orig" needs to be changed to HVFname = "hvf" when running calculateHVF() to avoid Errors from gene_meta[hvf == 'yes'] and runPCA() in the following steps.

Please see the code chunk below for the points mentioned above.

fov_join <- calculateHVF(gobject = fov_join,
            method = 'var_p_resid',
            expression_values = 'pearson',
            show_plot = T)
Error in get_expression_values(gobject = gobject, spat_unit = spat_unit,  : 
  The spatial unit cell for expression matrix rna and with name 'pearson' can not be found 

# "hvg_orig" needs to be replaced with "hvf" in the below code:
fov_join <- calculateHVF(gobject = fov_join, HVFname = 'hvg_orig')

## again it is not clear what pearson is
fov_join <- runPCA(gobject = fov_join,
           expression_values = 'pearson',
           scale_unit = F,
           center = F
           )
Error in (function (classes, fdef, mtable)  : 
 unable to find an inherited method for function ‘plotPCA’ for signature ‘"missing"’
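
A possible way to make a 'pearson' expression slot exist before these calls is sketched below. It is an untested sketch under assumptions: normalizeGiotto() is assumed to accept an update_slot argument naming the slot that receives the Pearson residuals (the message quoted above suggests they otherwise go to the scaled slot), and fov_join is the object built earlier in the manual. Argument names should be checked against the installed Giotto version.

```r
library(Giotto)

# Assumption: update_slot names the expression slot that receives the
# Pearson residuals; without it they are written to the 'scaled' slot.
fov_join <- normalizeGiotto(gobject = fov_join,
                            norm_methods = 'pearson_resid',
                            update_slot = 'pearson')

# HVF detection on the residuals. HVFname = 'hvf' matches the column that
# the later gene_meta[hvf == 'yes'] and runPCA() steps look for.
fov_join <- calculateHVF(gobject = fov_join,
                         method = 'var_p_resid',
                         expression_values = 'pearson',
                         HVFname = 'hvf',
                         show_plot = TRUE)

# PCA on the same residuals, with the arguments used in the vignette.
fov_join <- runPCA(gobject = fov_join,
                   expression_values = 'pearson',
                   scale_unit = FALSE,
                   center = FALSE)
```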

When subsetting the data, the variable name you use is smallfov, however, you are using test in the following spatInSituPlotPoints(), so test needs to be replaced with smallfov to avoid the Error due to not defining the test variable. Similarly, testfeats should be replaced by smallfeats. See the relevant code from the manual for your convenience.

locs <-fov_join@spatial_locs$cell$raw

#subset a Giotto object based on spatial locations
smallfov <- subsetGiottoLocs(fov_join,
            x_max = 800,
            x_min = 507,
            y_max = -158800,
            y_min = -159600)

#extract all genes observed in new object
smallfeats <- smallfov@feat_metadata$cell$rna$feat_ID

# plot all genes
spatInSituPlotPoints(test,
            feats = list(testfeats),
            point_size = 0.15,
            polygon_line_size = .1,
            show_polygon = T,
            polygon_color = 'white',
            show_image = T,
            image_name = image_names,
            coord_fix_ratio = TRUE,
            show_legend = FALSE)
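
With the two names corrected as described above, the plotting call would read as follows. This only swaps test for smallfov and testfeats for smallfeats; every other argument is copied from the manual, and image_names is assumed to be defined earlier in the session.

```r
library(Giotto)

# Plot all genes of the subset object created by subsetGiottoLocs() above.
spatInSituPlotPoints(smallfov,
                     feats = list(smallfeats),
                     point_size = 0.15,
                     polygon_line_size = .1,
                     show_polygon = TRUE,
                     polygon_color = 'white',
                     show_image = TRUE,
                     image_name = image_names,
                     coord_fix_ratio = TRUE,
                     show_legend = FALSE)
```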

Finally, I have a couple of general questions/comments about the analysis of the CosMx data:

1. Considering that the current CosMx data sets have < 1000 selected genes, wouldn't it be better to use all genes when performing PCA rather than only a small number of HVFs? I seem to get better dimensional reduction plots when using all features. What are your thoughts?
2. There are 20 negative probes in the CosMx data, and interestingly some are amongst the HVFs. While the processed Giotto object provided by Nanostring stores the negative probes in a separate slot, the manual does not seem to separate them from the rest of the genes. So I was wondering whether you think it is better to keep them in the expression matrix with the other genes when processing the data, or to store them separately? Do you have any recommendations/experience regarding the use of these probes for normalising the data?
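
For reference, the two checks behind these questions could be sketched roughly as below. The assumptions are that feats_to_use = NULL makes runPCA() use every feature instead of the HVF subset, that the HVF column is named 'hvf', and that negative probe IDs start with "NegPrb". The name 'pca_all' is an arbitrary label chosen for illustration.

```r
library(Giotto)

# PCA on all features rather than only the HVFs.
# Assumption: feats_to_use = NULL disables the HVF subset in runPCA().
fov_join <- runPCA(gobject = fov_join,
                   feats_to_use = NULL,
                   name = 'pca_all')

# List the HVFs that are negative probes.
# Assumptions: probe IDs start with "NegPrb"; the HVF column is named 'hvf'.
gene_meta <- fDataDT(fov_join)
hvf_ids <- gene_meta[hvf == 'yes']$feat_ID
hvf_ids[grepl('^NegPrb', hvf_ids)]
```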

Many thanks and looking forward to hearing back from you! Cheers.

RubD commented 1 year ago

Thank you so much @MomenehForoutan. These are all excellent and well-thought-out suggestions and comments. We are currently focusing mostly on some general improvements, and once that's finished we hope to present better and improved tutorials in the (very) near future. At that point we will also take more time to respond to the individual points you brought up.

swbioinf commented 1 year ago

Thanks @MomenehForoutan just stumbled across your report when I hit the same issue.

andrewrech commented 1 year ago

> Do you have any recommendations/experience regarding the use of these probes for normalising the data?

@MomenehForoutan I have not seen good performance with negative-probe normalisation in Nanostring CTA, WTA, or CosMx. I would remove these probes from the expression matrix. As an alternative, you could consider accounting for the overall transcript cellular detection rate, for instance as an additional factor in linear modeling, but whether or how to do so likely depends somewhat on the downstream questions of interest.
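
A minimal base R sketch of that suggestion, on simulated counts rather than real CosMx data: the "NegPrb" prefix, the grouping variable, and the definition of detection rate as the fraction of retained features detected per cell are all assumptions made for illustration.

```r
set.seed(1)

# Toy counts matrix: 8 features x 60 cells; two rows mimic negative probes.
# The "NegPrb" prefix is an assumed naming convention.
feat_ids <- c(paste0("Gene", 1:6), "NegPrb1", "NegPrb2")
counts <- matrix(rpois(8 * 60, lambda = 2), nrow = 8,
                 dimnames = list(feat_ids, paste0("cell", 1:60)))

# Drop the negative probes from the expression matrix.
is_neg <- grepl("^NegPrb", rownames(counts))
expr <- counts[!is_neg, , drop = FALSE]

# Cellular detection rate: fraction of retained features detected per cell.
cdr <- colMeans(expr > 0)

# Hypothetical grouping variable standing in for the question of interest.
group <- factor(rep(c("A", "B"), each = 30))

# Model one gene with the detection rate as an additional covariate.
fit <- lm(log1p(expr["Gene1", ]) ~ group + cdr)
coef(fit)
```

Whether the detection rate belongs in the model at all depends on the downstream question, as noted above.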

JuanLSK commented 1 year ago

Hi, first of all, thanks for this fantastic tool.

I am working on my CosMx data and I can't find a way to save either the workspace or the Giotto object as RDS. I think the polygon data is not being saved properly, so I can't resume the analysis afterwards. I am using Giotto version 2.0.0.998 and everything else is working fine; the only issue is that I can't load the workspace or the object without getting errors related to coordinates. Are there any solutions for this?

Thanks a lot in advance.

drieslab / Giotto

Feedback & comments on the Giotto manual for CosMx data #366