Open MomenehForoutan opened 1 year ago
Thank you so much @MomenehForoutan. These are all excellent and well-thought-out suggestions and comments. We are currently focusing mostly on some general improvements, and once that's finished we hope to present improved tutorials in the (very) near future. At that point we will also take more time to respond to the individual points you brought up.
Thanks @MomenehForoutan, I just stumbled across your report when I hit the same issue.
Do you have any recommendations/experience regarding the use of these probes for normalising the data?
@MomenehForoutan I have not seen good performance with negative probe normalisation (neg norm) in Nanostring CTA, WTA, or CosMx data. I would remove these probes from the expression matrix. As an alternative, you could consider accounting for the overall cellular transcript detection rate, for instance as an additional factor in linear modeling, but whether or how to do so likely depends on the downstream questions of interest.
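A minimal base R sketch of that suggestion, not a Giotto workflow: it uses a made-up feature-by-cell count matrix, assumes negative probes can be recognised by a `NegPrb` name prefix (check your own feature IDs), and uses a hypothetical two-group design. It moves the negative probes out of the expression matrix, computes a per-cell detection rate, and adds that rate as a covariate in a linear model.

```r
# Toy feature-by-cell counts; all names and values are invented for illustration.
counts <- matrix(
  c(3, 0, 5, 2, 7, 1, 0, 4,   # GeneA
    0, 1, 2, 0, 3, 0, 1, 2,   # GeneB
    1, 0, 0, 4, 2, 2, 0, 1,   # GeneC
    2, 2, 0, 1, 0, 3, 0, 5,   # GeneD
    0, 1, 0, 0, 0, 0, 1, 0,   # NegPrb1
    0, 0, 0, 1, 0, 0, 0, 0),  # NegPrb2
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("GeneA", "GeneB", "GeneC", "GeneD", "NegPrb1", "NegPrb2"),
    paste0("cell", 1:8)
  )
)

# Assumption: negative probes share the "NegPrb" prefix.
is_neg <- grepl("^NegPrb", rownames(counts))
neg_counts  <- counts[is_neg, , drop = FALSE]   # kept separately, e.g. for QC
expr_counts <- counts[!is_neg, , drop = FALSE]  # genes only

# Cellular detection rate: fraction of genes with at least one count in each cell.
cdr <- colMeans(expr_counts > 0)

# Hypothetical two-group comparison with the detection rate as an extra covariate.
cell_meta <- data.frame(
  y     = log1p(expr_counts["GeneA", ]),
  group = factor(rep(c("A", "B"), each = 4)),
  cdr   = cdr
)
fit <- lm(y ~ group + cdr, data = cell_meta)
coef(fit)
```

Whether the detection rate belongs in the model at all depends on the question being asked, as noted above; the sketch only shows where such a term would go.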
Hi, first of all, thanks for this fantastic tool.
I am working on my CosMx data and I can't find a way to save the workspace or the Giotto object as RDS. I think the polygon data is not being saved properly, so I can't resume the analysis afterwards. I am using Giotto version 2.0.0.998 and everything else is working fine; the only issue is that I can't load the workspace or the object without getting errors related to coordinates. Are there any solutions for this?
Thanks a lot in advance.
Hi team, Thanks for the great package and all the useful tutorials for different types of spatial data! I have been following the manual for the CosMx data and faced some errors/issues, leading to the results being non-reproducible. So I thought I would summarise them here in case you would like to update the manual in the future.
There are two examples for calculating HVFs in the manual:
One of the approaches to calculate HVFs uses `expression_values = "pearson"`; however, it is not clear how these values are calculated. Specifically, running `calculateHVF()` with `expression_values = "pearson"` gives an Error (see below). Looking at the normalisation types, there are three methods ("standard", "pearson_resid", "osmFISH") and the default is "standard", which is what has been used in the manual. I thought perhaps changing the normalisation step to `"pearson_resid"` would help, but I get a message saying "Pearson residual normalised data will be returned to the scaled Giotto slot" when running `normaliseGiotto()`, and I get an Error asking me to do normalisation first when running `addStatistics()` in the following step. The same argument, `expression_values = 'pearson'`, is used when running PCA in the vignette, which gives an Error as well (see below). So it is unclear how `"pearson"` was calculated.
In the other approach for calculating HVFs, you have set `HVFname = "hvg_orig"`, while the code that follows tries to access the HVFs using `"hvf"`; so `HVFname = "hvg_orig"` needs to be changed to `HVFname = "hvf"` when running `calculateHVF()` to avoid Errors when running `gene_meta[hvf == 'yes']` and `runPCA()` in the following steps.
Please see the below chunk of code for the points mentioned above.
When subsetting the data, the variable name you use is `smallfov`; however, you are using `test` in the following `spatInSituPlotPoints()` call, so `test` needs to be replaced with `smallfov` to avoid the Error caused by the undefined `test` variable. Similarly, `testfeats` should be replaced by `smallfeats`. See the relevant code from the manual for your convenience.
Finally, I have a couple of general questions/comments about the analysis of the CosMx data:
Considering that the current CosMx data sets have < 1000 selected genes, wouldn't it be better to use all genes when performing PCA rather than only a small number of HVFs? I seem to get better dimensional reduction plots when using all features. What are your thoughts?
There are 20 negative probes in the CosMx data, and interestingly some are amongst the HVFs. While in the processed Giotto object provided by Nanostring the negative probes are stored in a separate slot, you do not seem to have separated them from the rest of the genes in the manual... So I was wondering if you think it is better to keep them in the expression matrix with other genes when processing the data or storing them separately? Do you have any recommendations/experience regarding the use of these probes for normalising the data?
Many thanks and looking forward to hearing back from you! Cheers.