Open amanda-hi opened 6 years ago
Hi Amanda,
Thank you for your interest in our approach!
I have experienced this type of error. I suspect that the distance matrix which is calculated in sct.TSNE.calc_TSNE() has NaN, infinity, or large values. By default, the distance matrix is calculated using pairwise Pearson correlation. You might have pair(s) of genes that yield badly behaved correlation values. For instance, if the standard deviation of one of the genes is zero, then the correlation will be NaN.
I would suggest running the distance matrix code in calc_TSNE() directly on your count matrix of ICIM-selected genes. Then check whether there are any badly behaved values. If there are, I'd consider either removing those genes or masking the values in an appropriate way (e.g. filling in zeros or ones) and passing the distance matrix directly to the TSNE method (you might have to modify sct.py with an updated TSNE method to allow this).
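For illustration, the check described above could look roughly like the sketch below. It uses a toy pandas DataFrame rather than real data, and the names X, constant_cols, and dist_clean are placeholders, not part of sct.py. It assumes the distance is computed as 1 minus the pairwise Pearson correlation of the columns of X, as in DataFrame.corr(); which axis holds genes or cells in your own matrix is something to verify.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the matrix being correlated (hypothetical values).
# DataFrame.corr() computes pairwise Pearson correlation between columns.
X = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [2.0, 1.0, 4.0, 3.0],
        "c": [5.0, 5.0, 5.0, 5.0],  # constant column: standard deviation is zero
    }
)

dist = 1 - X.corr()

# Flag every entry that is NaN or infinite.
bad = ~np.isfinite(dist.to_numpy())
print("badly behaved entries present:", bool(bad.any()))

# Zero-variance columns are the usual cause of NaN correlations.
constant_cols = X.columns[X.std() == 0].tolist()
print("zero-variance columns:", constant_cols)

# Option 1: drop the offending columns and recompute the distances.
X_clean = X.loc[:, X.std() > 0]
dist_clean = 1 - X_clean.corr()

# Option 2: keep all columns and mask NaN distances with a fixed value
# (1.0 here is an arbitrary choice corresponding to zero correlation).
dist_masked = dist.fillna(1.0)
```

Dropping columns changes which items end up in the embedding, while masking keeps them but assigns them an arbitrary distance, so which option is appropriate depends on the data.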
Let me know whether that helps. Good luck!
Felix
Hey Felix! Thanks for the quick response!
I went in and used this code to generate my own distance matrix:
dist = 1-X.corr()
dist = np.clip(dist, 0.0, max(np.max(dist)))
I fed this dist matrix straight into the calc_TSNE() function in the sct script, which worked (woo!), but then got an error at the plot() step that said: raise KeyError("None of [%s] are in the [%s]" %, with a list of cell barcodes in the first [%s] space. Is there another parameter within the plot() function that needs to be changed that I'm missing? It seems to me that the script is trying to color each of the cells in the list matrix, but is indexing the matrix incorrectly. I could be totally wrong about that, though.
Thanks again for your help!
You may need to initialize the TSNE object with a df_libs that is indexed by the same names as df or X.
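As a rough sketch of what "indexed by the same names" means in pandas terms: the toy tables below are hypothetical, including the "barcode" and "tissue" column names, and the assumption that the columns of X hold the cell names. The TSNE constructor in sct.py is not shown here because its exact signature is not given in this thread.

```python
import pandas as pd

# Hypothetical expression matrix: genes as rows, cell barcodes as columns.
X = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["geneA", "geneB"],
    columns=["cell1", "cell2", "cell3"],
)

# Hypothetical metadata table indexed by position (0, 1, 2) instead of by
# barcode. Label-based lookups with cell barcodes would then fail with a
# KeyError because none of the barcodes are in the index.
df_libs = pd.DataFrame(
    {"barcode": ["cell1", "cell2", "cell3"], "tissue": ["x", "y", "x"]}
)

# Barcodes in X that cannot be found in the metadata index.
missing = X.columns[~X.columns.isin(df_libs.index)]
print("barcodes missing from df_libs index:", list(missing))

# Re-index the metadata by barcode, in the same order as the columns of X.
df_libs_aligned = df_libs.set_index("barcode").reindex(X.columns)
print(df_libs_aligned.index.equals(X.columns))
```

If the aligned table contains rows that are entirely NaN after reindex, those barcodes are absent from the metadata and would need to be added or dropped from X.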
Hello,
My colleagues and I are trying to adapt your ICIM script for our own differential expression project with olfactory epithelium; however, we've been running into quite a few errors and would like some clarification. I was able to run your "ICIM_example" script up until the "Display cells using tSNE" step, where I received the following error raised by the sklearn module:
I am feeding in a counts matrix and metadata file to the "Load Data" step, neither of which contains infinite values. I otherwise have not changed any parameters from the ICIM_example and am not sure why we're getting this error. I've cloned the repo and adjusted the file paths accordingly. Do you have any suggestions for how to get around this? The actual marker gene identification step seems to have been successful, identifying 319 genes. We are incredibly excited to use this script but have been running into problems getting it going!
Thanks so much,