farrellja / URD

URD - Reconstruction of Branching Developmental Trajectories
GNU General Public License v3.0

Error in calcTsne and calcDM: length of 'dimnames' [1] not equal to array extent #79

Open Sophia409 opened 2 years ago

Sophia409 commented 2 years ago

Hello, I have just started to use URD. I created a URD object by fetching data from Seurat3, but I get the following errors when I run calcTsne and calcDM:

The first error is 'Remove duplicates before running TSNE', but after checking my data I didn't find any duplicated gene or cell names. The second error is 'Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent'. I have searched for this error many times but couldn't find the cause.

count.data <- as.matrix(PVN.neuron@assays$RNA@counts)
meta <- PVN.neuron@meta.data
meta <- tibble::rownames_to_column(meta, "CELL")
rownames(meta) <- meta$CELL
URD.object <- createURD(count.data = count.data, meta = meta, min.cells = 3, min.counts = 3)

URD.object <- calcPCA(URD.object, mp.factor = 2)
[1] "2021-12-06 14:38:09: Centering and scaling data."
[1] "2021-12-06 14:38:25: Removing genes with no variation."
[1] "2021-12-06 14:38:29: Calculating PCA."
[1] "2021-12-06 14:53:24: Estimating significant PCs."
[1] "Marchenko-Pastur eigenvalue null upper bound: 11.3403811414051"
[1] "8 PCs have eigenvalues larger than 2 times null upper bound."
[1] "Storing 16 PCs."
pcSDPlot(URD.object)

Calculate tSNE

set.seed(19)
URD.object <- calcTsne(URD.object)
Error in Rtsne.default(as.matrix(object@pca.scores[, which.dims]), dims = 2, :
  Remove duplicates before running TSNE.

Check duplicates

anyDuplicated(rownames(PVN.neuron))
[1] 0
anyDuplicated(colnames(PVN.neuron))
[1] 0

Calculate calcDM

URD.object <- calcDM(URD.object)
[1] "destiny determined an optimal global sigma of 77.567"
[1] "destiny will use 2950 nearest neighbors."
Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent
In addition: Warning messages:
1: In dataset_extract_doublematrix(data, vars) : Duplicate rows removed from data. Consider explicitly using df[!duplicated(df), ]
2: In dataset_extract_doublematrix(data, vars) : Duplicate rows removed from data. Consider explicitly using df[!duplicated(df), ]
3: In DiffusionMap(data.use, sigma = sigma.use, k = knn, n_eigs = dcs.store, : You have 20577 genes. Consider passing e.g. n_pcs = 50 to speed up computation.

It would be great if anyone has ideas about how to fix this error. Thanks. @farrellja @zouter @maximilianh

trucnguyen89 commented 2 years ago

Hi,

I encountered the same error when running calcDM:

Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent

I would be thankful if anyone could suggest a solution for this. Thanks so much!

farrellja commented 2 years ago

Hi @Sophia409 and @trucnguyen89, this happens because calculating a tSNE or a diffusion map fails if there are duplicate points: not duplicates by name, but by coordinates (i.e. two cells in your data have exactly the same expression values across all of the variable genes you're calculating on). I will need to add a check for this in the URD functions. To get started immediately, you can identify them and remove them. Try duplicated(Matrix::t(object@logupx.data[object@var.genes,])) to identify them and then subset them out of your data; usually it is just one or two cells.
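A minimal sketch of that check, run on a toy matrix so it works without URD installed. The matrix `expr` and its cell names are made up for illustration; it stands in for `object@logupx.data[object@var.genes, ]` (genes in rows, cells in columns). The commented lines at the end show how this might carry over to a real URD object. They assume that `urdSubset(object, cells.keep = ...)` is available in your URD version, and that converting the subset to a dense matrix first is acceptable for your data size.

```r
# Toy stand-in for object@logupx.data[object@var.genes, ]:
# 3 genes (rows) x 4 cells (columns). matrix() fills column by column,
# so each line of values below is one cell.
expr <- matrix(
  c(1, 0, 2,   # cell_a
    0, 3, 1,   # cell_b
    1, 0, 2,   # cell_c: same values as cell_a, different name
    4, 1, 0),  # cell_d
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("cell_a", "cell_b", "cell_c", "cell_d"))
)

# A name-based check finds nothing, because the names are all unique.
anyDuplicated(colnames(expr))

# duplicated() on a matrix compares rows, so transpose to put cells in rows.
# It flags the second and later copies, not the first occurrence.
is.dup <- duplicated(t(expr))
dup.cells <- colnames(expr)[is.dup]
cells.keep <- colnames(expr)[!is.dup]
print(dup.cells)

# On a real URD object this might look like the following (assumptions:
# slot names as used in this thread, urdSubset() present in your URD version):
# dup <- duplicated(as.matrix(Matrix::t(object@logupx.data[object@var.genes, ])))
# object <- urdSubset(object, cells.keep = colnames(object@logupx.data)[!dup])
```

After removing cells it is probably safest to rerun calcPCA before calcTsne and calcDM, so that the stored PCA scores match the remaining cells.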