Hi @JinSooJoo,
This only means that the feature labels in counts don't match the identifier (target) column in net. For example, if your counts data uses mouse gene symbols and your net contains human gene symbols, you get this error message. If you check the row names of counts and the contents of net yourself, you will very likely spot the issue.
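To illustrate the kind of mismatch I mean, here is a minimal sketch in base R. The identifiers are toy values chosen only for illustration; they are not taken from your data or from any particular net.

```r
# Toy identifiers, chosen only for illustration
mouse_ids <- c("Trp53", "Myc", "Stat3")   # mouse-style gene symbols
human_ids <- c("TP53", "MYC", "STAT3")    # human-style gene symbols

# intersect() compares strings exactly (case-sensitive),
# so none of these labels are considered shared
n_shared <- length(intersect(mouse_ids, human_ids))
n_shared
```

The same comparison with rownames(counts) and net$target in place of the toy vectors shows how many features the two objects really have in common.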
Hi @deeenes,
I appreciate your help.
Both my counts and net are from human resources. counts is from GSE186352, and I created net with net <- get_collectri(organism='human', split_complexes=FALSE).
Are you saying that length(intersect(rownames(counts), net$target)) > 5L?
The row names of counts and the contents of net should match because they are both from human data. Following your comment, I ran this code and got:
> length(intersect(rownames(counts), net$target)) > 5L
[1] FALSE
Rownames of counts and contents of net do match because they are both from human data.
There might be many other reasons why these don't match, e.g. different ID types.
In regarding your comments, I got the following code:
> length(intersect(rownames(counts), net$target)) > 5L
[1] FALSE
Well, if there are not even 5 common elements, these two vectors very likely contain completely different kinds of identifiers. Best to check it manually:
head(sort(unique(rownames(counts))))
head(sort(unique(net$target)))
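If the printed values are hard to interpret, a rough pattern check can hint at the ID type. This is only a sketch in base R: the helper guess_id_type and its three categories are my own assumptions for illustration and are not part of decoupleR.

```r
# Hypothetical helper (not a decoupleR function): classify a vector of
# feature labels by simple patterns.
guess_id_type <- function(ids) {
  ids <- unique(as.character(ids))
  if (all(grepl("^[0-9]+$", ids))) {
    "numeric"      # digits only: row indices or Entrez-like IDs
  } else if (all(grepl("^ENS[A-Z]*G[0-9]+", ids))) {
    "ensembl"      # Ensembl gene IDs, e.g. ENSG... or ENSMUSG...
  } else {
    "other"        # anything else, possibly gene symbols
  }
}

guess_id_type(c("1", "10", "100"))
guess_id_type(c("ENSG00000141510", "ENSG00000136997"))
guess_id_type(c("A2M", "A2ML1"))
```

If guess_id_type(rownames(counts)) and guess_id_type(net$target) give different answers, the two objects use different kinds of labels and cannot match.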
The two vectors should contain common elements (human genes), but the row names of my counts are shown as numbers:
> head(sort(unique((rownames(counts)))))
[1] "1" "10" "100" "1000" "10000" "10001"
> head(sort(unique(net$target)))
[1] "A2M" "A2ML1" "A4GALT" "AACS" "AANAT" "AAR2"
Is there any way to change the numeric row names into gene names? I also attach the output of another check, using the gene column:
> head(sort(unique((counts$gene))))
[1] "A1BG" "A1BG.AS1" "A2M" "A2M.AS1" "A4GALT" "AAAS"
> head(sort(unique(net$target)))
[1] "A2M" "A2ML1" "A4GALT" "AACS" "AANAT" "AAR2"
Indeed, for decoupleR to work, counts should be a numeric matrix with row names matching net$target. I suggest you set the row names on your data frame and convert it to a numeric matrix:
rownames(counts) <- counts$gene
counts <-
as.matrix(counts[,Filter(function(x){is.numeric(counts[[x]])}, colnames(counts))])
# you can check if the result is indeed a numeric matrix:
is.numeric(counts)
# [1] TRUE
is.matrix(counts)
# [1] TRUE
# and if the row names are correct:
length(intersect(rownames(counts), net$target))
# [1] 14223
If all looks fine, try the decoupleR call with this counts matrix.
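For reference, the same steps can be tried on a small self-contained example. This is a sketch under assumptions: counts_df, its values, and the targets vector are made up and only stand in for your counts and net$target.

```r
# Toy data frame standing in for `counts`; the values are made up
counts_df <- data.frame(
  gene    = c("A2M", "A4GALT", "AACS"),
  sample1 = c(10, 0, 5),
  sample2 = c(3, 7, 1),
  stringsAsFactors = FALSE
)

# Toy vector standing in for net$target
targets <- c("A2M", "A2ML1", "A4GALT", "AACS")

# Keep only the numeric columns and convert them to a matrix
num_cols   <- vapply(counts_df, is.numeric, logical(1))
counts_mat <- as.matrix(counts_df[, num_cols, drop = FALSE])

# Label the rows with the gene column
rownames(counts_mat) <- counts_df$gene

# Check the type and the overlap with the targets
is.numeric(counts_mat) && is.matrix(counts_mat)
shared <- intersect(rownames(counts_mat), targets)
length(shared)
```

Here the row names are set on the matrix rather than on the data frame. Data frame row names must be unique, so if the gene column contains duplicated symbols, setting them on the data frame fails, while a matrix accepts them. Duplicated genes would still need to be handled (for example aggregated) before the analysis.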
Dear deeenes,
Thank you for your enormous help - the issue is resolved now!
Regards, Jin Soo Joo
Hello, first of all, thank you so much for the excellent package.
I'm having a problem running the statistics functions, such as run_aucell, run_ulm or run_wmean. Some public GEO datasets work, but some do not. In those cases, the error below is shown.
If this is caused by a mismatch between counts and net, what could be a possible solution? Thank you for your time and help!