Closed Knight1995 closed 6 months ago
You should start from the warning messages. X.sub is probably shorter than K1[, 2].
Also, ind spans the column indices by default in big_apply(), but here you're using it for the rows.
Thanks for your quick reply! Actually, I have edited my code to calculate each row's result, but the result is still NA.
colmeans <- big_apply(X1, ind = rows_along(X), function(X, ind) {
  X.sub <- X[ind, 1]
  K1 <- map_dfr(unique(X[, 1]), function(i) {
    S1 <- mean(Y[which(X[, 1] == i), 1])
    data.frame(Value = S1, clu = i)
  })
  a <- K1[which(K1[, 2] == X.sub), 1]
  b <- min(K1[which(K1[, 2] != X.sub), 1])
  si = (b - a) / max(b, a)
  return(si)
}, a.combine = 'c')
Yes, cf. my first comment.
Sorry for bothering you again. When I test a single number (ind = 1), my code works. But when I put the code into big_apply, the results are NA. What is the problem? Does the R algorithm not work in big_apply? Thanks.
No, your code doesn't work when using ind <- 1.
It is just that X.sub is of length 1 and gets automatically recycled to match the size of K1[, 2].
Which is probably not what you want.
You need to think about what you are trying to achieve here.
If I had to guess, I would say that you need to subset K1[ind, 2].
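The recycling behaviour described above can be reproduced in plain R, without bigstatsr. The sketch below uses a made-up three-row K1 and made-up cluster labels (x_one, x_block are hypothetical names) to show why a length-1 value appears to work while a longer block yields NA. The match() lookup at the end is only one guess at the intended per-row lookup, not necessarily what the original code should do.

```r
# Toy stand-in for K1: one mean value per cluster label (values are made up).
K1 <- data.frame(Value = c(0.2, 0.5, 0.9), clu = c(1, 2, 3))

# Case 1: a single label. It is silently recycled against all rows of K1,
# so the comparison returns one logical per row and the lookup looks fine.
x_one <- 2
K1[which(K1[, 2] == x_one), 1]

# Case 2: a block of labels, as big_apply() would pass for several rows.
# The lengths differ (5 vs 3), so R recycles K1[, 2] and emits the
# "longer object length is not a multiple of shorter object length" warning.
x_block <- c(1, 2, 3, 1, 2)
hits <- which(K1[, 2] == x_block)  # indices run up to 5, past nrow(K1)
picked <- K1[hits, 1]              # rows 4 and 5 do not exist, so NA appears
min(picked)                        # NA propagates into the final result

# One possible per-row lookup (an assumption about the intent):
# match each label to its row in K1, giving one value per element of x_block.
per_row <- K1$Value[match(x_block, K1$clu)]
per_row
```

The NA in the second case comes from indexing past the last row of K1, which is consistent with the warning message mentioned at the top of the thread.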
Thanks for your reply. To find the problem, I tried a simple test, shown below. I thought it might be that I didn't pass one of the two variables, Y, so there was no result. But after I rewrote the code in your multivariate format (https://privefl.github.io/bigstatsr/articles/big-apply.html), there is still no output, which is very weird. Could you give me some suggestions? Thanks.
mean(Y[-ind, ]) is very odd (especially the minus). What are you trying to achieve here (in simple English)? summary(Y)?
'ind' is the row number: mean(Y[-ind, ]) removes that row from the matrix and computes the mean of the remaining matrix. summary(Y) gives the following.
I didn't get that Y was also an FBM. Then summary(Y[]).
You understand that ind is usually a vector of multiple indices, not just one, right?
And you want the full mean() of the matrix? Not something like the rowMeans()?
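To make the point about ind concrete, here is a small base-R sketch that only emulates block-wise processing; it is not bigstatsr's implementation, and the block size and the split() into blocks are assumptions made for illustration. It contrasts a function written as if ind were one row with a version that returns one leave-one-row-out mean per index in the block.

```r
# Small in-memory matrix standing in for Y (the real Y is an FBM).
Y <- matrix(1:12, nrow = 4)

# Emulate blocks of row indices; block_size is a hypothetical value.
block_size <- 2
rows <- seq_len(nrow(Y))
blocks <- split(rows, ceiling(rows / block_size))

# Written as if ind were a single row: Y[-ind, ] drops the WHOLE block,
# so each block yields one number instead of one number per row.
per_block <- unlist(lapply(blocks, function(ind) mean(Y[-ind, ])),
                    use.names = FALSE)

# Block-aware version: mean of the matrix with row i removed, for each i in ind.
# (total - sum of row i) / (number of remaining cells)
total <- sum(Y)
n_rest <- length(Y) - ncol(Y)
per_row <- unlist(lapply(blocks, function(ind) {
  (total - rowSums(Y[ind, , drop = FALSE])) / n_rest
}), use.names = FALSE)

length(per_block)  # one value per block
length(per_row)    # one value per row
```

With an FBM the same idea would apply inside the function passed to big_apply(), where the extracted block X[ind, ] plays the role of Y[ind, , drop = FALSE] here.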
Yes, I think I understand what you mean. I tested the simple example above to learn, step by step, how to rewrite the a.FUN in big_apply. My original R code is below. Because the matrix is too big and the code runs too slowly, I want to implement this with big_apply. cluster_info and dist are the original matrices; their row names and numbers of rows are the same.
K3 <- future_map_dfr(seq(ncol(cluster_info)), function(Y) {
  K2 <- map_dfr(seq(nrow(cluster_info)), function(index) {
    x <- cluster_info[, Y]
    dist2 <- as.data.frame(cbind(x, dist))[-index, ]
    K1 <- map_dfr(unique(x), function(i) {
      d <- mean(dist2[which(dist2$x == i), index + 1])
      #d<-sum(dist2[ which(dist2$x==i),index+1])/length( which(dist2$x==i))
      data.frame(Value = d, clu = i)
    })
    si <- (min(K1[K1$clu != x[index], ]$Value) - K1[K1$clu == x[index], ]$Value) / max(min(K1[K1$clu != x[index], ]$Value), K1[K1$clu == x[index], ]$Value)
    if (is.na(si)) {
      data.frame(cluster = x[index], sil_width = 0)
    } else {
      data.frame(cluster = x[index], sil_width = si)
    }
  })
  data.frame(Resolution = colnames(cluster_info)[Y], silhouette_score = mean(K2$sil_width))
})
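Reading the nested loops above, the quantity computed per row looks like a silhouette width: a is the mean distance to the other members of the point's own cluster, b is the smallest mean distance to any other cluster, and the result is (b - a) / max(a, b), with NA replaced by 0. Under that reading, a vectorised base-R sketch could look like the following. The function name silhouette_widths and the arguments d and cl are hypothetical, and the sketch assumes d is an ordinary in-memory square matrix, not an FBM.

```r
# Sketch: per-point silhouette-like width from a square distance matrix d
# and a vector of cluster labels cl (one label per row of d).
silhouette_widths <- function(d, cl) {
  d <- as.matrix(d)
  cl <- as.character(cl)
  n <- nrow(d)

  # sums[k, j] = total distance from the members of cluster k to point j
  # (column j of d, as in the original dist2[, index + 1] lookup).
  sums <- rowsum(d, group = cl)
  sizes <- as.vector(table(cl)[rownames(sums)])
  cnt <- matrix(sizes, nrow = length(sizes), ncol = n)

  # Exclude each point from its own cluster (the original drops row `index`).
  idx <- cbind(match(cl, rownames(sums)), seq_len(n))
  sums[idx] <- sums[idx] - diag(d)
  cnt[idx] <- cnt[idx] - 1

  means <- sums / cnt          # NaN for a singleton cluster (0 / 0)
  a <- means[idx]              # mean distance within own cluster
  means[idx] <- Inf            # mask own cluster before taking the minimum
  b <- apply(means, 2, min)    # closest other cluster

  s <- (b - a) / pmax(a, b)
  s[is.na(s)] <- 0             # same convention as the original code
  unname(s)
}

# Toy data: four points on a line, two well-separated clusters.
d_toy <- as.matrix(dist(c(0, 1, 10, 11)))
s_toy <- silhouette_widths(d_toy, c(1, 1, 2, 2))
s_toy
```

If this reading is right, the score for one column of cluster_info would be mean(silhouette_widths(dist, cluster_info[, j])), which replaces the two inner loops with a handful of matrix operations.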
I don't get what you're trying to achieve here; sorry I cannot help.
Thanks for the great work! I am trying to do some simple calculations on an FBM object. I want to get a result for each of the 3774 rows, but the result is NA. Could you please tell me what the problem is? Thanks!