raymondlouie / MiniMarS

Error with xgBoost when using scRNA-seq data #36

Closed dhrutiparikh closed 1 year ago

dhrutiparikh commented 1 year ago

I was running MiniMarS on an scRNA-seq dataset with 2000 variable features and only three cell types. I could get a list of markers using sc2marker, geneBasis and citeFuse, but I got the following error when using xgBoost.

xgBoost

    list_markers_time = findClusterMarkers(final_out$training_matrix,
                                           final_out$training_clusters,
                                           num_markers = 15,
                                           method = "xgBoost",
                                           verbose = TRUE)

    Methods used in this analysis: xgBoost
    Caclulating markers using xgBoost.
    Error in order(fstat, decreasing = T) : unimplemented type 'list' in 'orderVector1'

Thanks!

anglixue commented 1 year ago

Hi Dhruti,

It seems fstat is expected to be a vector, but here it is a list. Did this happen in one iteration or in all scenarios?
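One way a list can arise here, offered as a hedged illustration rather than a diagnosis of this dataset: apply() simplifies its result to a vector only when every call returns a value of the same length. If na.omit() drops everything for some column, the lengths differ, apply() returns a list, and order() cannot sort it. The toy matrix m below is made up for the example.

```r
# Toy matrix (hypothetical data): column "b" is entirely NA
m <- cbind(a = c(1, 2, 3), b = c(NA, NA, NA))

# The two calls return different lengths (3 vs 0), so apply() cannot
# simplify the result to a vector and returns a list instead
res <- apply(m, 2, function(x) na.omit(x))
is.list(res)

# order() on a list fails; capture the message instead of stopping
msg <- tryCatch(order(res, decreasing = TRUE),
                error = function(e) conditionMessage(e))
msg
```

Checking lengths(res) on such a result shows which columns returned something other than a single value.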

Could you run the following code to check the class of fstat? You need to input two variables here: input_matrix and clusters.

    unique_clusters = unique(clusters)
    num_clust= length(unique_clusters)
    label <- 0:(num_clust-1)
    names(unique_clusters) = label
    clusters_newlabel = unlist(lapply(clusters,
                                      function (x) as.numeric(names(unique_clusters)[which(as.character(unique_clusters) %in% x)])))

    # convert features to numbers, because xgb.importance seems to have trouble with greek letters
    marker_num = 1:dim(input_matrix)[2]
    names(marker_num) = colnames(input_matrix)
    colnames(input_matrix) = marker_num

    fstat=apply(input_matrix,2,function (x) na.omit(anova(aov(x~as.factor(clusters)))$"F value"))

dhrutiparikh commented 1 year ago

Thanks, Angli! It happens when I run the code from the GitHub repository with no changes to the parameters.

I get the following error:

    fstat=apply(input_matrix,2,function (x) na.omit(anova(aov(x~as.factor(clusters)))$"F value"))

    Error in h(simpleError(msg, call)) :
      error in evaluating the argument 'object' in selecting a method for function 'na.omit': variable lengths differ (found for 'as.factor(clusters)')
    14. h(simpleError(msg, call))
    13. .handleSimpleError(function (cond) .Internal(C_tryCatchHelper(addr, 1L, cond)), "variable lengths differ (found for 'as.factor(clusters)')", base::quote(model.frame.default(formula = x ~ as.factor(clusters), drop.unused.levels = TRUE)))
    12. model.frame.default(formula = x ~ as.factor(clusters), drop.unused.levels = TRUE)
    11. stats::model.frame(formula = x ~ as.factor(clusters), drop.unused.levels = TRUE)
    10. eval(mf, parent.frame())
    9. eval(mf, parent.frame())
    8. stats::lm(formula = x ~ as.factor(clusters), singular.ok = TRUE)
    7. eval(lmcall, parent.frame())
    6. eval(lmcall, parent.frame())
    5. aov(x ~ as.factor(clusters))
    4. anova(aov(x ~ as.factor(clusters)))
    3. na.omit(anova(aov(x ~ as.factor(clusters)))$"F value")
    2. FUN(newX[, i], ...)
    1. apply(input_matrix, 2, function(x) na.omit(anova(aov(x ~ as.factor(clusters)))$"F value"))
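The message "variable lengths differ" indicates that each column x of input_matrix has a different length than clusters. A minimal sketch of a pre-check follows, assuming input_matrix is cells x features and clusters holds one label per cell. The toy objects and the helper name check_inputs are made up for illustration and are not part of MiniMarS.

```r
# Hypothetical helper: verify there is one cluster label per row (cell)
check_inputs <- function(input_matrix, clusters) {
  if (nrow(input_matrix) != length(clusters)) {
    stop(sprintf("input_matrix has %d rows but clusters has %d labels",
                 nrow(input_matrix), length(clusters)))
  }
  invisible(TRUE)
}

# Toy data (assumption): 6 cells x 2 features, three cell types
set.seed(1)
input_matrix <- matrix(rnorm(12), nrow = 6,
                       dimnames = list(NULL, c("g1", "g2")))
clusters <- rep(c("A", "B", "C"), each = 2)

check_inputs(input_matrix, clusters)

# With matching lengths, the F-statistic call from the thread returns
# one value per column, so apply() simplifies to a plain numeric vector
fstat <- apply(input_matrix, 2, function(x)
  na.omit(anova(aov(x ~ as.factor(clusters)))$"F value"))
class(fstat)
```

If the check fails, one thing worth ruling out is that the matrix is stored as features x cells, in which case t(input_matrix) would be needed before running the diagnostic.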

dhrutiparikh commented 1 year ago

Sorry @raymondlouie & @anglixue, I just installed the Dev version instead of the main version, and there is no error anymore.