HelenaLC / CATALYST

Cytometry dATa anALYsis Tools

Error in prepData(fs, panel, md, features = panel$fcs_colname) : all(unlist(md_cols) %in% names(md)) is not TRUE #363

Closed CalmRealistic closed 11 months ago

CalmRealistic commented 12 months ago

Hi, I am getting an error when I reach prepData() and could not find a solution by searching. This is the error:

Error in prepData(fs, panel, md, features = panel$fcs_colname) : all(unlist(md_cols) %in% names(md)) is not TRUE

The FCS files have the same names as those listed in the md, nothing bizarre there.

This is my code

library(CATALYST)
library(cowplot)
library(flowCore)
library(diffcyt)
library(scater)
library(SingleCellExperiment)
library(readxl)

panel <- read_excel("Catalyst/ panel.xlsx")
md <- read_excel("Catalyst/sample.data.xlsx")
View(md)

fs <- read.flowSet(files = fcs_files, truncate_max_range = FALSE, ignore.text.offset=TRUE)

# construct SingleCellExperiment

sce <- prepData(fs, panel, md, features = panel$fcs_colname)



This is the session info

R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.6.7

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] readxl_1.4.3                scater_1.24.0               ggplot2_3.4.2              
 [4] scuttle_1.6.3               diffcyt_1.16.0              flowCore_2.8.0             
 [7] cowplot_1.1.1               CATALYST_1.20.1             SingleCellExperiment_1.18.1
[10] SummarizedExperiment_1.26.1 Biobase_2.56.0              GenomicRanges_1.48.0       
[13] GenomeInfoDb_1.32.4         IRanges_2.30.1              S4Vectors_0.34.0           
[16] BiocGenerics_0.42.0         MatrixGenerics_1.8.1        matrixStats_1.0.0          

loaded via a namespace (and not attached):
  [1] backports_1.4.1             circlize_0.4.15             drc_3.0-1                  
  [4] plyr_1.8.8                  igraph_1.5.0.1              ConsensusClusterPlus_1.60.0
  [7] splines_4.2.1               BiocParallel_1.30.4         scattermore_1.2            
 [10] TH.data_1.1-2               digest_0.6.33               foreach_1.5.2              
 [13] htmltools_0.5.5             viridis_0.6.4               fansi_1.0.4                
 [16] magrittr_2.0.3              ScaledMatrix_1.4.1          CytoML_2.8.1               
 [19] cluster_2.1.4               doParallel_1.0.17           limma_3.52.4               
 [22] aws.signature_0.6.0         ComplexHeatmap_2.12.1       RcppParallel_5.1.7         
 [25] sandwich_3.0-2              flowWorkspace_4.8.0         cytolib_2.8.0              
 [28] jpeg_0.1-10                 colorspace_2.1-0            ggrepel_0.9.3              
 [31] xfun_0.40                   dplyr_1.1.2                 crayon_1.5.2               
 [34] RCurl_1.98-1.12             jsonlite_1.8.7              hexbin_1.28.3              
 [37] graph_1.74.0                lme4_1.1-34                 survival_3.5-5             
 [40] zoo_1.8-12                  iterators_1.0.14            glue_1.6.2                 
 [43] polyclip_1.10-4             gtable_0.3.3                nnls_1.4                   
 [46] zlibbioc_1.42.0             XVector_0.36.0              GetoptLong_1.0.5           
 [49] DelayedArray_0.22.0         ggcyto_1.24.1               BiocSingular_1.12.0        
 [52] car_3.1-2                   Rgraphviz_2.40.0            shape_1.4.6                
 [55] abind_1.4-5                 scales_1.2.1                pheatmap_1.0.12            
 [58] mvtnorm_1.2-2               edgeR_3.38.4                DBI_1.1.3                  
 [61] rstatix_0.7.2               Rcpp_1.0.11                 plotrix_3.8-2              
 [64] viridisLite_0.4.2           clue_0.3-64                 rsvd_1.0.5                 
 [67] FlowSOM_2.4.0               httr_1.4.6                  RColorBrewer_1.1-3         
 [70] pkgconfig_2.0.3             XML_3.99-0.14               farver_2.1.1               
 [73] deldir_1.0-9                locfit_1.5-9.8              utf8_1.2.3                 
 [76] tidyselect_1.2.0            rlang_1.1.1                 reshape2_1.4.4             
 [79] cellranger_1.1.0            munsell_0.5.0               tools_4.2.1                
 [82] cli_3.6.1                   generics_0.1.3              broom_1.0.5                
 [85] ggridges_0.5.4              aws.s3_0.3.21               evaluate_0.21              
 [88] stringr_1.5.0               fastmap_1.1.1               yaml_2.3.7                 
 [91] knitr_1.43                  purrr_1.0.2                 nlme_3.1-162               
 [94] sparseMatrixStats_1.8.0     RBGL_1.72.0                 xml2_1.3.5                 
 [97] compiler_4.2.1              rstudioapi_0.15.0           beeswarm_0.4.0             
[100] curl_5.0.1                  png_0.1-8                   ggsignif_0.6.4             
[103] tibble_3.2.1                tweenr_2.0.2                stringi_1.7.12             
[106] lattice_0.21-8              Matrix_1.5-3                nloptr_2.0.3               
[109] vctrs_0.6.3                 pillar_1.9.0                lifecycle_1.0.3            
[112] GlobalOptions_0.1.2         BiocNeighbors_1.14.0        irlba_2.3.5.1              
[115] data.table_1.14.8           bitops_1.0-7                colorRamps_2.3.1           
[118] R6_2.5.1                    latticeExtra_0.6-30         gridExtra_2.3              
[121] RProtoBufLib_2.8.0          vipor_0.4.5                 codetools_0.2-19           
[124] boot_1.3-28.1               MASS_7.3-60                 gtools_3.9.4               
[127] rjson_0.2.21                withr_2.5.0                 multcomp_1.4-25            
[130] GenomeInfoDbData_1.2.8      parallel_4.2.1              ncdfFlow_2.42.1            
[133] beachmat_2.12.0             grid_4.2.1                  minqa_1.2.5                
[136] tidyr_1.3.0                 ggpointdensity_0.1.0        DelayedMatrixStats_1.18.2  
[139] rmarkdown_2.23              carData_3.0-5               Rtsne_0.16                 
[142] ggpubr_0.6.0                ggnewscale_0.4.9            ggforce_0.4.1              
[145] base64enc_0.1-3             ggbeeswarm_0.7.2            interp_1.1-4   

Thank you

CalmRealistic commented 12 months ago

I tried this, but it did not work

md_cols <- list(file = "Filename", factors = c("Response", "Batch"))
ids0 <- md[[md_cols$file]]
ids1 <- fsApply(fs, identifier)
ids2 <- keyword(fs, "FILENAME")
if (length(unlist(ids2)) == length(fs))
    ids2 <- basename(ids2)
check1 <- all(ids1 %in% ids0)
check2 <- all(ids2 %in% ids0)
ids_use <- which(c(check1, check2))[1]
ids <- list(ids1, ids2)[[ids_use]]
if (is.null(ids)) {
    stop("Couldn't match 'flowSet'/FCS filenames\n",
        "with those listed in 'md[[md_cols$file]]'.")
} else {
    # reorder 'flowSet' frames according to metadata table
    fs <- fs[match(md[[md_cols$file]], ids)]
}

HelenaLC commented 12 months ago

The error says “all(unlist(md_cols) %in% names(md)) is not TRUE”, so could you please post names(md) and how you're running prepData() when the error occurs? The check that fails is independent of the FCS files.

CalmRealistic commented 12 months ago

This is what names(md) gives:

[1] "Filename" "Group" "Number" "Response" "Batch"

And I am sorry, I did not understand what you meant by "how I am running prepData()".

HelenaLC commented 11 months ago

Yeah, so the names(md) don't match any of the defaults of the md_cols argument in prepData(). Have a look at ?prepData and adapt the md_cols argument to match your md table, and the error should be resolved. Alternatively, rename your md columns to match the defaults.
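
As a rough illustration of the check behind this error, the sketch below reproduces it in base R with the column names posted above. The default values are the ones listed for md_cols in ?prepData (worth confirming against your installed version). Treating "Number" as the sample identifier is only an assumption about this table, and the object names are hypothetical.

```r
# Column names of the metadata table as reported in this thread.
md_names <- c("Filename", "Group", "Number", "Response", "Batch")

# Defaults of the md_cols argument as listed in ?prepData
# (please double-check against your CATALYST version).
md_cols_default <- list(
    file = "file_name",
    id = "sample_id",
    factors = c("condition", "patient_id"))

# The failing assertion boils down to this test; none of the
# default names occur in md_names, so it evaluates to FALSE.
default_ok <- all(unlist(md_cols_default) %in% md_names)

# Assumption: "Number" identifies the sample; adjust to your data.
md_cols_custom <- list(
    file = "Filename",
    id = "Number",
    factors = c("Response", "Batch"))

# List any requested columns that are absent from the table.
missing_cols <- setdiff(unlist(md_cols_custom), md_names)
custom_ok <- length(missing_cols) == 0

# With the check passing, the call would look like this (not run here):
# sce <- prepData(fs, panel, md,
#     features = panel$fcs_colname,
#     md_cols = md_cols_custom)
```

Running the setdiff() line on your own table before calling prepData() shows which names are the problem, which the assertion message alone does not.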

CalmRealistic commented 11 months ago

I changed my Excel file and renamed some columns. Now I have this:

names(md)
[1] "file_name" "Group" "sample_id" "Condition" "Batch"

I would also like to include some additional factors, but my code is wrong:

factors <- list(factors = c("condition", "Group", "Batch"))
sce <- prepData(fs, panel, md, features = panel$fcs_colname, md_cols = factors)



What should it be?

HelenaLC commented 11 months ago

…Capital letter “Condition”?
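
Column names in R are case-sensitive, so "condition" and "Condition" count as different names. A minimal sketch of how such a mismatch could be spotted, using the names posted above (the helper objects are hypothetical and not part of CATALYST):

```r
# Column names of the renamed metadata table from this thread.
md_names <- c("file_name", "Group", "sample_id", "Condition", "Batch")

# Factors as requested in the failing call.
requested <- c("condition", "Group", "Batch")

# Requested names that are not present exactly as written.
missing_cols <- setdiff(requested, md_names)

# For each missing name, look for a column that differs only by case.
hits <- md_names[match(tolower(missing_cols), tolower(md_names))]
suggestions <- setNames(hits, missing_cols)

# Corrected specification, spelled exactly as in names(md).
md_cols_fixed <- list(
    file = "file_name",
    id = "sample_id",
    factors = c("Condition", "Group", "Batch"))
fixed_ok <- all(unlist(md_cols_fixed) %in% md_names)

# The corrected call would then be (not run here):
# sce <- prepData(fs, panel, md,
#     features = panel$fcs_colname,
#     md_cols = md_cols_fixed)
```

Here file and id are passed explicitly rather than left out, so the sketch does not depend on how prepData() handles a partial md_cols list.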

CalmRealistic commented 11 months ago

Thank you, I guess the capital letter worked, because R ran for a really long time. I will do the rest on my work computer, since the one I have has crashed twice already.