HelenaLC / CATALYST

Cytometry dATa anALYsis Tools

Other causes of "all(unlist(md_cols) %in% names(md)) is not TRUE" error? #304

Closed 83years closed 1 year ago

83years commented 1 year ago

Hi Helena,

I have been running the pipeline successfully for the last year with no issues whatsoever. Typically I copy an old script into a new folder and then update the md.xlsx and panel.xlsx files, so that I have a record of old scripts and the changes I make. However, today I ran the pipeline on a re-exported dataset and the all(unlist(md_cols) %in% names(md)) is not TRUE error appeared.

I checked and double-checked the file names and they are correct. I then used the suggested solution to force the correct names:

# provided that the sample order in the metadata table 
# is identical to the sample order in the `flowSet`:
for (i in seq_along(fs)) {
  # replace FILENAME & identifier to be consistent and make double-sure
  description(fs[[i]])$FILENAME <- identifier(fs[[i]]) <- md$file_name[i]
}

keyword(fs, "FILENAME") %in% md$file_name reports all TRUE

Older versions of the scripts all run correctly. Where else can this error come from?
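Read literally, the failing expression compares the column names listed in `md_cols` against `names(md)`, so one thing worth ruling out is a renamed or re-cased column in md.xlsx. The following is a minimal sketch of that check in base R, not CATALYST's own code. The `md` table is a hypothetical stand-in, and the `md_cols` values are assumed to match what is passed to `prepData()`.

```r
# Hypothetical stand-in for the metadata table read from md.xlsx.
md <- data.frame(
  file_name = c("a.fcs", "b.fcs"),
  sample_id = c("s1", "s2"),
  Condition = c("ctrl", "stim"),  # note the capital "C"
  patient_id = c("p1", "p2"),
  stringsAsFactors = FALSE
)

# Assumed to mirror the md_cols argument passed to prepData().
md_cols <- list(
  file = "file_name",
  id = "sample_id",
  factors = c("condition", "patient_id")
)

# Column names requested in md_cols that are absent from md.
# An empty result means this particular check would pass.
missing_cols <- setdiff(unlist(md_cols, use.names = FALSE), names(md))
print(missing_cols)
```

Any name printed here is requested in `md_cols` but has no exact, case-sensitive match among the columns of `md`.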

session info

R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_3.3.6               cytofWorkflow_1.18.0        cowplot_1.1.1               uwot_0.1.11                
 [5] Matrix_1.4-0                HDCytoData_1.14.0           flowCore_2.6.0              ExperimentHub_2.2.1        
 [9] AnnotationHub_3.2.2         BiocFileCache_2.2.1         dbplyr_2.1.1                diffcyt_1.14.0             
[13] CATALYST_1.18.1             SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0 Biobase_2.54.0             
[17] GenomicRanges_1.46.1        GenomeInfoDb_1.30.1         IRanges_2.28.0              S4Vectors_0.32.4           
[21] BiocGenerics_0.40.0         MatrixGenerics_1.6.0        matrixStats_0.62.0          readxl_1.4.0               
[25] knitr_1.39                  BiocStyle_2.22.0        

83years commented 1 year ago

After some more troubleshooting it turns out that the issue was with a change in the order of the markers in the Panel.xlsx file. For some reason FlowJo decided to output them differently this time.

I am still not sure why this caused the above error but it's fixed so closing out the ticket.
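
For anyone who lands here after a re-export, a quick way to spot a reordered panel is to compare the channel column of panel.xlsx with the channel names in the data. This is a minimal sketch in base R with hypothetical stand-in values. `fcs_channels` plays the role of `colnames(fs)`, and `fcs_colname` is assumed to be the name of the channel column in the panel. It does not explain why the reordering triggered the error above.

```r
# Hypothetical stand-ins: channel names as stored in the FCS files
# (what colnames(fs) would return) and the contents of panel.xlsx.
fcs_channels <- c("Time", "Ce140Di", "Nd142Di", "Sm147Di")
panel <- data.frame(
  fcs_colname = c("Sm147Di", "Ce140Di", "Nd142Di"),
  antigen = c("CD20", "beads", "CD19"),
  stringsAsFactors = FALSE
)

# Channels listed in the panel but absent from the data.
not_in_data <- setdiff(panel$fcs_colname, fcs_channels)

# Position of each panel row in the data; NA would mark a missing channel.
idx <- match(panel$fcs_colname, fcs_channels)

# TRUE only if every panel channel exists and the rows follow the data order.
same_order <- !anyNA(idx) && !is.unsorted(idx)

# Panel rows rearranged to follow the channel order of the data.
panel_sorted <- panel[order(idx), ]

print(not_in_data)
print(same_order)
print(panel_sorted)
```

If `same_order` is `FALSE` while `not_in_data` is empty, the panel lists the right channels in a different order, which matches the situation described above.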