atakanekiz / CIPR-Package

Cluster Identity Predictor (R package implementation)

Error: Problem with `filter()` input `..1`. #1

Closed najeha13 closed 4 years ago

najeha13 commented 4 years ago

I used a Seurat object with 20K features, normalized and scaled. The markers were found using FindAllMarkers(). However, shortly after calling CIPR(), this error keeps appearing.

Error: Problem with `filter()` input `..1`.
x object 'cluster' not found
i Input `..1` is `cluster == i`.

After running rlang::last_error(), this is what I got:

<error/dplyr_error>
Problem with `filter()` input `..1`.
x object 'cluster' not found
i Input `..1` is `cluster == i`.
Backtrace:

  1. CIPR::CIPR(...)
    1. dplyr::filter(., cluster == i)
    2. dplyr:::filter_rows(.data, ...)
    3. base::tryCatch(...)
    4. base:::tryCatchList(expr, classes, parentenv, handlers)
    5. base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
    6. value[[3L]](cond)
    7. dplyr:::stop_dplyr(...)

The `Error: Problem with filter() input ..1.` part has been fixed; however, `'cluster' not found` keeps reappearing. I am pretty sure the FindAllMarkers() output has a column named "cluster".

atakanekiz commented 4 years ago

This sounds like a dplyr::filter() error, and it suggests that your input data frame has no column named cluster. Can you let me know what the input data frame looks like? Please provide a reproducible example; you can send me the input files you are working with.
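For anyone checking their own input, here is a minimal sketch of that column check in base R. The column names are the ones visible in the FindAllMarkers() output posted in this thread; the helper name `check_marker_columns` and the toy data frame are hypothetical and not part of CIPR.

```r
# Hypothetical helper (not part of CIPR): list the expected columns that are missing.
check_marker_columns <- function(dat, required = c("gene", "avg_logFC", "cluster")) {
  setdiff(required, colnames(dat))
}

# Toy stand-in for FindAllMarkers() output; the values are made up for illustration.
allmarkers_toy <- data.frame(
  avg_logFC = c(1.2, 0.8),
  cluster = factor(c(0, 1)),
  gene = c("CD8A", "IFNG")
)

# Stop early with a readable message instead of failing inside filter().
missing_cols <- check_marker_columns(allmarkers_toy)
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

An empty result means every expected column is present, so a later 'cluster' not found error would have to come from somewhere other than the column names.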

najeha13 commented 4 years ago

These are the first rows of the data frame I'm using as the input for CIPR(). I'm trying to run it with the basic call from your vignette.

CIPR(input_dat = allmarkers, comp_method = "logfc_dot_product", cluster = "All", reference = "hpca", plot_ind = F, plot_top = T, global_results_obj = T, global_plot_obj = T)

head(allmarkers)
                          p_val avg_logFC pct.1 pct.2    p_val_adj cluster            gene
ENSG00000237541.3  1.010723e-40 0.4564179 0.831 0.181 3.137487e-36       0 ENSG00000237541
ENSG00000019582.10 2.123443e-38 1.9576958 1.000 0.808 6.591592e-34       0 ENSG00000019582
ENSG00000146192.10 6.372628e-37 0.4277120 0.831 0.229 1.978191e-32       0 ENSG00000146192
ENSG00000204287.9  1.136165e-33 1.8584868 1.000 0.781 3.526883e-29       0 ENSG00000204287
ENSG00000165168.6  4.998621e-32 0.6656053 0.867 0.290 1.551672e-27       0 ENSG00000165168
ENSG00000223865.6  7.020494e-32 1.5746931 1.000 0.546 2.179302e-27       0 ENSG00000223865

atakanekiz commented 4 years ago

The function matches genes by gene symbol (CD8A, IFNG, etc.) rather than by Ensembl ID. Can you convert your Ensembl IDs to gene symbols and try again?
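As a rough illustration of that conversion, here is a minimal base-R sketch. It assumes you already have an ID-to-symbol lookup from an annotation resource; the two-entry `id_to_symbol` vector below is only a stand-in for such a table, and the variable names are hypothetical.

```r
# Illustrative lookup; in practice this would come from an annotation resource.
id_to_symbol <- c(
  ENSG00000019582 = "CD74",
  ENSG00000204287 = "HLA-DRA"
)

# Versioned IDs as they appear in the row names of the input posted above.
versioned_ids <- c("ENSG00000019582.10", "ENSG00000204287.9")

# Drop the ".<version>" suffix, then look each ID up by name.
bare_ids <- sub("\\.[0-9]+$", "", versioned_ids)
symbols <- unname(id_to_symbol[bare_ids])

print(symbols)
```

IDs that are absent from the lookup come back as NA, so it is worth counting those before passing the result on.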

najeha13 commented 4 years ago

Hey, so I did as you said, converted the IDs to gene symbols, and ran the function again. This is the new error:

CIPR(input_dat = allmarkers, comp_method = "logfc_dot_product", cluster = "All", reference = "hpca", plot_ind = F, plot_top = T, global_results_obj = T, global_plot_obj = T)
Preparing input data
Preparing reference data
Reading HCPA reference data
Analyzing cluster signatures
Preparing top plots
stat_bindot() using bins = 30. Pick better value with binwidth.
Error: Theme element cluster is not defined in the element hierarchy.
Run rlang::last_error() to see where the error occurred.

head(allmarkers)
               p_val avg_logFC pct.1 pct.2    p_val_adj cluster    gene
DCD     3.433888e-96 4.2158769 0.973 0.027 1.065947e-91       0     DCD
CDO1    3.854518e-94 0.9654832 0.932 0.018 1.196519e-89       0    CDO1
SULT1E1 4.350549e-93 2.1605894 0.905 0.016 1.350498e-88       0 SULT1E1
SLC30A8 1.146308e-88 1.5842234 0.905 0.025 3.558368e-84       0 SLC30A8
PNMT    2.361726e-84 0.7280091 0.838 0.016 7.331271e-80       0    PNMT
LMO3    3.900376e-72 1.0367486 0.878 0.054 1.210755e-67       0    LMO3

atakanekiz commented 4 years ago

I've never encountered this issue before, and I can't replicate it with the datasets I have. If you can send me your input data I will take a detailed look. You can email it to me at atakanekiz@gmail.com.

atakanekiz commented 4 years ago

The data I obtained from you works when I run it on two different computers. Since I haven't heard back from you about this and I can't replicate the problem, I will close the issue for now.

ghost commented 3 years ago

I think I might have hit the same issue. When I call CIPR(), the line

input_dat[, gene_column] <- tolower(input_dat[, gene_column])

goes wrong: it puts all the genes into a single cell, like this:

gene
c(\"rcan3\", \"sell\", \"gas5\", \"mal\", \"lef1\", \"il7r\", \"fyb1\", \"camk4\", \"tcf7\", \"ltb\"…
c(\"rcan3\", \"sell\", \"gas5\", \"mal\", \"lef1\", \"il7r\", \"fyb1\", \"camk4\", \"tcf7\", \"ltb\"…

So sel_clst and ref_dat cannot be merged as intended, and every result derived from them comes out empty.

I changed it to input_dat[[gene_column]] <- tolower(input_dat[[gene_column]]) and it worked.
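One plausible explanation, assuming input_dat was a tibble rather than a plain data.frame: on a tibble, `x[, j]` returns a one-column tibble instead of a vector, and tolower() then coerces that whole column into a single string. The sketch below reproduces the effect in base R by using `drop = FALSE` to mimic the tibble behaviour; the toy data are made up.

```r
# Toy marker table; a plain data.frame stands in for the real input.
input_dat <- data.frame(
  gene = c("RCAN3", "SELL", "GAS5"),
  cluster = c(0, 0, 1),
  stringsAsFactors = FALSE
)
gene_column <- "gene"

# drop = FALSE keeps the data-frame shape, which is what a tibble does by default.
one_col_df <- input_dat[, gene_column, drop = FALSE]

# tolower() runs as.character() on non-character input, which deparses the
# whole column into a single string.
collapsed <- tolower(one_col_df)

# [[ ]] always extracts the underlying vector, for data.frames and tibbles alike.
per_gene <- tolower(input_dat[[gene_column]])

print(collapsed)
print(per_gene)
```

Assigning the single collapsed string back to the column recycles it across every row, which would match the repeated `c(\"rcan3\", ...` cells shown above.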

BTW, thanks for offering such a good tool, I like it!