neurorestore / Libra

MIT License

Error in `group_by()`: Column `cell_type` is not found. #14

Closed sshakil12 closed 2 years ago

sshakil12 commented 2 years ago

Hello!

When I run run_de() with my own data, I get the error message:

"Error in group_by(): ! Must group by variables found in .data. x Column cell_type is not found."

The input is a Seurat object, and its metadata includes replicate, label, and cell_type columns (approx. 10 replicates, 2 labels, and 1 cell type). I noticed other people had the same issue. I would be happy to provide a small sample of my data to figure this out. Thank you.

Best, S.

jordansquair commented 2 years ago

if you can provide a small sample of the data that would be great.

sshakil12 commented 2 years ago

What would be the best way to provide the sample?

jordansquair commented 2 years ago

if you can send a link/attachment to my email (see my account page). thanks!

sshakil12 commented 2 years ago

Just sent it. Thank you!

jordansquair commented 2 years ago

So the problem is that your default assay is not set to your scRNAseq data within your Seurat object. Should be careful with this as we don't explicitly call the RNA assay (since sometimes people name it something else).

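library(Seurat)
library(Libra)

# Reproduce the error: the default assay is ADT, not the scRNA-seq counts
sc = readRDS("~/Downloads/sample.rds")
DefaultAssay(sc)
# [1] "ADT"
de = run_de(sc)
# [1] "macrophages"
# Error: Must group by variables found in .data.

# Option 1: set the default assay to RNA before calling run_de()
DefaultAssay(sc) = 'RNA'
de = run_de(sc)

# Option 2: pass the RNA count matrix and the metadata explicitly
expr = GetAssayData(sc, slot = 'counts', assay = 'RNA')
meta = sc@meta.data
de = run_de(expr, meta = meta)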

sshakil12 commented 2 years ago

Problem fixed. Once again, thank you!

Xicici-Yan commented 1 year ago

Hi @jordansquair, I set the default assay to 'RNA', but something is still wrong with the code. My Seurat object looks like this: [image]. When I run

DefaultAssay(seuItg.dft) = 'RNA'

DE = run_de(
  seuItg.dft,
  n_threads = 8
)

I got this error: [image]

I also tried extracting the metadata and counts directly:

meta<-seuItg.dft@meta.data
expr<-GetAssayData(object = seuItg.dft,slot = "counts" ,assay = "RNA")

It still does not work. Should I add "{{}}" around cell_type so that dplyr::group_by can find it in meta?

  DE %<>%
    # calculate adjusted p values
    //group_by({{cell_type}}) %>%
    mutate(p_val_adj = p.adjust(p_val, method = 'BH')) %>%
    # make sure gene is a character not a factor
    mutate(gene = as.character(gene)) %>%
    # invert logFC to match Seurat level coding
    mutate(avg_logFC = avg_logFC * -1) %>%
    dplyr::select(cell_type,
                  gene,
                  avg_logFC,
                  p_val,
                  p_val_adj,
                  de_family,
                  de_method,
                  de_type
    ) %>%
    ungroup() %>%
    arrange(cell_type, gene)

jordansquair commented 1 year ago

This is a generic error, most likely caused by an issue in your data structure; it is not a bug in Libra. I'd be happy to check, but please send a subset of your data (e.g., all cells with just 100 genes) to my email. Thanks.

Xicici-Yan commented 1 year ago

I have sent test data. Thanks for your kind help.

HelloWorldLTY commented 1 year ago

Hi, I resolved this error by setting the cell type column in the metadata to the one we wanted. It also seems that each cell type needs to be present in both the HC and diseased cases.
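
Both conditions raised in this thread (the expected columns exist in the metadata, and each cell type occurs under every label) can be checked before calling run_de(). Below is a minimal base-R sketch, not part of Libra. It assumes the column names mentioned earlier in the thread (replicate, label, cell_type); the toy `meta` data frame and its values are hypothetical stand-ins for `sc@meta.data`.

```r
# Toy metadata standing in for sc@meta.data (hypothetical values).
meta <- data.frame(
  replicate = c("r1", "r2", "r3", "r4"),
  label     = c("HC", "HC", "disease", "disease"),
  cell_type = c("macrophages", "T cells", "macrophages", "macrophages"),
  stringsAsFactors = FALSE
)

# 1. Which of the expected columns are absent from the metadata?
required <- c("replicate", "label", "cell_type")
missing_cols <- setdiff(required, colnames(meta))

# 2. Which cell types have at least one cell under every label?
counts <- table(meta$cell_type, meta$label)
in_all_labels <- rownames(counts)[rowSums(counts > 0) == ncol(counts)]

print(missing_cols)
print(in_all_labels)
```

With real data, replace the toy data frame with `meta <- sc@meta.data`. A non-empty `missing_cols`, or a cell type of interest absent from `in_all_labels`, would be worth fixing before running the differential expression step.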