drieslab / Giotto

Spatial omics analysis toolbox
https://drieslab.github.io/Giotto_website/

Identifying DEGs between two selected sets of spots after integration #187

Open moutazhelal opened 2 years ago

moutazhelal commented 2 years ago

Dear Giotto team,

I am trying to identify DEGs between two selected sets of spots after integrating two Giotto objects. I selected the spots that belong to the same Leiden clusters but are located in two different capture areas, using this:

pData = pDataDT(combo_LN)
pData_NDLN = pData[list_ID  == "NDLN"]
pData_TDLN = pData[list_ID  == "TDLN"]
Tcell_NDLN = pData_NDLN[leiden_harmony %in% c("5", "6", "7", "8")]
Tcell_TDLN = pData_TDLN[leiden_harmony %in% c("5", "6", "7", "8")]

and ran this function:

Scran_TcellsTDLN_vs_NDLN = findMarkers(combo_LN,expression_values = "normalized",
  cluster_column = 'leiden_clus',method = "scran",
  group_1 = Tcell_TDLN$cell_ID,
  group_2 = Tcell_NDLN$cell_ID,
  min_feats = 4)

and got this error

Error in .setup_groups(groups, x, restrict = restrict, exclude = exclude) : need at least two unique levels in 'groups'

findMarkers needs to compare two groups that are found in cluster_column. Would it be possible to compare two groups based on spot ID?

Best, Moutaz

RubD commented 2 years ago

Hi @moutazhelal, I believe that if you change the cluster_column parameter to 'cell_ID', it will work the way you want. Right now it is looking for the cell IDs in the 'leiden_clus' column to form the groups, but since you have defined your own groups by cell ID, you need to point it at that column instead. Let me know if it doesn't work.

moutazhelal commented 2 years ago

Hi @RubD , Thank you very much for your help and responsiveness.

I tried what you suggested

Scran_TcellsTDLN_vs_NDLN = findMarkers(combo_LN,expression_values = "normalized",
                                       cluster_column = 'cell_ID',method = "scran",
                                       group_1 = Tcell_TDLN$cell_ID,
                                       group_2 = Tcell_NDLN$cell_ID,
                                       min_feats = 4)

I ended up with this error

Error in do.call(data.frame, c(df_list, list(row.names = row.names, check.names = !optional, : variable names are limited to 10000 bytes.

I have also tried adding a column to the metadata that combines the location with the cluster, so that I can define my comparison directly, as follows:
combo_LN@cell_metadata[["cell"]][["rna"]]$loc_Clus =as.factor( paste(combo_LN@cell_metadata[["cell"]][["rna"]]$list_ID, "_", combo_LN@cell_metadata[["cell"]][["rna"]]$leiden_harmony))

but when I ran

Scran_TcellsTDLN_vs_NDLN = findMarkers(combo_LN,expression_values = "normalized",
  cluster_column = 'loc_Clus',method = "scran",
  group_1 = "TDLN_5", group_2 = "NDLN_7",min_feats = 4)

I had the same error as the first time

Error in .setup_groups(groups, x, restrict = restrict, exclude = exclude) : need at least two unique levels in 'groups'

Best, Moutaz
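
A side note on the loc_Clus attempt above: in base R, paste() separates its arguments with a single space by default, so the labels it builds contain spaces around the underscore. The sketch below uses toy vectors as stand-ins for the list_ID and leiden_harmony columns (they are assumptions, not the poster's data). Whether this mismatch is what triggered the error in this thread is not confirmed; it is only one thing worth checking.

```r
# Toy stand-ins for the list_ID and leiden_harmony metadata columns (assumed values).
list_ID <- c("TDLN", "NDLN")
leiden_harmony <- c("5", "7")

# paste() inserts a space between its arguments unless sep is set.
with_spaces <- paste(list_ID, "_", leiden_harmony)

# paste0() joins the pieces with no separator at all.
no_spaces <- paste0(list_ID, "_", leiden_harmony)

print(with_spaces)  # "TDLN _ 5" "NDLN _ 7"
print(no_spaces)    # "TDLN_5" "NDLN_7"

# A label such as "TDLN_5" only matches the paste0() version.
label_found_with_spaces <- "TDLN_5" %in% with_spaces
label_found_no_spaces <- "TDLN_5" %in% no_spaces
```

If the labels do contain spaces, group_1 = "TDLN_5" would not match any level of the new column, so building the column with paste0() or paste(..., sep = "_") is a cheap thing to rule out.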

RubD commented 2 years ago

Hi @moutazhelal, I took a look and I believe this is now fixed and doable. You will need to specify the group_1_name and group_2_name parameters.

Previously, the group names were built automatically from the values in the cluster column. This isn't a big problem when you select based on Leiden clusters and the like, but when you pass thousands of individual cell IDs, the combined group name string becomes too long.
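
To illustrate the length problem described above, here is a minimal base R sketch. The spot IDs are hypothetical, and joining them with paste(collapse = "_") is only an assumption about how a combined name might be assembled, not Giotto's actual internals. The commented call at the end reuses the poster's object names and the group_1_name and group_2_name parameters mentioned in this comment; it has not been run here.

```r
# Hypothetical spot IDs standing in for Tcell_TDLN$cell_ID.
cell_ids <- sprintf("spot_%05d", 1:2000)

# A label built by concatenating every ID grows with the size of the group.
auto_name <- paste(cell_ids, collapse = "_")
auto_name_bytes <- nchar(auto_name, type = "bytes")

# R limits variable names to 10000 bytes, which matches the reported error.
too_long <- auto_name_bytes > 10000

# Short, explicit labels stay far below that limit.
group_1_name <- "Tcell_TDLN"
group_2_name <- "Tcell_NDLN"
short_enough <- nchar(group_1_name, type = "bytes") < 10000

# Sketch of the corresponding call (needs the poster's combo_LN object):
# findMarkers(combo_LN, expression_values = "normalized",
#             cluster_column = "cell_ID", method = "scran",
#             group_1 = Tcell_TDLN$cell_ID, group_1_name = "Tcell_TDLN",
#             group_2 = Tcell_NDLN$cell_ID, group_2_name = "Tcell_NDLN",
#             min_feats = 4)
```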