Closed kvshams closed 1 year ago
Hi Shams,
My apologies for the late response. Yes, you can use an entire pathway database like the C2 bundle from MSigDB. The important thing is that you format the gene set annotation dictionary correctly.
The dictionary has to include all cell types from your adata cell type annotations as keys. Since most databases will not tell you which cell types their gene sets are specific to, you will have to either 1) annotate the cell types yourself or 2) set all gene sets as global. Both approaches should be fine; you can check empirically which works for you.
gene_set_dictionary = {'celltype_1':{'gene_set_1':['gene_a', 'gene_b', 'gene_c'], 'gene_set_2':['gene_c','gene_a','gene_e','gene_f']},
'celltype_2':{'gene_set_1':['gene_a', 'gene_b', 'gene_c'], 'gene_set_3':['gene_a', 'gene_e','gene_f','gene_d']},
'celltype_3':{},
'global':{'gene_set_4':['gene_m','gene_n']}}
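As a rough sketch of option 2 (all gene sets set as global), the snippet below builds a dictionary of this shape from an MSigDB-style JSON collection. It is not part of Spectra. The helper name build_global_dictionary is hypothetical, and the snippet assumes the JSON maps each gene set name to an entry that stores its gene symbols under a "geneSymbols" key, so please check that against the file you downloaded.

```python
import json


def build_global_dictionary(msigdb_json_text, cell_types):
    """Put every gene set under 'global' and give each cell type an empty entry."""
    collection = json.loads(msigdb_json_text)
    # Assumption: each entry keeps its gene symbols under the "geneSymbols" key.
    global_sets = {
        name: list(entry["geneSymbols"]) for name, entry in collection.items()
    }
    # Every cell type must appear as a key, even if it has no specific gene sets.
    gene_set_dictionary = {cell_type: {} for cell_type in cell_types}
    gene_set_dictionary["global"] = global_sets
    return gene_set_dictionary


# Tiny in-memory stand-in for a downloaded JSON bundle (made-up names and genes).
example_json = json.dumps({
    "PATHWAY_A": {"geneSymbols": ["gene_a", "gene_b", "gene_c"]},
    "PATHWAY_B": {"geneSymbols": ["gene_m", "gene_n"]},
})
cell_types = ["celltype_1", "celltype_2", "celltype_3"]
gene_set_dictionary = build_global_dictionary(example_json, cell_types)
print(sorted(gene_set_dictionary))
```

With a real file you would pass the text read from disk instead of example_json, and take cell_types from the unique values of the cell type column in adata.obs (whatever that column is called in your data).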
Having said that, we believe that the best results can be obtained by limiting the input to coherent, interpretable gene sets of similar size and with limited redundancy (please see the manuscript Supplementary Methods for further detail: https://doi.org/10.1101/2022.12.20.521311 ). We also offer a package to select gene sets for Spectra, which we will update with an extended set of annotations (including cancer cell and stroma cell gene sets) in the near future: https://github.com/wallet-maker/cytopus .
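To illustrate what limiting size and redundancy could look like in practice, here is a minimal sketch. It is not the procedure from the manuscript or from cytopus. The function name filter_gene_sets, the size bounds, and the Jaccard overlap cutoff are all hypothetical choices made for this example only.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_gene_sets(gene_sets, min_size=5, max_size=200, max_jaccard=0.5):
    """Keep gene sets within a size range that do not overlap strongly
    with a gene set that was already kept (first come, first kept)."""
    kept = {}
    for name, genes in gene_sets.items():
        genes = set(genes)
        if not (min_size <= len(genes) <= max_size):
            continue  # too small or too large
        if any(jaccard(genes, other) > max_jaccard for other in kept.values()):
            continue  # largely redundant with an earlier gene set
        kept[name] = genes
    return {name: sorted(genes) for name, genes in kept.items()}


# Made-up gene sets; the thresholds are arbitrary and only for demonstration.
candidate_sets = {
    "set_1": ["gene_a", "gene_b", "gene_c"],
    "set_2": ["gene_a", "gene_b", "gene_c", "gene_d"],  # overlaps set_1 (3/4)
    "set_3": ["gene_x", "gene_y", "gene_z"],
    "set_4": ["gene_q"],  # below min_size
}
filtered = filter_gene_sets(candidate_sets, min_size=2, max_size=10)
print(list(filtered))
```

Because earlier gene sets win over later ones, the result depends on the order of the input, so you may want to sort the gene sets by whatever you trust most before filtering.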
Let me know if that helps
Thanks for the reply. Is there an example code snippet to format the JSON file from MSigDB? Thanks, Shams
Hi Shams,
We do not provide a code snippet, but the tutorial explains how to configure the dictionary. The easiest way would be to run this with use_celltype=False in the est_spectra function. We now provide an example in the tutorial.
https://github.com/dpeerlab/spectra/blob/main/notebooks/example_notebook.ipynb
Thank you, Thomas
What annotation format is required? Is it possible to use the gene sets directly from a pathway database, for instance the C2 JSON bundle from the Broad Institute pathway database? Thanks, Shams