mortazavilab / PyWGCNA

PyWGCNA is a Python package designed to do Weighted Gene Correlation Network analysis (WGCNA)
https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad415/7218311
MIT License
217 stars, 53 forks

Seeking help on `order` Parameter Usage and Metadata Color Settings #92

Open KGZaker opened 8 months ago

KGZaker commented 8 months ago

Firstly, thanks so much for developing such a useful tool. It has been immensely helpful in my research. However, I've encountered a couple of challenges that I hope you could assist me with:

  1. I came across the following description in the Data_format.md:

    The sample metadata is a table which contains additional information about each sample, such as timepoint or genotype. Each row should represent a sample and each column should represent a metadata feature, where the first column contains the same sample identifier that was used in the gene expression matrix. The rows should be in the same order as the rows of the gene expression matrix, or the user can specify order=False.

    Based on this: if the rows of the sample metadata and the rows of the gene expression matrix are not in the same order, or if the columns of the gene metadata and the rows of the gene expression matrix are not in the same order, in which function should I specify order=False? Should it be in pyWGCNA.analyseWGCNA() or do I need to specify it in other functions like PyWGCNA.WGCNA, pyWGCNA.preprocess(), or pyWGCNA.findModules()? I tried using pyWGCNA.analyseWGCNA(order=False), but I encountered the following error: TypeError: 'bool' object is not iterable.

  2. It appears that I need to first specify colors using pyWGCNA.setMetadataColor before I can correctly call pyWGCNA.analyseWGCNA(). However, in the step of pyWGCNA.setMetadataColor(), does it only support categorical variables and not continuous variables, which means I need to convert continuous variables into categorical ones?

Thank you sooooo much for your time and assistance.

nargesr commented 8 months ago

Hi, Thank you for your kind comments:)

  1. You need to specify this when you update your metadata. However, I just realized that I changed the code so that you no longer need to specify it. The only important thing is that the index matches: the metadata is updated based on the index. For more information, look at updateGeneInfo() and updateSampleInfo().

I will update Data_format.md next week to reflect the changes (I will notify you through this issue).

thanks for catching this :)

  2. If you have a continuous value, you can pass the color as a color palette using matplotlib.cm.ScalarMappable.
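For point 1, a minimal pandas sketch of what "matching on the index" means in practice. The tables and sample names below are made up, and this only illustrates the idea of index-based alignment; it is not PyWGCNA's internal code.

```python
import pandas as pd

# Hypothetical expression matrix: rows are samples, columns are genes.
expr = pd.DataFrame(
    {"geneA": [1.0, 2.0, 3.0], "geneB": [4.0, 5.0, 6.0]},
    index=["s1", "s2", "s3"],
)

# Hypothetical sample metadata, deliberately listed in a different row order.
meta = pd.DataFrame(
    {"sex": ["F", "M", "F"], "age": [8, 4, 12]},
    index=["s3", "s1", "s2"],
)

# Both tables must describe the same set of samples before aligning.
same_samples = set(meta.index) == set(expr.index)

# Reorder the metadata rows so they follow the expression matrix.
meta_aligned = meta.loc[expr.index]
print(meta_aligned)
```

Because the lookup goes through the index labels, the row order of the metadata file does not matter as long as the sample identifiers agree.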

KGZaker commented 8 months ago

Thanks for your quick reply. Sorry, I may still have a naive question😂. Regarding the second solution, I tried the following code but encountered an error: AttributeError: 'ScalarMappable' object has no attribute 'keys'.

import matplotlib as mpl
norm = mpl.colors.Normalize(vmin=min(adata.obs['age']), 
                            vmax=max(adata.obs['age']))
cmap = mpl.colormaps['viridis']
pyWGCNA.setMetadataColor('age', mpl.cm.ScalarMappable(norm, cmap))

pyWGCNA.analyseWGCNA(geneList=adata.var)

nargesr commented 8 months ago

Hi @KGZaker,

I believe I used the same script! lol can you send me the full error?

import matplotlib as mpl

norm = mpl.colors.Normalize(vmin=adata.obs['age'].min(), 
                            vmax=adata.obs['age'].max())
my_palette_age = mpl.cm.ScalarMappable(norm=norm, cmap='viridis')

KGZaker commented 8 months ago

Hi, thanks for helping me figure out what's wrong. Here is the code I used and errors I got:

pyWGCNA.updateSampleInfo(adata.obs[["sex", "age"]])

norm = mpl.colors.Normalize(vmin=adata.obs['age'].min(),  vmax=adata.obs['age'].max())
my_palette_age = mpl.cm.ScalarMappable(norm=norm, cmap='viridis')

pyWGCNA.setMetadataColor('age', my_palette_age)
pyWGCNA.barplotModuleEigenGene(moduleName="white", metadata=["age"])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[229], line 1
----> 1 pyWGCNA.barplotModuleEigenGene(moduleName="white", metadata=["age"])

File ~/micromamba/envs/macbiom/lib/python3.11/site-packages/PyWGCNA/wgcna.py:3028, in WGCNA.barplotModuleEigenGene(self, moduleName, metadata, combine, colorBar, show)
   3026 height_ratios = []
   3027 for m in metadata:
-> 3028     height_ratios.append(len(list(self.metadataColors[m].keys())))
   3029 height_ratios.reverse()
   3031 modules = np.unique(self.datExpr.var['moduleColors']).tolist()

AttributeError: 'ScalarMappable' object has no attribute 'keys'
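
The traceback shows the bar plot code calling `.keys()` on the stored palette, which only works for a dict. A short sketch of the difference, using only matplotlib: the `ages` list is made up, and building a value-to-color dict is just one possible fallback. Whether PyWGCNA accepts numeric keys in such a dict is an assumption that has not been checked here.

```python
from matplotlib import cm, colors

ages = [4, 8, 12, 18]  # hypothetical continuous metadata values

norm = colors.Normalize(vmin=min(ages), vmax=max(ages))
palette = cm.ScalarMappable(norm=norm, cmap="viridis")

# A ScalarMappable is not a dict, so code that calls .keys() on it fails.
print(hasattr(palette, "keys"))  # False

# Instead, it maps a number to an RGBA tuple through to_rgba().
rgba = palette.to_rgba(8)

# Possible fallback: an explicit value -> hex color dict, which does have keys().
age_colors = {age: colors.to_hex(palette.to_rgba(age)) for age in ages}
print(age_colors)
```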

nargesr commented 8 months ago

Hi @KGZaker

I just released a new version (2.0.4), which will hopefully solve your problem. Please upgrade PyWGCNA and let me know if you still have the same problem.
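
After upgrading, it can help to confirm which version the running environment actually sees, since a notebook kernel may point at a different environment than the one pip upgraded. A small sketch using only the standard library: `installed_version` is a hypothetical helper name, and the distribution name "PyWGCNA" is assumed to match the name used on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if the package is not installed here.
print(installed_version("PyWGCNA"))
```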

KGZaker commented 8 months ago

Hi @nargesr , thanks so much for your time and kind help. Now everything works well in the new version.

realzhipeng commented 5 months ago

Hello @nargesr. I also want to set this up, but I still get the error: "'ScalarMappable' object has no attribute 'keys'", even though I'm using the new version, 2.0.4. How can I fix it?

nargesr commented 5 months ago

Hi @realzhipeng

could you please provide me with the script you used and the full error you got?

Thanks

realzhipeng commented 5 months ago

@nargesr , Yes. Actually, I'm using the above code that you shared with KGZaker.

pyWGCNA_datset.setMetadataColor('Sex', {'F': 'Pink',
                                       'M': 'blue'})
norm = mpl.colors.Normalize(vmin=df_sampleids_expr['Age'].min(),  vmax=df_sampleids_expr['Age'].max())

my_palette_age = mpl.cm.ScalarMappable(norm=norm, cmap='viridis')

pyWGCNA_datset.setMetadataColor("Age",my_palette_age)

Additionally, here is df_sampleids_expr: image image

And here is the error after running pyWGCNA_datset.analyseWGCNA(): image

Let me know if you need more detail.

nargesr commented 5 months ago

Hi @realzhipeng

Please install a new version of PyWGCNA (2.0.5).

I also wanted to note that any metadata column holding continuous values must have a numeric dtype.

Here's an example based on the tutorial:

import matplotlib as mpl
import PyWGCNA

pyWGCNA_5xFAD = PyWGCNA.readWGCNA("5xFAD.p")

# Build a numeric Timepoint column from Age. Assigning the result back avoids
# relying on an in-place replace of a selected column, which newer pandas
# versions warn about and may not apply.
pyWGCNA_5xFAD.datExpr.obs['Timepoint'] = (
    pyWGCNA_5xFAD.datExpr.obs['Age']
    .astype(str)
    .replace({'4mon': '4', '8mon': '8', '12mon': '12', '18mon': '18'})
    .astype('int')
)
norm = mpl.colors.Normalize(vmin=pyWGCNA_5xFAD.datExpr.obs['Timepoint'].min(),  vmax=pyWGCNA_5xFAD.datExpr.obs['Timepoint'].max())

my_palette_age = mpl.cm.ScalarMappable(norm=norm, cmap='viridis')

pyWGCNA_5xFAD.setMetadataColor("Timepoint", my_palette_age)

pyWGCNA_5xFAD.plotModuleEigenGene('darkred', ['Tissue','Sex','Genotype', 'Timepoint'])

Please upgrade PyWGCNA and let me know if you still have the same problem.
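
The dtype requirement mentioned above can be checked directly in pandas. This is a sketch on a made-up `obs` table that mirrors the Age-to-Timepoint conversion; stripping the "mon" suffix with `str.replace` is an alternative to the dictionary-based replace, not the method used in the tutorial.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical sample metadata with ages stored as strings.
obs = pd.DataFrame({"Age": ["4mon", "8mon", "12mon", "18mon"]})

# Strip the unit suffix and convert to integers, assigning the result back.
obs["Timepoint"] = obs["Age"].str.replace("mon", "", regex=False).astype(int)

# Only the converted column is numeric and therefore usable as continuous data.
print(is_numeric_dtype(obs["Age"]))        # False
print(is_numeric_dtype(obs["Timepoint"]))  # True
```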

realzhipeng commented 5 months ago

Hi @nargesr Could you please run "pyWGCNA_5xFAD.analyseWGCNA()"? I see another error: "TypeError: Expecting 'to_replace' to be either a scalar, array-like, dict or None, got invalid type 'ScalarMappable'", as shown below. Note: I've already installed the version you recommended, along with the dependencies from the requirement.txt file. image

nargesr commented 4 months ago

Hi @realzhipeng,

could you please share more details? I need the full script you used, self.datExpr.obs, and self.datExpr.obs.dtypes.

Based on what you have shared so far, I wanted to mention a few things:

Your error comes from the bar plot version (barplotModuleEigenGene), not the heatmap one, so it is different from the error you got before.

You also need to be careful about the order of the metadata for the bar plot version, because it groups by the last column of your metadata, and that column can't hold continuous values unless you define categories for them.

I believe that if you fix the order of your metadata, you should be able to run the function without any error, but I would be happy to check this on my end if you share the information I asked for.
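
The two options described above (put a categorical column last, or define categories for the continuous one) can be sketched in pandas as follows. The table, bin edges, and group labels are all made up, and how PyWGCNA consumes the resulting column is not shown or verified here.

```python
import pandas as pd

# Hypothetical sample metadata with one continuous and one categorical column.
obs = pd.DataFrame(
    {"Timepoint": [4, 8, 12, 18], "Sex": ["Female", "Male", "Female", "Male"]}
)

# Option 1: order the metadata list so that a categorical column comes last.
metadata = ["Timepoint", "Sex"]

# Option 2: define categories for the continuous column by binning it.
# Bins are right-inclusive: (0, 8] -> "early", (8, 18] -> "late".
obs["TimepointGroup"] = pd.cut(
    obs["Timepoint"], bins=[0, 8, 18], labels=["early", "late"]
)

# A categorical column can then use an ordinary label -> color dict.
group_colors = {"early": "thistle", "late": "purple"}
print(obs)
```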

realzhipeng commented 4 months ago

Hi @nargesr Here is the code that I used for the error:

import os
import matplotlib as mpl

import PyWGCNA

path = r""

file_expression = r"expressionList.csv"
file_sampleinfo = r"sampleInfo.csv"
file_genelist = r"geneList.txt"

outpath = r"results"

pyWGCNA_5xFAD = PyWGCNA.WGCNA(name='5xFAD', 
                              species='mus musculus', 
                              geneExpPath=os.path.join(path,file_expression), 
                              outputPath=os.path.join(path,outpath)+"\\",
                              save=True)
pyWGCNA_5xFAD.geneExpr.to_df().head(5)

pyWGCNA_5xFAD.preprocess()

pyWGCNA_5xFAD.findModules()

pyWGCNA_5xFAD.updateSampleInfo(path=os.path.join(path,file_sampleinfo), sep=',')

pyWGCNA_5xFAD.datExpr.obs['Timepoint'] = pyWGCNA_5xFAD.datExpr.obs.Age
pyWGCNA_5xFAD.datExpr.obs['Timepoint'].replace({'4mon': '4',
                                                '8mon': '8',
                                                '12mon': '12',
                                                '18mon': '18'}, inplace=True)
pyWGCNA_5xFAD.datExpr.obs['Timepoint'] = pyWGCNA_5xFAD.datExpr.obs['Timepoint'].astype('int')

norm = mpl.colors.Normalize(vmin=pyWGCNA_5xFAD.datExpr.obs['Timepoint'].min(),  vmax=pyWGCNA_5xFAD.datExpr.obs['Timepoint'].max())

my_palette_age = mpl.cm.ScalarMappable(norm=norm, cmap='viridis')

pyWGCNA_5xFAD.setMetadataColor("Timepoint", my_palette_age)

# add color for metadata
pyWGCNA_5xFAD.setMetadataColor('Sex', {'Female': 'green',
                                       'Male': 'yellow'})
pyWGCNA_5xFAD.setMetadataColor('Genotype', {'5xFADWT': 'darkviolet',
                                            '5xFADHEMI': 'deeppink'})
pyWGCNA_5xFAD.setMetadataColor('Age', {'4mon': 'thistle',
                                       '8mon': 'plum',
                                       '12mon': 'violet',
                                       '18mon': 'purple'})
pyWGCNA_5xFAD.setMetadataColor('Tissue', {'Hippocampus': 'red',
                                          'Cortex': 'blue'})

geneList = PyWGCNA.getGeneList(dataset='mmusculus_gene_ensembl',
                               attributes=['ensembl_gene_id', 
                                           'external_gene_name', 
                                           'gene_biotype'],
                               maps=['gene_id', 'gene_name', 'gene_biotype'])

pyWGCNA_5xFAD.updateGeneInfo(geneList)

pyWGCNA_5xFAD.analyseWGCNA()

Looking forward to your further help.