mortazavilab / PyWGCNA

PyWGCNA is a Python package designed to do Weighted Gene Correlation Network analysis (WGCNA)
https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad415/7218311
MIT License
206 stars 48 forks

Error in documentation and in cutreeHybrid #112

Open Buiboni opened 1 month ago

Buiboni commented 1 month ago

Hi, thank you for your package! I tried to use it but came across several issues.

In the documentation: the WGCNA object is said to be initialized with geneExp as "geneExp (pandas dataframe) – expression matrix which genes are in the rows and samples are columns". However, most of the functions describe their input as "data – expression data in a matrix or data frame. Rows correspond to samples and columns to genes.", and in the code of findModules() the data is passed as self.datExpr.to_df(). I guess the documentation of WGCNA is wrong, since even in your tutorial the columns are genes and the rows are samples.

In the code: in the function cutreeHybrid(), line 1752, InCluster = list(range(nPoints))[SmallLabels == sclust] raises an error. I think it is because a list cannot be indexed by anything other than an int. However, I don't really understand that line, so I couldn't try to fix it. If InCluster is a list of booleans representing whether each gene is in the cluster or not, then shouldn't 'SmallLabels == sclust' be enough? When I try that, I get an error at line 1761...

On another note, do you plan to add other types of correlation?
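To illustrate the orientation point: if the tutorial layout (samples in rows, genes in columns) is what the functions expect, a matrix stored the other way round would need to be transposed before being passed as geneExp. The following is a minimal pandas sketch with made-up gene and sample names. It is not taken from PyWGCNA and makes no claim about what the package does internally.

```python
import pandas as pd

# Hypothetical toy matrix: 3 genes (rows) x 2 samples (columns).
genes_by_samples = pd.DataFrame(
    {"sample_1": [1.0, 2.0, 3.0], "sample_2": [4.0, 5.0, 6.0]},
    index=["gene_a", "gene_b", "gene_c"],
)

# Transpose so that rows are samples and columns are genes,
# the layout used in the tutorial according to the report above.
samples_by_genes = genes_by_samples.T

print(samples_by_genes.shape)          # (n_samples, n_genes)
print(list(samples_by_genes.columns))  # gene names
```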

Again thank you for your work!

nargesr commented 1 month ago

Hi @Buiboni

Regarding your first question about the documentation, I will fix the API documentation in the next release. I forgot to fix it while I was fixing the rest of the documentation. Thank you for catching it.

Regarding your second question, could you share the whole error and tell me whether you made any modifications to the default input? (If you can share the script that you used, that would be great.)

I don't have any plans to add new features to the package beyond maintaining it, but if you are willing to add something, I'm happy to help you and merge it into the current package.

Thank you for using PyWGCNA:)

Buiboni commented 1 month ago

Here is a script that would produce the error, with the data from the tutorial:

import pandas as pd
import PyWGCNA
data = pd.read_csv('expressionList.csv',index_col=0)
pyWGCNA_5xFAD = PyWGCNA.WGCNA(name='5xFAD',
                              species='mus musculus', 
                              geneExp=data.iloc[:,:200], 
                              outputPath='',
                              save=True)
pyWGCNA_5xFAD.preprocess(show=False)
pyWGCNA_5xFAD.findModules({'cutreeHybrid':{'pamRespectsDendro':True, 
                                      'respectSmallClusters':True}})

The error you get is:

File ~/dev/miniconda3/envs/analysis/lib/python3.12/site-packages/PyWGCNA/wgcna.py:1751 in cutreeHybrid 
                  InCluster = list(range(nPoints))[SmallLabels == sclust]
TypeError: only integer scalar arrays can be converted to a scalar index
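The TypeError above can be reproduced outside PyWGCNA. The sketch below assumes that SmallLabels is a NumPy array, so that SmallLabels == sclust yields a boolean array, and that the line is meant to collect the positions of the points whose label equals sclust. The values are made up, and the NumPy-based alternatives are only one possible way to express that intent, not the maintainers' fix.

```python
import numpy as np

# Hypothetical stand-ins for the variables used in cutreeHybrid.
nPoints = 5
SmallLabels = np.array([0, 1, 0, 2, 1])
sclust = 1

# Boolean array with one entry per point.
mask = SmallLabels == sclust

# A plain Python list only accepts integers or slices as an index,
# so indexing it with a boolean array raises the TypeError from the report.
try:
    list(range(nPoints))[mask]
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# A NumPy array accepts a boolean mask and returns the matching positions.
in_cluster = np.arange(nPoints)[mask]

# np.flatnonzero returns the same positions directly from the mask.
in_cluster_alt = np.flatnonzero(mask)

print(list_error)
print(in_cluster, in_cluster_alt)
```

Whether the surrounding code expects positions or a boolean mask at that point would need to be checked against the rest of cutreeHybrid.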

I am working on a patch to add Spearman correlation. I will let you know when it is done!