mortazavilab / PyWGCNA

PyWGCNA is a Python package designed to do Weighted Gene Correlation Network analysis (WGCNA)
https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad415/7218311
MIT License

Error in findModules() #54

Closed Reese-Martin closed 1 year ago

Reese-Martin commented 1 year ago

Code:

    import PyWGCNA

    gene_exp = 'cntrl_data_T.csv'
    pyWGCNA_cntrl = PyWGCNA.WGCNA(name='Control', species='homo sapiens', geneExpPath=gene_exp, outputPath='', save=True)

    pyWGCNA_cntrl.preprocess()

    pyWGCNA_cntrl.findModules()

When I run findModules(), my code errors out with the following message:

    Calculating 75 module eigengenes in given set...
    ..principal component calculation for module antiquewhite failed with the following error:
    ..hub genes will be used instead of principal components.
    Traceback (most recent call last):
      File "C:\Program Files\JetBrains\PyCharm 2023.1.3\plugins\python\helpers\pydev\pydevconsole.py", line 392, in do_exit
        import java.lang.System
      File "C:\Program Files\JetBrains\PyCharm 2023.1.3\plugins\python\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
        module = self._system_import(name, *args, **kwargs)
    ModuleNotFoundError: No module named 'java'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "...\Code\venv\lib\site-packages\IPython\core\interactiveshell.py", line 3508, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "", line 1, in
        pyWGCNA_cntrl.findModules()
      File "...\Code\venv\lib\site-packages\PyWGCNA\wgcna.py", line 300, in findModules
        MEList = WGCNA.moduleEigengenes(expr=self.datExpr.to_df(), colors=self.datExpr.var['dynamicColors'])
      File "...\Code\venv\lib\site-packages\PyWGCNA\wgcna.py", line 2036, in moduleEigengenes
        sys.exit("Error!")
      File "C:\Program Files\JetBrains\PyCharm 2023.1.3\plugins\python\helpers\pydev\pydevconsole.py", line 397, in do_exit
        os._exit(args[0])
    TypeError: an integer is required (got type str)

This error occurs in PyWGCNA version 1.16.8 (the version PyCharm grabs automatically). While troubleshooting, I updated to the latest releases I could find (both 1.2.3 and 1.2.4), and the process errored out even earlier, at the preprocess() stage, with the following error:

    Traceback (most recent call last):
      File "...\Code\venv\lib\site-packages\IPython\core\interactiveshell.py", line 3508, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "", line 1, in
        pyWGCNA_cntrl.preprocess()
      File "...\Code\venv\lib\site-packages\PyWGCNA\wgcna.py", line 227, in preprocess
        plt.savefig(f"{self.outputPath}/figures/sample_clustering_cleaning.{self.figureType}")
      File "...\Code\venv\lib\site-packages\matplotlib\pyplot.py", line 1023, in savefig
        res = fig.savefig(*args, **kwargs)
      File "...\Code\venv\lib\site-packages\matplotlib\figure.py", line 3343, in savefig
        self.canvas.print_figure(fname, **kwargs)
      File "...\Code\venv\lib\site-packages\matplotlib\backend_bases.py", line 2366, in print_figure
        result = print_method(
      File "...\Code\venv\lib\site-packages\matplotlib\backend_bases.py", line 2232, in
        print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
      File "...\Code\venv\lib\site-packages\matplotlib\backends\backend_pdf.py", line 2808, in print_pdf
        file = PdfFile(filename, metadata=metadata)
      File "...\Code\venv\lib\site-packages\matplotlib\backends\backend_pdf.py", line 713, in __init__
        fh, opened = cbook.to_filehandle(filename, "wb", return_opened=True)
      File "...\Code\venv\lib\site-packages\matplotlib\cbook\__init__.py", line 489, in to_filehandle
        fh = open(fname, flag, encoding=encoding)
    FileNotFoundError: [Errno 2] No such file or directory: '/figures/sample_clustering_cleaning.pdf'

Any help is greatly appreciated!

nargesr commented 1 year ago

Hi,

Both errors look more like Python errors than package errors, but here is my idea of how to solve them.

First error: "ModuleNotFoundError: No module named 'java'". You probably need to install this module (I don't know how you got this error, but it might be because you are on Windows or because you installed the package through PyCharm).

Second error: "FileNotFoundError: [Errno 2] No such file or directory: '/figures/sample_clustering_cleaning.pdf'". As it says, there is no /figures directory! When you defined the object you set outputPath="", which is going to save your output in your root directory. Give the full/correct path.
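A minimal sketch of why the empty outputPath ends up at the root, and one way to avoid it. The `figure_path` function only mirrors the f-string visible in the traceback; `prepare_output_path` and the `wgcna_output` folder name are hypothetical helpers for illustration, not part of PyWGCNA.

```python
import os
import tempfile


def figure_path(output_path, name="sample_clustering_cleaning", figure_type="pdf"):
    # Mirrors the f-string shown in the traceback from wgcna.py.
    return f"{output_path}/figures/{name}.{figure_type}"


def prepare_output_path(output_path):
    """Hypothetical helper: make the path absolute and create figures/ under it."""
    base = os.path.abspath(output_path)
    os.makedirs(os.path.join(base, "figures"), exist_ok=True)
    return base


# An empty outputPath leaves only the leading slash, i.e. the filesystem root.
print(figure_path(""))

# Demo in a throwaway directory; in practice this would be your project folder.
with tempfile.TemporaryDirectory() as tmp:
    out = prepare_output_path(os.path.join(tmp, "wgcna_output"))
    print(os.path.isdir(os.path.join(out, "figures")))
```

The returned path could then be passed as outputPath when creating the WGCNA object, assuming the installed version does not create the figures directory itself.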

Reese-Martin commented 1 year ago

You were right about the second error. I was originally working on this project in macOS, and there the empty string just created the directory in place. Thanks for the help.

Reese-Martin commented 1 year ago

I have made some progress sorting through this error. It seems that the code is erroring out at the sys.exit("Error!") in line 2009, and the extraneous Java import error is just a misleading message from PyCharm (it hijacks sys.exit, but handles it poorly here). The error stems from this block of code:

            if subHubs:
                print(" ..principal component calculation for module", modulename,
                      "failed with the following error:", flush=True)
                print("     ..hub genes will be used instead of principal components.", flush=True)

                isPC[i] = False
                check = True
                try:
                    scaledExpr = pd.DataFrame(scale(datModule.T).T, index=datModule.index,
                                              columns=datModule.columns)
                    covEx = np.cov(scaledExpr)
                    covEx[not np.isfinite(covEx)] = 0
                    modAdj = np.abs(covEx) ** softPower
                    kIM = (modAdj.mean(axis=0)) ** 3
                    if np.max(kIM) > 1:
                        kIM = kIM - 1
                    kIM[np.where(kIM is None)] = 0
                    hub = np.argmax(kIM)
                    alignSign = np.sign(covEx[:, hub])
                    alignSign[np.where(alignSign is None)] = 0
                    isHub[i] = True
                    tmp = np.array(kIM * alignSign)
                    tmp.shape = scaledExpr.shape
                    pcxMat = scaledExpr * tmp / sum(kIM)
                    pcx = pcxMat.mean(axis=0)
                    varExpl[0, i] = np.mean(np.corrcoef(pcx, datModule.transpose()) ** 2)
                    pc = pcx
                except:
                    check = False
        if not check:
            if not trapErrors:
                sys.exit("Error!")

I am currently trying to find which part of the try block raises the error. If you have any suggestions about what could be going wrong in this section, any help is appreciated.
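One way to find the failing statement is to stop discarding the exception. The sketch below is not PyWGCNA code: `risky_step` is a hypothetical stand-in for the body of the try block. It contrasts a bare except, which hides the cause, with a handler that keeps the traceback, and it shows that sys.exit("Error!") raises SystemExit carrying the string "Error!", which is the value the PyCharm hook in the traceback then hands to os._exit.

```python
import sys
import traceback


def risky_step():
    # Hypothetical stand-in for the statements inside the try block.
    raise ValueError("stand-in failure")


def run_with_bare_except():
    # Same shape as the quoted code: the cause of the failure is lost.
    try:
        risky_step()
    except:
        return None
    return "ok"


def run_and_report():
    # Keeps the full traceback so the failing line can be identified.
    try:
        risky_step()
    except Exception:
        return traceback.format_exc()
    return None


report = run_and_report()
print(report)

# sys.exit with a string raises SystemExit whose .code is that string.
try:
    sys.exit("Error!")
except SystemExit as exit_request:
    exit_code = exit_request.code
print(repr(exit_code))
```

Temporarily editing the installed wgcna.py in this way (or stepping through it in a debugger) would show the original exception instead of the generic exit.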

For context, I encounter the same error using both my own data and the tutorial data provided in the quick start guide.

Update: after more testing, the error is occurring in the line covEx[not np.isfinite(covEx)] = 0, with the output TypeError: an integer is required (got type str).

So it seems like "not np.isfinite(covEx)" is returning a string instead of an integer. Any ideas as to why this might be happening?
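A note on that line: on a NumPy array with more than one element, Python's `not` does not negate element-wise. It asks the array for a single truth value, which raises a ValueError. The sketch below uses a made-up 2x2 matrix in place of covEx to show this and the element-wise `~` alternative. Whether this is exactly what happens inside PyWGCNA is an assumption; if it is, the bare except in the quoted block would swallow the ValueError, and the TypeError would come from PyCharm's exit hook receiving the string "Error!" rather than from NumPy.

```python
import numpy as np

# Made-up stand-in for covEx with non-finite entries.
cov = np.array([[1.0, np.nan], [np.inf, 2.0]])

# `not` needs one truth value for the whole array, so this raises.
try:
    cov[not np.isfinite(cov)] = 0
    not_error = None
except ValueError as err:
    not_error = err
print(type(not_error).__name__)

# `~` inverts the boolean mask element by element.
cleaned = cov.copy()
cleaned[~np.isfinite(cleaned)] = 0
print(cleaned)
```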

Update 2: after checking through the dependencies, I found that pandas was significantly ahead of the required version (2 vs 1.4.4). After reverting pandas, the issue was solved and findModules() ran successfully.
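A small sketch for checking this kind of mismatch before running an analysis. The version 1.4.4 is taken from this thread, not from the package's published requirements, and the helper names are hypothetical.

```python
from importlib import metadata

EXPECTED_PANDAS = "1.4.4"  # value reported in this thread, treated as an assumption


def major_version(version_string):
    # "2.0.3" -> 2
    return int(version_string.split(".")[0])


def is_newer_major(installed, expected):
    return major_version(installed) > major_version(expected)


def installed_version(package):
    # Returns None when the package is not installed in this environment.
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("pandas")
if found is None:
    print("pandas is not installed in this environment")
elif is_newer_major(found, EXPECTED_PANDAS):
    print(f"pandas {found} is a newer major version than {EXPECTED_PANDAS}")
else:
    print(f"pandas {found} has the same or an older major version than {EXPECTED_PANDAS}")
```

If the check reports a newer major version, installing the older release into the virtual environment (for example with pip install "pandas==1.4.4") is the kind of revert described above.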