mortazavilab / PyWGCNA

PyWGCNA is a Python package designed to do Weighted Gene Correlation Network analysis (WGCNA)
https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad415/7218311
MIT License
192 stars 46 forks

Multiple errors running under DOMINO #58

Closed jumoline closed 10 months ago

jumoline commented 11 months ago

Hi, I'm running the tutorial quickstart notebook under Domino. I installed the package with pip install PyWGCNA==1.2.4 --user, with all the requirement versions as listed in requirements.txt. After running pyWGCNA_5xFAD.preprocess() I get this error:

FileNotFoundError                         Traceback (most recent call last)
in
----> 1 pyWGCNA_5xFAD.preprocess()

~/.local/lib/python3.8/site-packages/PyWGCNA/wgcna.py in preprocess(self)
    225         plt.tight_layout()
    226         if self.save:
--> 227             plt.savefig(f"{self.outputPath}/figures/sample_clustering_cleaning.{self.figureType}")
    228
    229         # Determine cluster under the line

~/.local/lib/python3.8/site-packages/matplotlib/pyplot.py in savefig(*args, **kwargs)
   1021 def savefig(*args, **kwargs):
   1022     fig = gcf()
-> 1023     res = fig.savefig(*args, **kwargs)
   1024     fig.canvas.draw_idle()  # Need this if 'transparent=True', to reset colors.
   1025     return res

~/.local/lib/python3.8/site-packages/matplotlib/figure.py in savefig(self, fname, transparent, **kwargs)
   3341                         ax.patch._cm_set(facecolor='none', edgecolor='none'))
   3342
-> 3343             self.canvas.print_figure(fname, **kwargs)
   3344
   3345     def ginput(self, n=1, timeout=30, show_clicks=True,

~/.local/lib/python3.8/site-packages/matplotlib/backend_bases.py in print_figure(self, filename, dpi, facecolor, edgecolor, orientation, format, bbox_inches, pad_inches, bbox_extra_artists, backend, **kwargs)
   2364                 # force the figure dpi to 72), so we need to set it again here.
   2365                 with cbook._setattr_cm(self.figure, dpi=dpi):
-> 2366                     result = print_method(
   2367                         filename,
   2368                         facecolor=facecolor,

~/.local/lib/python3.8/site-packages/matplotlib/backend_bases.py in (*args, **kwargs)
   2230                 "bbox_inches_restore"}
   2231             skip = optional_kws - {*inspect.signature(meth).parameters}
-> 2232             print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
   2233                 *args, **{k: v for k, v in kwargs.items() if k not in skip}))
   2234         else:  # Let third-parties do as they see fit.

~/.local/lib/python3.8/site-packages/matplotlib/backends/backend_pdf.py in print_pdf(self, filename, bbox_inches_restore, metadata)
   2806             file = filename._file
   2807         else:
-> 2808             file = PdfFile(filename, metadata=metadata)
   2809         try:
   2810             file.newPage(width, height)

~/.local/lib/python3.8/site-packages/matplotlib/backends/backend_pdf.py in __init__(self, filename, metadata)
    711         self.original_file_like = None
    712         self.tell_base = 0
--> 713         fh, opened = cbook.to_filehandle(filename, "wb", return_opened=True)
    714         if not opened:
    715             try:

~/.local/lib/python3.8/site-packages/matplotlib/cbook/__init__.py in to_filehandle(fname, flag, return_opened, encoding)
    487             fh = bz2.BZ2File(fname, flag)
    488         else:
--> 489             fh = open(fname, flag, encoding=encoding)
    490         opened = True
    491     elif hasattr(fname, 'seek'):

FileNotFoundError: [Errno 2] No such file or directory: '/figures/sample_clustering_cleaning.pdf'

Then, after running analyseWGCNA(), I get this error:

AttributeError                            Traceback (most recent call last)
in
----> 1 pyWGCNA_5xFAD.analyseWGCNA()

~/.local/lib/python3.8/site-packages/PyWGCNA/wgcna.py in analyseWGCNA(self, order, geneList, show)
    372         print(f"{OKCYAN}Calculating module trait relationship ...{ENDC}")
    373
--> 374         self.moduleTraitCor = pd.DataFrame(index=self.MEs.columns,
    375                                            columns=datTraits.columns,
    376                                            dtype="float")

AttributeError: 'NoneType' object has no attribute 'columns'

Please advise.

Regards, J

nargesr commented 11 months ago

Hi,

I don't have access to Domino, but according to your first error, it seems the figures folder does not exist in your output path. Do you have write access? If you didn't change the quickstart notebook at all, would you mind checking whether the figures folder is next to your notebook?

For the second error, did you use a pickle file from Zenodo?
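
The path in the first traceback, '/figures/sample_clustering_cleaning.pdf', starts at the filesystem root, which suggests the output path was empty when the figure was saved. A minimal sketch of creating the folder up front, assuming figures go into a figures subfolder of the output path as the traceback shows. The helper name ensure_figures_dir is hypothetical and not part of PyWGCNA:

```python
from pathlib import Path


def ensure_figures_dir(output_path: str) -> Path:
    """Create <output_path>/figures if it is missing and return its absolute path."""
    # An empty output path would make "/figures/..." resolve at the filesystem
    # root, so fall back to the current working directory in that case.
    base = Path(output_path) if output_path else Path.cwd()
    figures = base.resolve() / "figures"
    # parents=True also creates a missing output folder; exist_ok=True makes
    # the call safe to repeat.
    figures.mkdir(parents=True, exist_ok=True)
    return figures
```

Calling this with the same output path that is passed to the WGCNA object, before preprocess(), should rule out a missing folder as the cause.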

jumoline commented 11 months ago

Hi, I do have write access as a user to the filesystem, and there is a figures folder. I have not modified the notebook at all. As for your next question, I don't know what it means. I only created my environment in conda, cloned the repository, checked the requirements and opened the notebook directly. I ran each cell to see if it works (without modifying anything). I also tried this on an HPC and got the same error. In the next cell:

geneList = PyWGCNA.getGeneList(dataset='mmusculus_gene_ensembl',
                               attributes=['ensembl_gene_id', 'external_gene_name', 'gene_biotype'],
                               maps=['gene_id', 'gene_name', 'gene_biotype'])

pyWGCNA_5xFAD.updateGeneInfo(geneList)

I get error:

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_3966928/1745806926.py in ?()
----> 1 geneList = PyWGCNA.getGeneList(dataset='mmusculus_gene_ensembl',
      2                                attributes=['ensembl_gene_id',
      3                                            'external_gene_name',
      4                                            'gene_biotype'],

/expanse/projects/janssen4/dsci-csb/jmolineros/conda/install/envs/PyWGCNA/lib/python3.11/site-packages/PyWGCNA/utils.py in ?(dataset, attributes, maps, server_domain)
    125             line = line.split('\t')
    126             dict = {}
    127             for i in range(len(attributes)):
    128                 dict[attributes[i]] = line[i]
--> 129             geneInfo = geneInfo.append(dict, ignore_index=True)
    130
    131     geneInfo.index = geneInfo[attributes[0]]
    132     geneInfo.drop(attributes[0], axis=1, inplace=True)

/expanse/projects/janssen4/dsci-csb/jmolineros/conda/install/envs/PyWGCNA/lib/python3.11/site-packages/pandas/core/generic.py in ?(self, name)
   5985             and name not in self._accessors
   5986             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5987         ):
   5988             return self[name]
-> 5989         return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'append'

I think this is because of the pandas version.
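
DataFrame.append was removed in pandas 2.0, which matches the failure at line 129 of utils.py in the newer environment. A minimal sketch of the usual replacement, collecting the rows in a list and building the frame once. The sample lines are placeholders standing in for the BioMart response, and this is an illustration rather than PyWGCNA's actual fix:

```python
import pandas as pd

attributes = ["ensembl_gene_id", "external_gene_name", "gene_biotype"]

# Placeholder tab-separated lines (made up for illustration only).
lines = [
    "GENE_ID_1\tGeneA\tprotein_coding",
    "GENE_ID_2\tGeneB\tlncRNA",
]

# Collect one dict per line instead of calling DataFrame.append in a loop.
rows = []
for line in lines:
    fields = line.split("\t")
    rows.append({attributes[i]: fields[i] for i in range(len(attributes))})

# Build the frame once, then use the first attribute as the index,
# mirroring the index/drop steps shown in the traceback.
gene_info = pd.DataFrame(rows, columns=attributes).set_index(attributes[0])
```

This form runs on both pandas 1.x and 2.x, so it does not depend on which version pip resolves.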

And the next: pyWGCNA_5xFAD.analyseWGCNA()

AttributeError                            Traceback (most recent call last)
Cell In[8], line 1
----> 1 pyWGCNA_5xFAD.analyseWGCNA()

File /expanse/projects/janssen4/dsci-csb/jmolineros/conda/install/envs/PyWGCNA/lib/python3.11/site-packages/PyWGCNA/wgcna.py:374, in WGCNA.analyseWGCNA(self, order, geneList, show)
    370 datTraits = self.getDatTraits(self.datExpr.obs.columns.tolist())
    372 print(f"{OKCYAN}Calculating module trait relationship ...{ENDC}")
--> 374 self.moduleTraitCor = pd.DataFrame(index=self.MEs.columns,
    375                                    columns=datTraits.columns,
    376                                    dtype="float")
    377 self.moduleTraitPvalue = pd.DataFrame(index=self.MEs.columns,
    378                                       columns=datTraits.columns,
    379                                       dtype="float")
    381 for i in self.MEs.columns:

AttributeError: 'NoneType' object has no attribute 'columns'

The data files I downloaded for use were in 5xFAD_paper.zip

nargesr commented 11 months ago

If the first function (pyWGCNA_5xFAD.preprocess()) failed, the later functions cannot run correctly either.

Which versions of PyWGCNA and pandas do you have?

jumoline commented 11 months ago

I re-installed and was able to create plots. Now the QuickStart notebook gets stuck at the command pyWGCNA_5xFAD.findModules()

The output produced is:

Run WGCNA...
pickSoftThreshold: calculating connectivity for given powers...
will use block size 1876

    Power  SFT.R.sq     slope  truncated R.sq      mean(k)    median(k)       max(k)
0       1  0.368857 -0.481613        0.701585  2444.750756  2260.416614  5665.102661
1       2  0.7253   -0.99165         0.886361   840.665489   673.081241  3009.058821
2       3  0.791986 -1.194264        0.946969   385.685335   258.451265  1916.810605
3       4  0.835392 -1.3419          0.968446   207.404152   113.456087  1332.762771
4       5  0.853842 -1.472183        0.973346   123.232581    54.784481   984.036824
5       6  0.870673 -1.553348        0.979584    78.455923    28.47124    752.959999
6       7  0.886736 -1.600869        0.986635    52.572016    15.594822   591.514192
7       8  0.896672 -1.639343        0.992373    36.65884      9.454046   475.817182
8       9  0.903531 -1.677747        0.994643    26.397061     6.024431   389.237531
9      10  0.906045 -1.706474        0.995895    19.521431     3.975959   322.823838
10     11  0.905582 -1.731076        0.994806    14.767291     2.623921   270.867416
11     13  0.914482 -1.751347        0.997466     8.941254     1.205108   196.222414
12     15  0.912684 -1.771227        0.994189     5.759987     0.568044   146.575349
13     17  0.912188 -1.774908        0.990829     3.905403     0.273242   112.189052
14     19  0.907649 -1.774186        0.989457     2.766824     0.135454    87.594344

Selected power to have scale free network is 9.
calculating adjacency matrix ...
Done..

calculating TOM similarity matrix ... Done..

Going through the merge tree...
..cutHeight not given, setting it to 0.996 ===> 99% of the (truncated) height range in dendro.
Done..

Calculating 22 module eigengenes in given set...
..principal component calculation for module black failed with the following error:
..hub genes will be used instead of principal components.

And then this error:

An exception has occurred, use %tb to see the full traceback. SystemExit: Error!

nargesr commented 11 months ago

Which versions of PyWGCNA and pandas do you have?

Also, it's worth looking at issue #56, or checking the versions of your packages and trying to match them to requirements.txt.

jumoline commented 11 months ago

I re-installed pandas at version 1.4.4 and so far it is running. For some reason pip install pyWGCNA read the requirements.txt and still kept installing pandas 2. I had to explicitly re-install pandas, restart the kernel and reload everything. Now everything works in the notebook. The only thing I added to the notebook is a sanity check:

import pandas as pd
print(pd.__version__)

Thanks, J
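
The sanity check can be extended to compare installed versions against pinned ones before running the notebook. A minimal sketch using only the standard library; the helper name find_mismatches is hypothetical, and the only pin taken from this thread is pandas 1.4.4, so any further pins would need to be copied from requirements.txt:

```python
from importlib.metadata import PackageNotFoundError, version


def find_mismatches(pins: dict) -> dict:
    """Return {package: installed version or None} for every pin that is not met."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


# pandas 1.4.4 is the version reported to work in this thread.
pins = {"pandas": "1.4.4"}
for name, installed in find_mismatches(pins).items():
    print(f"{name}: expected {pins[name]}, found {installed}")
```

Running this in a fresh kernel before importing PyWGCNA shows whether the environment the notebook actually uses has the expected versions.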

nargesr commented 11 months ago

Great! Yeah, even though I specify the pandas version in setup.py, sometimes pandas needs to be re-installed explicitly :(

Is it also working under Domino?