Closed Lachiemckbioinfo closed 2 months ago
Hi @Lachiemckbioinfo,
Can you tell me what the output of testpyw.datExpr.var and testpyw.datExpr.obs is?
The first few rows should be fine, but I need to see all the columns.
Best, Narges
This is using the expressionList data.
                    dynamicColors  moduleColors  moduleLabels  gene_name  gene_biotype
ENSMUSG00000000028  darkred        darkred       4.0           Cdc45      protein_coding
ENSMUSG00000000049  darkred        darkred       4.0           Apoh       protein_coding
ENSMUSG00000000056  darkgrey       darkgrey      3.0           Narf       protein_coding
ENSMUSG00000000058  coral          coral         2.0           Cav2       protein_coding
ENSMUSG00000000078  gainsboro      gainsboro     7.0           Klf6       protein_coding
ENSMUSG00000000085  darkgrey       darkgrey      3.0           Scmh1      protein_coding
                             Age   Tissue  Sex     Genotype
sample_id
X4mo_cortex_F_5xFADHEMI_430  4mon  Cortex  Female  5xFADHEMI
X4mo_cortex_F_5xFADHEMI_431  4mon  Cortex  Female  5xFADHEMI
X4mo_cortex_F_5xFADHEMI_433  4mon  Cortex  Female  5xFADHEMI
X4mo_cortex_F_5xFADHEMI_434  4mon  Cortex  Female  5xFADHEMI
X4mo_cortex_F_5xFADHEMI_511  4mon  Cortex  Female  5xFADHEMI
X4mo_cortex_F_5xFADWT_330    4mon  Cortex  Female  5xFADWT
Thanks, Lachlan
This is coming from the example dataset, right?
If so, the script you provided seems to be for something different, since the column names you used (Subset, Treatment) do not exist in your dataset (datExpr.obs).
Could you provide the script, self.datExpr.var, self.datExpr.obs, and the error you got for either the example dataset or your dataset?
Thanks, Narges
Ah, sorry, mixed up my scripts.
This is the script running with the example data:
import PyWGCNA as pyw
# Add function for custom printing with linebreaks
def customprint(*args):
    linebreak = "\n------------------------------\n"
    print(f"{linebreak}")
    for arg in args:
        print(arg)
    print(f"{linebreak}")
# Can read WGCNA objects
#testpyw = pyw.readWGCNA('filename')
geneExp = 'data/expressionList.csv'
testpyw = pyw.WGCNA(name = 'WGCNA',
                    species = 'Mus musculus',
                    geneExpPath=geneExp,
                    outputPath='outdir/', # For the output path to act as an output directory, end the string with '/'
                    save=True)
#print('Starting preprocessing')
#testpyw.preprocess()
#testpyw.findModules()
testpyw.runWGCNA() # Single command to do both .preprocess() and .findModules()
testpyw.updateSampleInfo(path='data/sampleInfo.csv', sep=',')
# Add colour for metadata
testpyw.setMetadataColor('Sex', {'Female': 'green',
                                 'Male': 'yellow'})
testpyw.setMetadataColor('Genotype', {'5xFADWT': 'darkviolet',
                                      '5xFADHEMI': 'deeppink'})
testpyw.setMetadataColor('Tissue', {'Hippocampus': 'red',
                                    'Cortex': 'blue'})
testpyw.setMetadataColor('Age', {'4mon': 'thistle',
                                 '8mon': 'plum',
                                 '12mon': 'violet',
                                 '18mon': 'purple'})
# Update gene information using geneList
geneList = pyw.getGeneList(dataset = 'mmusculus_gene_ensembl',
                           attributes=['ensembl_gene_id',
                                       'external_gene_name',
                                       'gene_biotype'],
                           maps=['gene_id', 'gene_name', 'gene_biotype'])
testpyw.updateGeneInfo(geneList)
testpyw.saveWGCNA()
# Set figure output format
#testpyw.figureType = 'png'
customprint(f"datExpr.var:\n {testpyw.datExpr.var.head(6)}", f"datExpr.obs:\n{testpyw.datExpr.obs.head(6)}")
testpyw.analyseWGCNA()
# Function to group GO, KEGG and Reactome annotation and perform as one block
def funcAnnot():
    # GO Annotation
    gene_set_library = ["GO_Biological_Process_2021", "GO_Cellular_Component_2021", "GO_Molecular_Function_2021"]
    testpyw.functional_enrichment_analysis(type="GO",
                                           moduleName='lightgrey',
                                           sets=gene_set_library,
                                           p_value=0.05,
                                           file_name="GO_coral_2021")
    # KEGG Annotation
    KEGG_set_library = ["KEGG_2016"]
    testpyw.functional_enrichment_analysis(type='KEGG',
                                           moduleName='lightgrey',
                                           sets=KEGG_set_library,
                                           p_value=0.05)
    # Reactome annotation
    testpyw.functional_enrichment_analysis(type='REACTOME',
                                           moduleName='lightgrey',
                                           p_value=0.05)
#funcAnnot()
def modulenetwork():
    modules = testpyw.datExpr.var.moduleColors.unique().tolist()
    print(f"Modules: {modules}")
    testpyw.CoexpressionModulePlot(modules=modules, numGenes=10, numConnections=100, minTOM=0)
#modulenetwork()
Hi @Lachiemckbioinfo
Unfortunately, I wasn't able to reproduce your error using the example datasets and script you provided.
I did update requirements.txt so you can compare the versioning.
The only package that I thought might be a problem was matplotlib, but when I checked the documentation of gridspec.GridSpecFromSubplotSpec() for both versions 3.8 and 3.9, I didn't detect any changes.
According to the error, it seems you have Python 3.9, but installing PyWGCNA requires Python 3.10 or greater (ref).
I would suggest making a new environment with Python 3.10, installing PyWGCNA, and then checking the version of dependencies. Hopefully, that will fix your problem.
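For comparing environments as suggested above, a short script can print the interpreter version and the installed versions of the main dependencies. This is a minimal sketch using only the standard library. The package list is an assumption taken from the versions mentioned in this thread, and report_versions is a hypothetical helper, not part of PyWGCNA.

```python
import sys
from importlib import metadata

# Assumed list of packages worth comparing; adjust as needed.
PACKAGES = ["PyWGCNA", "matplotlib", "numpy", "pandas", "scipy", "anndata", "seaborn"]


def report_versions(names):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


python_version = ".".join(str(part) for part in sys.version_info[:3])
print(f"Python {python_version}")
if sys.version_info < (3, 10):
    # The thread states that PyWGCNA requires Python 3.10 or greater.
    print("Warning: Python 3.10 or greater is required for PyWGCNA")

for name, version in report_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in both the failing and the working environment gives two short lists that are easy to compare by eye.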
Hi, Managed to get it solved by taking it to a virtual machine. For context, I was running it on a university Linux server with Python 3.9 pre-installed, but was doing Python versioning using a Conda environment and pip installs in a Python virtual environment, so there may have been conflicting versions going on.
Thanks, Lachlan
Hi, I had the same issue again, running this in a Conda environment with matplotlib 3.9.1. However, reinstalling matplotlib to 3.8.2 resolved the issue.
Hi @Lachiemckbioinfo
Which version of Python are you using? Also, can you send me the versions of all the packages you are using, and did you get the same error?
Yes, it was the exact same error. Here are the details:
Python version: 3.10.14
Environment: Conda environment
Command: conda create --name wgtest python=3.10
WGCNA install method: pip install PyWGCNA
Resolved with: pip install 'matplotlib==3.8.2' --force-reinstall
Traceback (most recent call last):
  File "/home/ljm028/WGCNA/AtaSC/No_Nulls/take3/test/WGCNA_group.py", line 161, in <module>
    run_WGCNA('Culturing', 'data/no_Nulls_expressionData_Culturing.csv', samplefile, 'Culturing')
  File "/home/ljm028/WGCNA/AtaSC/No_Nulls/take3/test/WGCNA_group.py", line 79, in run_WGCNA
    pyw.analyseWGCNA()
  File "/home/ljm028/.conda/envs/wgtest/lib/python3.10/site-packages/PyWGCNA/wgcna.py", line 452, in analyseWGCNA
    self.plotModuleEigenGene(module, metadata, show=show)
  File "/home/ljm028/.conda/envs/wgtest/lib/python3.10/site-packages/PyWGCNA/wgcna.py", line 2980, in plotModuleEigenGene
    axs_legend = gridspec.GridSpecFromSubplotSpec(len(metadata), 1, subplot_spec=ax_legend,
  File "/home/ljm028/.conda/envs/wgtest/lib/python3.10/site-packages/matplotlib/gridspec.py", line 490, in __init__
    raise TypeError(
TypeError: subplot_spec must be type SubplotSpec, usually from GridSpec, or axes.get_subplotspec.
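The error message says that subplot_spec must be a SubplotSpec. The variable name ax_legend in the traceback suggests an Axes object was being passed, but that is an assumption, since the PyWGCNA source is not shown in this thread. The pattern matplotlib asks for can be sketched independently of PyWGCNA as follows; the figure size and the 3-row layout are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib import gridspec

fig = plt.figure(figsize=(6, 4))
outer = gridspec.GridSpec(1, 2, figure=fig)

# An Axes that will only act as a placeholder for a block of legend panels.
ax_legend = fig.add_subplot(outer[0, 1])
ax_legend.axis("off")

# Pass the SubplotSpec of the Axes, not the Axes itself, as the error message hints.
spec = ax_legend.get_subplotspec()
inner = gridspec.GridSpecFromSubplotSpec(3, 1, subplot_spec=spec)

# One small Axes per (hypothetical) metadata column.
legend_axes = [fig.add_subplot(inner[row, 0]) for row in range(3)]
print(type(spec).__name__, len(legend_axes))
plt.close(fig)
```

Whether this matches the change made in PyWGCNA itself is not confirmed here; it only shows a call that satisfies the type check named in the traceback.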
packages in environment at /home/ljm028/.conda/envs/wgtest:
Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
anndata 0.10.8 pypi_0 pypi
array-api-compat 1.7.1 pypi_0 pypi
asttokens 2.4.1 pypi_0 pypi
biomart 0.9.2 pypi_0 pypi
bzip2 1.0.8 h5eee18b_6
ca-certificates 2024.7.2 h06a4308_0
certifi 2024.7.4 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
contourpy 1.2.1 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
exceptiongroup 1.2.2 pypi_0 pypi
executing 2.0.1 pypi_0 pypi
fonttools 4.53.1 pypi_0 pypi
gseapy 1.1.3 pypi_0 pypi
h5py 3.11.0 pypi_0 pypi
idna 3.7 pypi_0 pypi
ipython 8.26.0 pypi_0 pypi
jedi 0.19.1 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
joblib 1.4.2 pypi_0 pypi
json5 0.9.25 pypi_0 pypi
jsonpickle 3.2.2 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
markupsafe 2.1.5 pypi_0 pypi
matplotlib 3.9.1 pypi_0 pypi
matplotlib-inline 0.1.7 pypi_0 pypi
memoir 0.0.3 pypi_0 pypi
natsort 8.4.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.3 pypi_0 pypi
numpy 2.0.0 pypi_0 pypi
openssl 3.0.14 h5eee18b_0
packaging 24.1 pypi_0 pypi
pandas 2.2.2 pypi_0 pypi
parso 0.8.4 pypi_0 pypi
patsy 0.5.6 pypi_0 pypi
pexpect 4.9.0 pypi_0 pypi
pillow 10.4.0 pypi_0 pypi
pip 24.0 py310h06a4308_0
prompt-toolkit 3.0.47 pypi_0 pypi
psutil 6.0.0 pypi_0 pypi
ptyprocess 0.7.0 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
python 3.10.14 h955ad1f_1
python-dateutil 2.9.0.post0 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pyvis 0.3.1 pypi_0 pypi
pywgcna 2.0.5 pypi_0 pypi
reactome2py 3.0.0 pypi_0 pypi
readline 8.2 h5eee18b_0
reprit 0.9.0 pypi_0 pypi
requests 2.32.3 pypi_0 pypi
rsrc 0.1.3 pypi_0 pypi
scikit-learn 1.5.1 pypi_0 pypi
scipy 1.14.0 pypi_0 pypi
seaborn 0.13.2 pypi_0 pypi
setuptools 69.5.1 py310h06a4308_0
six 1.16.0 pypi_0 pypi
sqlite 3.45.3 h5eee18b_0
stack-data 0.6.3 pypi_0 pypi
statsmodels 0.14.2 pypi_0 pypi
threadpoolctl 3.5.0 pypi_0 pypi
tk 8.6.14 h39e8969_0
traitlets 5.14.3 pypi_0 pypi
typing-extensions 4.12.2 pypi_0 pypi
tzdata 2024.1 pypi_0 pypi
urllib3 2.2.2 pypi_0 pypi
wcwidth 0.2.13 pypi_0 pypi
wheel 0.43.0 py310h06a4308_0
xz 5.4.6 h5eee18b_1
zlib 1.2.13 h5eee18b_1
Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
anndata 0.10.8 pypi_0 pypi
array-api-compat 1.7.1 pypi_0 pypi
asttokens 2.4.1 pypi_0 pypi
biomart 0.9.2 pypi_0 pypi
bzip2 1.0.8 h5eee18b_6
ca-certificates 2024.7.2 h06a4308_0
certifi 2024.7.4 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
contourpy 1.2.1 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
exceptiongroup 1.2.2 pypi_0 pypi
executing 2.0.1 pypi_0 pypi
fonttools 4.53.1 pypi_0 pypi
gseapy 1.1.3 pypi_0 pypi
h5py 3.11.0 pypi_0 pypi
idna 3.7 pypi_0 pypi
ipython 8.26.0 pypi_0 pypi
jedi 0.19.1 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
joblib 1.4.2 pypi_0 pypi
json5 0.9.25 pypi_0 pypi
jsonpickle 3.2.2 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
markupsafe 2.1.5 pypi_0 pypi
matplotlib 3.8.2 pypi_0 pypi
matplotlib-inline 0.1.7 pypi_0 pypi
memoir 0.0.3 pypi_0 pypi
natsort 8.4.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.3 pypi_0 pypi
numpy 1.26.4 pypi_0 pypi
openssl 3.0.14 h5eee18b_0
packaging 24.1 pypi_0 pypi
pandas 2.2.2 pypi_0 pypi
parso 0.8.4 pypi_0 pypi
patsy 0.5.6 pypi_0 pypi
pexpect 4.9.0 pypi_0 pypi
pillow 10.4.0 pypi_0 pypi
pip 24.0 py310h06a4308_0
prompt-toolkit 3.0.47 pypi_0 pypi
psutil 6.0.0 pypi_0 pypi
ptyprocess 0.7.0 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
python 3.10.14 h955ad1f_1
python-dateutil 2.9.0.post0 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pyvis 0.3.1 pypi_0 pypi
pywgcna 2.0.4 pypi_0 pypi
reactome2py 3.0.0 pypi_0 pypi
readline 8.2 h5eee18b_0
reprit 0.9.0 pypi_0 pypi
requests 2.32.3 pypi_0 pypi
rsrc 0.1.3 pypi_0 pypi
scikit-learn 1.5.1 pypi_0 pypi
scipy 1.14.0 pypi_0 pypi
seaborn 0.13.2 pypi_0 pypi
setuptools 69.5.1 py310h06a4308_0
six 1.16.0 pypi_0 pypi
sqlite 3.45.3 h5eee18b_0
stack-data 0.6.3 pypi_0 pypi
statsmodels 0.14.2 pypi_0 pypi
threadpoolctl 3.5.0 pypi_0 pypi
tk 8.6.14 h39e8969_0
traitlets 5.14.3 pypi_0 pypi
typing-extensions 4.12.2 pypi_0 pypi
tzdata 2024.1 pypi_0 pypi
urllib3 2.2.2 pypi_0 pypi
wcwidth 0.2.13 pypi_0 pypi
wheel 0.43.0 py310h06a4308_0
xz 5.4.6 h5eee18b_1
zlib 1.2.13 h5eee18b_1
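The two package listings are long, so the differences are easy to miss by eye. A minimal sketch for finding them is shown here, using only the standard library. parse_listing and diff_versions are hypothetical helper names, and the two strings are short excerpts copied from the listings above rather than the full output.

```python
def parse_listing(text):
    """Map package name to version from 'name version ...' lines."""
    packages = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] in ("#", "Name"):
            continue  # skip blank lines and header rows
        packages[fields[0]] = fields[1]
    return packages


def diff_versions(first, second):
    """Return {name: (version in first, version in second)} where they differ."""
    changed = {}
    for name in sorted(set(first) | set(second)):
        old, new = first.get(name), second.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Excerpts from the first and second listings in this thread.
first_listing = """
matplotlib 3.9.1 pypi_0 pypi
numpy 2.0.0 pypi_0 pypi
pandas 2.2.2 pypi_0 pypi
pywgcna 2.0.5 pypi_0 pypi
scipy 1.14.0 pypi_0 pypi
"""
second_listing = """
matplotlib 3.8.2 pypi_0 pypi
numpy 1.26.4 pypi_0 pypi
pandas 2.2.2 pypi_0 pypi
pywgcna 2.0.4 pypi_0 pypi
scipy 1.14.0 pypi_0 pypi
"""

changes = diff_versions(parse_listing(first_listing), parse_listing(second_listing))
for name, (old, new) in changes.items():
    print(f"{name}: {old} -> {new}")
```

On these excerpts the differences are matplotlib, numpy and pywgcna, so matplotlib was not the only package that changed between the two listings.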
import PyWGCNA
import os
import pandas as pd
import itertools
# Add function for custom printing with linebreaks. Just for making debugging pretty.
def customprint(*args):
    linebreak = "\n------------------------------\n"
    print(f"{linebreak}")
    for arg in args:
        print(arg)
    print(f"{linebreak}")

def run_WGCNA(runname, infile, sampleinfo, testset, speciesname='Asparagopsis taxiformis'):
    customprint(f"Starting WGCNA run for {runname}")
    geneExp = infile
    outdir = f'output_{runname}'
    pyw = PyWGCNA.WGCNA(name = runname,
                        species = speciesname,
                        geneExpPath = geneExp,
                        outputPath = f'{outdir}/',
                        save=True)
    # Single command to do both .preprocess() and .findModules()
    pyw.preprocess()
    pyw.findModules()
    #pyw.runWGCNA()
    # Update sample information using the sampleInfo csv
    pyw.updateSampleInfo(path=sampleinfo, sep=',')
    # Add colour for metadata. WAY TOO MANY COLOURS - AAARGH!
    # Lets hope that having one set that goes across different runs works fine
    pyw.setMetadataColor('Culture', {'Cultured': 'lightgrey',
                                     'Wild': 'dimgrey'})
    # Removed unnecessary code details here
    pyw.analyseWGCNA()

    def save_dataframes():
        datExpr_var = os.path.join(outdir, 'datExpr.var.csv')
        datExpr_obs = os.path.join(outdir, "datExpr.obs.csv")
        geneExpr = os.path.join(outdir, "geneExpr.csv")
        datExpr_var_df = pyw.datExpr.var
        datExpr_var_df.to_csv(datExpr_var)
        try:
            datExpr_obs_df = pyw.datExpr.obs.to_df()
            datExpr_obs_df.to_csv(datExpr_obs)
        except:
            datExpr_obs_df = pyw.datExpr.obs
            datExpr_obs_df.to_csv(datExpr_obs)
        try:
            geneExpr_df = pyw.datExpr.var.to_df()
            geneExpr_df.to_csv(geneExpr)
        except:
            geneExpr_df = pyw.datExpr.var
            geneExpr_df.to_csv(geneExpr)
        # Soft power
        sft = pyw.sft
        sft.to_csv(os.path.join(outdir, 'soft_power.csv'))
        # Adjacency matrix
        adj = pyw.adjacency
        adj.to_csv(os.path.join(outdir, 'adjacency.csv'))
        # Topological overlap matrix
        tom = pyw.TOM
        tom.to_csv(os.path.join(outdir, 'topological_overlap_matrix.csv'))

    try:
        save_dataframes()
    except:
        customprint("Save_dataframes broke")

    def modulelist():
        modules = pyw.datExpr.var.moduleColors.unique().tolist()
        #print(f"Modules: {modules}")
        with open(os.path.join(outdir, "modules.txt"), "w") as mods:
            mods.write("----------\nModules\n----------\n")
            for module in modules:
                mods.write(f"{module}\n")
        return modules

    modules = modulelist()
    # Network analysis. This will generate a HTML document.
    try:
        pyw.CoexpressionModulePlot(modules=modules, numGenes=10, numConnections=200, minTOM=0, file_name=f"network.html")
    except:
        customprint(f"Failed coexpression module plot")
    hubs = []
    hub_reps = 50

    def modulehubs(reps):
        for module in modules:
            hub = pyw.top_n_hub_genes(moduleName=module, n=reps)
            hub.to_csv(os.path.join(outdir, f"top_{reps}_hub_genes_{module}.csv"))
            hubs.append(hub)

    try:
        modulehubs(hub_reps)
    except:
        customprint(f"Failed modulehubs")
    all_hubs = pd.concat(hubs)
    all_hubs.to_csv(os.path.join(outdir, f"top_{hub_reps}_hubs_all.csv"))
    # Save the run as the name defined in runname
    pyw.saveWGCNA()
    customprint(f"Finished WGCNA run for {runname}")

# run_WGCNA('runname', 'expressionData_{something}.csv', 'data/sampleInfoSplit.csv', 'Culturing/Temperature/Density/Light/Nutrient')
samplefile = 'data/sampleInfoSplit.csv'
run_WGCNA('Culturing', 'data/no_Nulls_expressionData_Culturing.csv', samplefile, 'Culturing')
run_WGCNA('Density', 'data/no_Nulls_expressionData_Density.csv', samplefile, 'Density')
# ... more runs here
Edited - just removed some unnecessary details from the code.
Hi @Lachiemckbioinfo
I updated the package to be compatible with the latest version of matplotlib.
I will release a new version by the end of this week and you can test it out again.
Thank you for your help.
Hi @Lachiemckbioinfo
Please install the latest version and let me know if you encounter any issues.
Hi, hoping I'm not missing something obvious. I've been persistently getting an error when running analyseWGCNA(), and it happens every time I run it. I've tried both with my own data and with the example data, in case something was wrong with my data.
Error traceback
My code
Program versions
I'll list the versions of the relevant programs installed in the environment. It's not all of them, so if you need to check another program, I can provide the data.
anndata 0.10.7
asttokens 2.4.1
biomart 0.9.2
matplotlib 3.9.0
matplotlib-inline 0.1.7
networkx 3.2.1
numpy 2.0.0
PyWGCNA 2.0.4
pandas 2.2.2
reactome2py 3.0.0
requests 2.32.3
scikit-learn 1.5.0
scipy 1.13.1
seaborn 0.13.2
Thanks, Lachlan