jason-weirather / GSVA

A command-line interface and python module for R's GSVA bioconductor package
Apache License 2.0
FileNotFoundError: File b'/tmp/weirathe.650js0ci/pathways.csv' does not exist #6

Open qiuju-Zhang opened 2 years ago

qiuju-Zhang commented 2 years ago

Hello. Whether I run GSVA via Docker or Python, I get the following error (I was able to load and run GSVA from R):

[Image: screenshot of the error]

I think the input data should be fine. From the error message, it seems that one of the files generated during the run could not be found, thus causing the error.

zhenyuh19 commented 1 year ago

This bug occurs in the built-in gsva.r script. Turning off the parallel_type parameter solves the problem.
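
To illustrate why a failure inside an external script can surface in Python as a missing-file error, here is a minimal sketch. It assumes the wrapper runs a command in a temporary directory and then reads an output file named `pathways.csv` (the name comes from the error message in the issue title). The helper `run_and_collect` is hypothetical and is not part of the GSVA package; it only shows how checking the exit code and stderr before reading the file gives a clearer error.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_collect(cmd, output_name="pathways.csv"):
    """Run cmd in a fresh temp directory and return the text of the expected output file.

    Hypothetical helper: raises RuntimeError with the command's stderr
    when the command fails or never writes the output file.
    """
    with tempfile.TemporaryDirectory() as tmp:
        result = subprocess.run(cmd, cwd=tmp, capture_output=True, text=True)
        output = Path(tmp) / output_name
        if result.returncode != 0 or not output.exists():
            raise RuntimeError(
                f"command exited with code {result.returncode}; "
                f"output file present: {output.exists()}\n"
                f"stderr:\n{result.stderr}"
            )
        return output.read_text()
```

Without a check like this, the first visible symptom is the `FileNotFoundError` raised when the missing output is read, and the underlying error from the script stays hidden.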

nicolas-zimmermann commented 1 year ago

@zhenyuh19

> This bug occurs in the built-in gsva.r script, and the parallel_type parameter can be turned off to solve the problem

Dear zhenyuh19, I've hit the same error, and I've been going back and forth through the files trying to turn off this parameter, but I can't make it work. How do you turn it off? Thank you in advance. Best,

zhenyuh19 commented 11 months ago

> @zhenyuh19
>
> > This bug occurs in the built-in gsva.r script, and the parallel_type parameter can be turned off to solve the problem
>
> Dear zhenyuh19, I've had the same mistake and I've been going back and forth in files trying to turn off this parameter but I can't seem to make it work. How do you turn off this parameter? Thank you in advance. Best,

You can download the revised Python package from the following link: https://github.com/zhenyuh19/Tools/blob/main/GSVA_HCY.tar.gz

Follow the steps below:


```python
import sys

import pandas as pd

# Make the unpacked GSVA_HCY package importable; adjust this path to wherever you extracted it.
sys.path.append("/home/hcy/repository/Genesets.Estimate/GSVA/python")
from GSVA_HCY import gsva, gmt_to_dataframe
# from GSVA import gsva, gmt_to_dataframe  # the original package

expression_infile = sys.argv[1]  # tab-separated expression matrix
gmt_infile = sys.argv[2]         # gene sets in GMT format

# Step 1: Load your gene expression data.
# Rows represent genes and columns represent samples; the first column is used as the index.
expression_df = pd.read_csv(expression_infile, index_col=0, sep="\t")

# Step 2: Load the gene sets from the GMT file.
genesets_df = gmt_to_dataframe(gmt_infile)
print(genesets_df)

# Step 3: Run GSVA (here with the ssGSEA method).
# pathways_df = gsva(expression_df, genesets_df, outdir="/home/hcy/repository/Genesets.Estimate/ssGSEA-py", method="ssgsea", prefix="test")
pathways_df = gsva(expression_df, genesets_df, method="ssgsea", kcdf="Gaussian")
print(pathways_df)
```
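
If the gene-set loading step is where things go wrong, it can help to inspect the GMT file independently of the package. The sketch below uses only the standard library and relies on the usual GMT layout: one gene set per line, tab-separated, with the set name first, a description second, and the member genes after that. The function `gmt_rows` is a hypothetical helper; the one-row-per-gene layout it yields is an assumption for illustration, since this thread does not show what `gmt_to_dataframe` returns.

```python
import io


def gmt_rows(handle):
    """Yield (name, description, gene) for every gene of every gene set in a GMT stream."""
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # blank line or a gene set with no members
        name, description, *members = fields
        for gene in members:
            if gene:  # ignore empty fields left by trailing tabs
                yield name, description, gene


# Small in-memory example instead of a real file.
example = "SET_A\tdemo set\tTP53\tEGFR\nSET_B\tanother set\tMYC\t\n"
rows = list(gmt_rows(io.StringIO(example)))
for row in rows:
    print(row)
```

For a real file, pass an open file handle instead, for example `with open(gmt_infile) as fh: rows = list(gmt_rows(fh))`, and check that the gene identifiers match the index of the expression table.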