KarchinLab / 2020plus

Classifies genes as an oncogene, tumor suppressor gene, or as a non-driver gene by using Random Forests
http://2020plus.readthedocs.org
Apache License 2.0

No such file or directory in 'feature file' #18

Open a00101 opened 4 years ago

a00101 commented 4 years ago

I got the error below. Can you help me out?

Thanks.

python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary2.txt --tsg-test output_bladder/simulated_summary/tsg_sim2.txt -og-test output_bladder/simulated_summary/oncogene_sim2.txt   -o output_bladder/simulated_summary/simulated_features2.txt
python: can't open file 'features': [Errno 2] No such file or directory
python: can't open file 'features': [Errno 2] No such file or directory
python: can't open file 'features': [Errno 2] No such file or directory
python: can't open file 'features': [Errno 2] No such file or directory
python: can't open file 'features': [Errno 2] No such file or directory
python: can't open file 'features': [Errno 2] No such file or directory
python: can't open file 'features': [Errno 2] No such file or directory
Error in job simFeatures while creating output file output_bladder/simulated_summary/simulated_features8.txt.
Error in job simFeatures while creating output file output_bladder/simulated_summary/simulated_features7.txt.
Error in job simFeatures while creating output file output_bladder/simulated_summary/simulated_features4.txt.
Error in job simFeatures while creating output file output_bladder/simulated_summary/simulated_features6.txt.
Error in job simFeatures while creating output file output_bladder/simulated_summary/simulated_features1.txt.
Error in job simFeatures while creating output file output_bladder/simulated_summary/simulated_features2.txt.
Error in job simFeatures while creating output file output_bladder/simulated_summary/simulated_features10.txt.
RuleException:
CalledProcessError in line 202 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command 'python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary6.txt --tsg-test output_bladder/simulated_summary/tsg_sim6.txt -og-test output_bladder/simulated_summary/oncogene_sim6.txt   -o output_bladder/simulated_summary/simulated_features6.txt' returned non-zero exit status 2.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 202, in __rule_simFeatures
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
RuleException:
CalledProcessError in line 202 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command 'python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary10.txt --tsg-test output_bladder/simulated_summary/tsg_sim10.txt -og-test output_bladder/simulated_summary/oncogene_sim10.txt   -o output_bladder/simulated_summary/simulated_features10.txt' returned non-zero exit status 2.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 202, in __rule_simFeatures
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
RuleException:
CalledProcessError in line 202 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command 'python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary7.txt --tsg-test output_bladder/simulated_summary/tsg_sim7.txt -og-test output_bladder/simulated_summary/oncogene_sim7.txt   -o output_bladder/simulated_summary/simulated_features7.txt' returned non-zero exit status 2.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 202, in __rule_simFeatures
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
RuleException:
CalledProcessError in line 202 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command 'python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary8.txt --tsg-test output_bladder/simulated_summary/tsg_sim8.txt -og-test output_bladder/simulated_summary/oncogene_sim8.txt   -o output_bladder/simulated_summary/simulated_features8.txt' returned non-zero exit status 2.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 202, in __rule_simFeatures
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
RuleException:
CalledProcessError in line 202 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command 'python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary2.txt --tsg-test output_bladder/simulated_summary/tsg_sim2.txt -og-test output_bladder/simulated_summary/oncogene_sim2.txt   -o output_bladder/simulated_summary/simulated_features2.txt' returned non-zero exit status 2.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 202, in __rule_simFeatures
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
RuleException:
CalledProcessError in line 202 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command 'python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary1.txt --tsg-test output_bladder/simulated_summary/tsg_sim1.txt -og-test output_bladder/simulated_summary/oncogene_sim1.txt   -o output_bladder/simulated_summary/simulated_features1.txt' returned non-zero exit status 2.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 202, in __rule_simFeatures
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
RuleException:
CalledProcessError in line 202 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command 'python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary4.txt --tsg-test output_bladder/simulated_summary/tsg_sim4.txt -og-test output_bladder/simulated_summary/oncogene_sim4.txt   -o output_bladder/simulated_summary/simulated_features4.txt' returned non-zero exit status 2.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 202, in __rule_simFeatures
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
ctokheim commented 4 years ago

It's most likely that the 2020plus.py script is not found on your PATH. Please see: https://2020plus.readthedocs.io/en/latest/installation.html#check-your-path-variable .
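
The log above is consistent with this: when `which 2020plus.py` finds nothing, the backtick substitution is empty and the shell runs `python features`, which is the file Python reports it cannot open. A minimal sketch of the same lookup in Python, assuming only the standard library (`build_command` is a hypothetical helper for illustration, not part of 2020plus):

```python
import shutil


def build_command(script_name, subcommand):
    """Mimic: python `which script_name` subcommand, returned as an argv list.

    Hypothetical helper for illustration only.
    """
    script_path = shutil.which(script_name)  # None when not on PATH or not executable
    if script_path is None:
        # An empty substitution leaves only: python <subcommand>
        return ["python", subcommand], False
    return ["python", script_path, subcommand], True


argv, found = build_command("2020plus.py", "features")
print("found on PATH:", found)
print("effective command:", " ".join(argv))
```

If `found` is `False`, adding the 2020plus directory to `PATH` as described in the linked installation page is the likely fix.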

a00101 commented 4 years ago

Thanks, that partially works, but another error occurred:

rule predict_test:
    input: data/2020plus_10k.Rdata, output_bladder/features.txt, output_bladder/simulated_summary/simulated_features.txt
    output: output_bladder/pretrained_output/results/r_random_forest_prediction.txt
    jobid: 1

        python `which 2020plus.py` --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_bladder/simulated_null_dist.txt --features output_bladder/simulated_summary/simulated_features.txt --simulated --cv
        python `which 2020plus.py` --out-dir output_bladder/pretrained_output --log-level=INFO classify -n 200 --trained-classifier data/2020plus_10k.Rdata -d .7 -o 1.0 --features output_bladder/features.txt --null-distribution output_bladder/simulated_null_dist.txt --random-seed 71 --cv

Version: 1.2.3
Command: /addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_bladder/simulated_null_dist.txt --features output_bladder/simulated_summary/simulated_features.txt --simulated --cv
Running Random forest . . .
Type: <class 'ValueError'>
Exception: Buffer for this type not yet supported.
Traceback:
   File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py", line 275, in <module>
    args.func()  # run function corresponding to user's command
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py", line 37, in _classify
    src.classify.python.classifier.main(opts)  # run code
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/classifier.py", line 180, in main
    seed=cli_opts['random_seed'])
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/r_random_forest_clf.py", line 314, in __init__
    other_sample_ratio=other_sample_ratio)
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/r_random_forest_clf.py", line 40, in __init__
    ro.r("suppressPackageStartupMessages(library(randomForest))")  # load randomForest library
  File "/root/anaconda2/envs/2020plus/lib/python3.6/site-packages/rpy2/robjects/__init__.py", line 353, in __call__
    return conversion.ri2py(res)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/functools.py", line 807, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/site-packages/rpy2/robjects/pandas2ri.py", line 139, in ri2py_vector
    res = numpy2ri.ri2py(obj)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/functools.py", line 807, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/site-packages/rpy2/robjects/numpy2ri.py", line 159, in ri2py_sexp
    res = numpy.asarray(obj)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)

****************************************
AN ERROR HAS OCCURRED: check the log file
****************************************
Error in job predict_test while creating output file output_bladder/pretrained_output/results/r_random_forest_prediction.txt.
RuleException:
CalledProcessError in line 342 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command '
        python `which 2020plus.py` --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_bladder/simulated_null_dist.txt --features output_bladder/simulated_summary/simulated_features.txt --simulated --cv
        python `which 2020plus.py` --out-dir output_bladder/pretrained_output --log-level=INFO classify -n 200 --trained-classifier data/2020plus_10k.Rdata -d .7 -o 1.0 --features output_bladder/features.txt --null-distribution output_bladder/simulated_null_dist.txt --random-seed 71 --cv
        ' returned non-zero exit status 1.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 342, in __rule_predict_test
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

My package versions are below.

scipy=1.2.1 and pandas=0.25.3 (chosen because of a logsum error and other errors)

ctokheim commented 4 years ago

I think this might be a version-specific numpy/rpy2 problem: https://stackoverflow.com/questions/58561333/r-magic-input-argument-not-working-properly-in-jupyter-notebook

ctokheim commented 4 years ago

I'm also finding that it might be better to stick with a pandas version below 1.0.0.

a00101 commented 4 years ago

Unfortunately I got another error with pandas==0.25.3

Version: 1.2.3
Command: /addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --nul
Running Random forest . . .
Type: <class 'AttributeError'>
Exception: module 'rpy2.robjects.pandas2ri' has no attribute 'py2ri'
Traceback:
   File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py", line 275, in <module>
    args.func()  # run function corresponding to user's command
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py", line 37, in _classify
    src.classify.python.classifier.main(opts)  # run code
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/classifier.py", line 190, in main
    result_df = trained_rand_forest_pred(rrclf, df, None, null_pvals, is_cv)
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/classifier.py", line 104, in trained_rand_forest_pred
    onco_prob, tsg_prob, other_prob = clf.predict_cv()
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/generic_classifier.py", line 139, in predict_cv
    tmp_prob = self.clf.predict_proba(test_feat)
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/r_random_forest_clf.py", line 272, in predict_proba
    r_xtest = pandas2ri.py2ri(xtest)

****************************************
AN ERROR HAS OCCURRED: check the log file
****************************************
Error in job predict_test while creating output file output_lung/pretrained_output/results/r_random_forest_prediction.txt.
RuleException:
CalledProcessError in line 342 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command '
        python `which 2020plus.py` --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_lung/simulated_null_dist.
        python `which 2020plus.py` --out-dir output_lung/pretrained_output --log-level=INFO classify -n 200 --trained-classifier data/2020plus_10k.Rdata -d .7
        ' returned non-zero exit status 1.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 342, in __rule_predict_test
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
ctokheim commented 4 years ago

Can you check your rpy2 version? I did not get this error with rpy2 version 2.9.4. I suspect rpy2 version 3 or higher results in this error.
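
One way to check this without importing rpy2 is sketched below. It rests on the assumption that rpy2 2.x exposes `pandas2ri.py2ri` and that rpy2 3.x renamed it to `py2rpy`, which would explain the `AttributeError` above. The helper names are hypothetical, and the installed-version lookup relies on `importlib.metadata`, which only exists on Python 3.8 or newer (it returns `None` otherwise).

```python
import re


def rpy2_major(version):
    """Return the leading integer of a version string such as '2.9.4'."""
    match = re.match(r"\d+", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group())


def expected_converter(version):
    """Name of the pandas-to-R converter expected for this rpy2 version.

    Assumption: 2.x provides pandas2ri.py2ri; 3.x renamed it to py2rpy.
    """
    return "py2ri" if rpy2_major(version) < 3 else "py2rpy"


def installed_rpy2_version():
    """Installed rpy2 version, or None if it cannot be determined."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # Python older than 3.8
        return None
    try:
        return version("rpy2")
    except PackageNotFoundError:
        return None


found = installed_rpy2_version()
print("installed rpy2:", found)
if found is not None:
    print("expected converter:", expected_converter(found))
```

Since the traceback shows 2020plus calling `pandas2ri.py2ri`, a result of `py2rpy` would suggest the environment has an rpy2 release newer than the code expects.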

a00101 commented 4 years ago

rpy2==2.9.4 pandas==0.25.3

I got an error again. Thank you for your generous support, but I'm too tired to keep fixing numerous errors. Could you provide a Docker image?


Version: 1.2.3
Command: /addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_lung/simulated_null_dist.txt --features output_lung/simulated_summary/simulated_features.txt --simulated --cv
Running Random forest . . .
Type: <class 'ValueError'>
Exception: Buffer for this type not yet supported.
Traceback:
   File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py", line 275, in <module>
    args.func()  # run function corresponding to user's command
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/2020plus.py", line 37, in _classify
    src.classify.python.classifier.main(opts)  # run code
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/classifier.py", line 180, in main
    seed=cli_opts['random_seed'])
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/r_random_forest_clf.py", line 314, in __init__
    other_sample_ratio=other_sample_ratio)
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/src/classify/python/r_random_forest_clf.py", line 40, in __init__
    ro.r("suppressPackageStartupMessages(library(randomForest))")  # load randomForest library
  File "/root/anaconda2/envs/2020plus/lib/python3.6/site-packages/rpy2/robjects/__init__.py", line 353, in __call__
    return conversion.ri2py(res)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/functools.py", line 807, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/site-packages/rpy2/robjects/pandas2ri.py", line 139, in ri2py_vector
    res = numpy2ri.ri2py(obj)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/functools.py", line 807, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/site-packages/rpy2/robjects/numpy2ri.py", line 159, in ri2py_sexp
    res = numpy.asarray(obj)
  File "/root/anaconda2/envs/2020plus/lib/python3.6/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)

****************************************
AN ERROR HAS OCCURRED: check the log file
****************************************
Error in job predict_test while creating output file output_lung/pretrained_output/results/r_random_forest_prediction.txt.
RuleException:
CalledProcessError in line 342 of /addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile:
Command '
        python `which 2020plus.py` --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_lung/simulated_null_dist.txt --features output_lung/simulated_summary/simulated_features.txt --simulated --cv
        python `which 2020plus.py` --out-dir output_lung/pretrained_output --log-level=INFO classify -n 200 --trained-classifier data/2020plus_10k.Rdata -d .7 -o 1.0 --features output_lung/features.txt --null-distribution output_lung/simulated_null_dist.txt --random-seed 71 --cv
        ' returned non-zero exit status 1.
  File "/addData01/01_Program_to_install/63.2020plus/2020plus-master/Snakefile", line 342, in __rule_predict_test
  File "/root/anaconda2/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
ctokheim commented 4 years ago

There currently are no docker images. You might want to see a related issue about installing the correct package versions: https://github.com/KarchinLab/2020plus/issues/13

stroke1989 commented 2 years ago

I got the same error:

55 of 57 steps (96%) done
rule predict_test:
    input: data/2020plus_10k.Rdata, output_bladder/features.txt, output_bladder/simulated_summary/simulated_features.txt
    output: output_bladder/pretrained_output/results/r_random_forest_prediction.txt

    python `which 2020plus.py` --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_bladder/simulated_null_dist.txt --features output_bladder/simulated_summary/simulated_features.txt --simulated --cv
    python `which 2020plus.py` --out-dir output_bladder/pretrained_output --log-level=INFO classify -n 200 --trained-classifier data/2020plus_10k.Rdata -d .7 -o 1.0 --features output_bladder/features.txt --null-distribution output_bladder/simulated_null_dist.txt --random-seed 71 --cv

Version: 1.2.3
Command: /home/ug0416/software/2020plus/2020plus.py --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_bladder/simulated_null_dist.txt --features output_bladder/simulated_summary/simulated_features.txt --simulated --cv
Running Random forest . . .
Type: <class 'ValueError'>
Exception: Buffer for this type not yet supported.
Traceback:
  File "/home/ug0416/software/2020plus/2020plus.py", line 275, in <module>
    args.func()  # run function corresponding to user's command
  File "/home/ug0416/software/2020plus/2020plus.py", line 37, in _classify
    src.classify.python.classifier.main(opts)  # run code
  File "/home/ug0416/software/2020plus/src/classify/python/classifier.py", line 184, in main
    rrclf.clf.load_cv(cli_opts['trained_classifier'])
  File "/home/ug0416/software/2020plus/src/classify/python/r_random_forest_clf.py", line 176, in load_cv
    self.cv_folds = com.convert_robj(ro.r["cvFoldDf"])
  File "/home/ug0416/.conda/envs/2020plus/lib/python3.5/site-packages/pandas/rpy/common.py", line 226, in convert_robj
    return converter(obj)
  File "/home/ug0416/.conda/envs/2020plus/lib/python3.5/site-packages/pandas/rpy/common.py", line 142, in _convert_DataFrame
    rows = np.array(rdf.rownames)

****************************************
AN ERROR HAS OCCURRED: check the log file
****************************************
Error in job predict_test while creating output file output_bladder/pretrained_output/results/r_random_forest_prediction.txt.
RuleException:
CalledProcessError in line 342 of /home/ug0416/software/2020plus/Snakefile:
Command '
        python `which 2020plus.py` --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution output_bladder/simulated_null_dist.txt --features output_bladder/simulated_summary/simulated_features.txt --simulated --cv
        python `which 2020plus.py` --out-dir output_bladder/pretrained_output --log-level=INFO classify -n 200 --trained-classifier data/2020plus_10k.Rdata -d .7 -o 1.0 --features output_bladder/features.txt --null-distribution output_bladder/simulated_null_dist.txt --random-seed 71 --cv
        ' returned non-zero exit status 1
  File "/home/ug0416/software/2020plus/Snakefile", line 342, in __rule_predict_test
  File "/home/ug0416/.conda/envs/2020plus/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

The module versions in my Python environment are as follows.

Could someone who has run this successfully share the versions of the required modules they used? I'd appreciate it!

stroke1989 commented 2 years ago

Finally, I got out of this mud and ran it successfully. The versions of the Python modules I installed are as follows:

python 3.6.9
matplotlib 3.1.1
numpy 1.15.4
pandas 0.25.2
pip 21.3.1
probabilistic2020 1.2.3
pysam 0.15.0
PyYAML 6.0
rpy2 2.9.4
scikit-learn 0.19.1
scipy 0.19.1
snakemake 4.3.0
tzlocal 4.1
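
To compare another environment against this list, a small sketch like the one below may help. It assumes the output of `pip freeze` has been saved as text with `name==version` lines; the `sample` string is made-up illustration data, and `parse_freeze` and `mismatches` are hypothetical helper names. Lines in other formats, such as the `name @ file://...` entries some pip versions print, are skipped.

```python
# Versions reported as working in the comment above (subset most likely to matter).
REPORTED_WORKING = {
    "matplotlib": "3.1.1",
    "numpy": "1.15.4",
    "pandas": "0.25.2",
    "probabilistic2020": "1.2.3",
    "pysam": "0.15.0",
    "rpy2": "2.9.4",
    "scikit-learn": "0.19.1",
    "scipy": "0.19.1",
    "snakemake": "4.3.0",
}


def parse_freeze(text):
    """Parse 'name==version' lines into a {lowercased name: version} dict."""
    versions = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:  # skip blank lines and entries without '=='
            versions[name.lower()] = version
    return versions


def mismatches(installed, expected=REPORTED_WORKING):
    """Return {package: (installed_or_None, expected)} for every difference."""
    return {
        name: (installed.get(name), want)
        for name, want in sorted(expected.items())
        if installed.get(name) != want
    }


# Hypothetical pip freeze output, for illustration only.
sample = "numpy==1.15.4\npandas==1.0.3\nrpy2==3.2.6\n"
for name, (have, want) in mismatches(parse_freeze(sample)).items():
    print("%s: installed %s, reported working %s" % (name, have, want))
```

A package reported with `None` as the installed version is simply absent from the parsed text. A mismatch is not necessarily a problem, since other version combinations may also work.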