KarchinLab / 2020plus

Classifies genes as an oncogene, tumor suppressor gene, or as a non-driver gene by using Random Forests
http://2020plus.readthedocs.org
Apache License 2.0

Error in load("data/2020plus_10k.Rdata") : error reading from connection #22

Open HengqiLiu opened 2 years ago

HengqiLiu commented 2 years ago

Hi KarchinLab: This error occurred when 2020plus was about to finish running. 2020plus_10k.Rdata is already located in the /data directory: ../../2020plus/2020plus-1.2.3/data/2020plus_10k.Rdata

`[Sat Dec 11 05:30:02 2021]
rule predict_test:
    input: data/2020plus_10k.Rdata, 2021.12.10_2020plus_1/features.txt, 2021.12.10_2020plus_1/simulated_summary/simulated_features.txt
    output: 2021.12.10_2020plus_1/pretrained_output/results/r_random_forest_prediction.txt
    jobid: 1
    resources: tmpdir=/tmp

    python `which 2020plus.py` --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution 2021.12.10_2020plus_1/simulated_null_dist.txt --fe>
    python `which 2020plus.py` --out-dir 2021.12.10_2020plus_1/pretrained_output --log-level=INFO classify -n 200 --trained-classifier data/2020plus_10k.Rdata -d .7 -o 1.0 >

Version: 1.2.3
Command: /home/data/vip13t22/wes_cancer/biosoft/2020plus/2020plus-1.2.3//2020plus.py --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution >
Running Random forest . . .
Type: <class 'rpy2.rinterface.RRuntimeError'>
Exception: Error in load("data/2020plus_10k.Rdata") : error reading from connection

Traceback:
  File "/home/data/vip13t22/wes_cancer/biosoft/2020plus/2020plus-1.2.3//2020plus.py", line 275, in <module>
    args.func() # run function corresponding to user's command
  File "/home/data/vip13t22/wes_cancer/biosoft/2020plus/2020plus-1.2.3//2020plus.py", line 37, in _classify
    src.classify.python.classifier.main(opts) # run code
  File "/home/data/vip13t22/wes_cancer/biosoft/2020plus/2020plus-1.2.3/src/classify/python/classifier.py", line 184, in main
    rrclf.clf.load_cv(cli_opts['trained_classifier'])
  File "/home/data/vip13t22/wes_cancer/biosoft/2020plus/2020plus-1.2.3/src/classify/python/r_random_forest_clf.py", line 164, in load_cv
    ro.r('load("{0}")'.format(path))
  File "/home/data/vip13t22/miniconda3/envs/snakemake/lib/python3.6/site-packages/rpy2/robjects/__init__.py", line 321, in __call__
    res = self.eval(p)
  File "/home/data/vip13t22/miniconda3/envs/snakemake/lib/python3.6/site-packages/rpy2/robjects/functions.py", line 178, in __call__
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
  File "/home/data/vip13t22/miniconda3/envs/snakemake/lib/python3.6/site-packages/rpy2/robjects/functions.py", line 106, in __call__
    res = super(Function, self).__call__(*new_args, **new_kwargs)


AN ERROR HAS OCCURRED: check the log file


[Sat Dec 11 05:30:05 2021]
Error in rule predict_test:
    jobid: 1
    output: 2021.12.10_2020plus_1/pretrained_output/results/r_random_forest_prediction.txt
    shell:

    python `which 2020plus.py` --log-level=INFO classify --trained-classifier data/2020plus_10k.Rdata --null-distribution 2021.12.10_2020plus_1/simulated_null_dist.txt --fe>
    python `which 2020plus.py` --out-dir 2021.12.10_2020plus_1/pretrained_output --log-level=INFO classify -n 200 --trained-classifier data/2020plus_10k.Rdata -d .7 -o 1.0 >

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/data/vip13t22/wes_cancer/biosoft/2020plus/2020plus-1.2.3/.snakemake/log/2021-12-10T142639.085265.snakemake.log`

ctokheim commented 9 months ago

This looks like it may be a path problem: R may not be finding 2020plus_10k.Rdata at the location it was given, and therefore throws an error. I suspect that putting the full path to the data directory in the config file will fix your problem. Specifically, change the line "data_dir: data/" to "data_dir: /your/full/path/data/" in "config.yaml". If you used ".." in the path in the config file, it is possible that Python reads that path correctly while R fails on it (as your error message above suggests).
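
To work out the full path to put in "config.yaml", something like the following Python sketch may help. It is not part of 2020plus: the function names `absolute_data_dir` and `classifier_present` are made up for illustration, and it assumes a POSIX-style filesystem and that you run it from the directory where you launch snakemake. It only turns a relative `data_dir` into an absolute one and checks that the classifier file exists and is not empty; it does not verify that R can actually load the file.

```python
import os


def absolute_data_dir(data_dir, start_dir=None):
    """Return data_dir as an absolute, normalized path with a trailing separator.

    Hypothetical helper: start_dir is the directory that relative paths are
    resolved against (defaults to the current working directory).
    """
    if start_dir is None:
        start_dir = os.getcwd()
    # os.path.join keeps data_dir unchanged if it is already absolute;
    # normpath collapses any ".." components.
    full = os.path.normpath(os.path.join(start_dir, data_dir))
    # Joining with "" appends the trailing separator, matching "data_dir: data/".
    return os.path.join(full, "")


def classifier_present(data_dir, filename="2020plus_10k.Rdata"):
    """Return True if the classifier file exists in data_dir and is not empty."""
    path = os.path.join(data_dir, filename)
    return os.path.isfile(path) and os.path.getsize(path) > 0


if __name__ == "__main__":
    resolved = absolute_data_dir("data/")
    print("data_dir: " + resolved)  # candidate line for config.yaml
    print("classifier found:", classifier_present(resolved))
```

If the second line printed is False, the file is missing or empty at that location, which would be worth fixing before changing anything else.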